Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
DNA ANALYSIS STRATEGY; 2.1 Representation of DNA Bases; 2.2 DNA Analysis Strategy; 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES; 3.1 Hardware Design...of the DNA bases where each base is represented by a 7-bit-long pseudorandom sequence. Figure 4: Coarse analysis of a DNA sequence. Figure 5: Fine...a 20-base-long database. LIST OF TABLES: Table 1: Short representations of the DNA bases where each base is represented by 7-bit-long
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is a computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or the user can upload sequences of interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (the total number of nucleotide bases and the A, T, G, and C base contents with their respective percentages), and a sequence cleaner. RDNAnalyzer was developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability: http://www.cemb.edu.pk/sw.html Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
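The Nussinov dynamic programming algorithm named in the abstract above can be illustrated with a minimal Python sketch. This is the textbook base-pair-maximization recurrence adapted to DNA (A-T and G-C pairs), not RDNAnalyzer's extended C# implementation; the function name `nussinov_pairs` and the `min_loop` parameter (minimum number of unpaired bases inside a hairpin) are assumptions made for illustration.

```python
def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested complementary pairs (textbook Nussinov).

    min_loop is a hypothetical parameter: the minimum number of unpaired
    bases that must separate two paired positions.
    """
    seq = seq.upper()
    complementary = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    # best[i][j] holds the optimum for the subsequence seq[i..j] inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1 and 2: leave position i or position j unpaired.
            score = max(best[i + 1][j], best[i][j - 1])
            # Case 3: pair i with j if the bases are complementary.
            if (seq[i], seq[j]) in complementary:
                score = max(score, best[i + 1][j - 1] + 1)
            # Case 4: split the interval into two independent structures.
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1] if n else 0
```

Only the optimal pair count is returned here; recovering the actual pairings would need a traceback over the `best` table, which is omitted to keep the sketch short.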
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.
1992-05-01
TABLES; LIST OF ABBREVIATIONS; 1.0 INTRODUCTION; 2.0 DNA ANALYSIS STRATEGY; 2.1 Representation of DNA Bases; 2.2 DNA Analysis Strategy; 3.0...Zehnder architecture. Figure 3: Short representations of the DNA bases where each base is represented by a 7-bit-long pseudorandom sequence. ...DNA bases where each base is represented by 7-bit-long pseudorandom sequences. Table 2: Long representations of the DNA bases with 255-bit maximum
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool for DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to sequence ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented in the meeting.
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and protein sequence analysis program written in C for the Macintosh Plus, SE and II computers. It has been designed to be easy to learn and use as well as a fast and efficient tool for day-to-day sequence analysis work. The program consists of a multi-window sequence editor and various DNA and protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can simultaneously handle 6 sequences of any type, up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as a graphic restriction map, a hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction site search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds, and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
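To make the idea of a restriction site search concrete, here is a minimal Python sketch that scans a linear sequence for a small enzyme library. It is a plain substring scan and does not reproduce DNA Strider's hexamer look-ahead algorithm, whose details the abstract does not give. The three enzymes and their recognition sequences are standard, but the library dictionary and the function name `find_sites` are illustrative assumptions.

```python
# Illustrative library; a real tool would load many more enzymes.
ENZYME_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(seq, enzymes=ENZYME_SITES):
    """Return {enzyme: [0-based start positions]} for a linear sequence."""
    seq = seq.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        pos = seq.find(site)
        while pos != -1:
            positions.append(pos)
            # Restart one base later so overlapping sites are not missed.
            pos = seq.find(site, pos + 1)
        hits[name] = positions
    return hits
```

The three sites used here are palindromic, so scanning one strand is enough; a library containing non-palindromic sites would also need a pass over the reverse complement.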
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced by Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. Quick DNA typing for identification purposes is critical for forensic applications. Our preliminary results indicate that laser mass spectrometry can possibly be used for rapid DNA fingerprinting at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step toward reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied it to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Genomics dataset of unidentified disclosed isolates.
Rekadwad, Bhagwan N
2016-09-01
Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA sequences disclosed in patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on the cleavage codes and enzyme codes from the restriction digestion study, which is helpful for studies using short DNA sequences, is reported. The dataset disclosed here provides new data for the exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
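The AT/GC content analysis mentioned above reduces to simple base counting. A minimal Python sketch follows; the function name `base_content` and the returned dictionary layout are assumptions for illustration, and ambiguous symbols such as N are simply ignored.

```python
def base_content(seq):
    """Count A, C, G, T and report GC and AT percentages.

    Characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_percent = 100.0 * (counts["G"] + counts["C"]) / total
    return {
        "counts": counts,
        "gc_percent": gc_percent,
        "at_percent": 100.0 - gc_percent,
    }
```

For instance, `base_content("ATGCGC")` counts four G/C bases out of six, so roughly two thirds of the sequence is GC.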
CRITICA: coding region identification tool invoking comparative analysis
NASA Technical Reports Server (NTRS)
Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)
1999-01-01
Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
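The dicodon (in-frame hexanucleotide) statistics described above start from a simple count. The Python sketch below tallies hexamers that begin on codon boundaries of a chosen reading frame; it illustrates only the counting step, not CRITICA's iterative scoring, and the names `dicodon_counts` and `frame` are assumptions.

```python
from collections import Counter


def dicodon_counts(seq, frame=0):
    """Count hexanucleotides that start on codon boundaries of `frame`.

    Successive windows overlap by one codon (step of 3, width of 6).
    """
    seq = seq.upper()
    counts = Counter()
    for start in range(frame, len(seq) - 5, 3):
        counts[seq[start:start + 6]] += 1
    return counts
```

Comparing such counts from putative coding frames against counts from other contexts (for example, as log ratios of relative frequencies) is one common way to turn them into a coding score; that step is left out here.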
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestion of the circular DNA with the original enzyme and a second restriction enzyme cleaving near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. A similar double digestion, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min
2012-01-01
DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
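As a rough illustration of how sequence reads combined with array genotypes can expose within-species contamination, the Python sketch below estimates a contamination fraction by grid-search maximum likelihood. It is a deliberately simplified model and not the method of the paper above: it assumes only sites where the array genotype is homozygous reference, a single sequencing error rate, and that a contaminating read carries the alternate allele with probability equal to the population allele frequency. All names (`estimate_contamination`, `error_rate`, `grid_steps`, `max_alpha`) are hypothetical.

```python
import math


def estimate_contamination(sites, error_rate=0.01, grid_steps=500, max_alpha=0.5):
    """Grid-search MLE of the contamination fraction alpha.

    sites: iterable of (depth, alt_reads, population_alt_frequency) at
    positions where the sample is assumed to be homozygous reference.
    """
    sites = list(sites)
    best_alpha, best_loglik = 0.0, float("-inf")
    for step in range(grid_steps + 1):
        alpha = max_alpha * step / grid_steps
        loglik = 0.0
        for depth, alt_reads, pop_alt_freq in sites:
            # Chance that a read from the contaminating DNA shows the alt allele.
            p_contam_alt = (pop_alt_freq * (1 - error_rate)
                            + (1 - pop_alt_freq) * error_rate)
            # Mixture of the sample's own reads (alt only by error) and contaminant reads.
            p_alt = (1 - alpha) * error_rate + alpha * p_contam_alt
            loglik += (alt_reads * math.log(p_alt)
                       + (depth - alt_reads) * math.log(1 - p_alt))
        if loglik > best_loglik:
            best_alpha, best_loglik = alpha, loglik
    return best_alpha
```

With the default grid the resolution is 0.001, so an estimate near 0.01 would correspond to the 1% contamination level quoted in the abstract; a production method would also model heterozygous sites and per-base quality.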
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background: For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results: We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion: The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Mitochondrial sequence analysis for forensic identification using pyrosequencing technology.
Andréasson, H; Asp, A; Alderborn, A; Gyllensten, U; Allen, M
2002-01-01
Over recent years, requests for mtDNA analysis in the field of forensic medicine have notably increased, and the results of such analyses have proved to be very useful in forensic cases where nuclear DNA analysis cannot be performed. Traditionally, mtDNA has been analyzed by DNA sequencing of the two hypervariable regions, HVI and HVII, in the D-loop. DNA sequence analysis using conventional Sanger sequencing is very robust but time consuming and labor intensive. By contrast, mtDNA analysis based on pyrosequencing technology provides fast and accurate results from the human mtDNA present in many types of evidence materials in forensic casework. The assay has been developed to determine polymorphic sites in the mitochondrial D-loop as well as the coding region to further increase the discrimination power of mtDNA analysis. The pyrosequencing technology for analysis of mtDNA polymorphisms has been tested with regard to sensitivity, reproducibility, and success rate when applied to control samples and actual casework materials. The results show that the method is very accurate and sensitive; the results are easily interpreted and provide a high success rate on casework samples. The panel of pyrosequencing reactions for the mtDNA polymorphisms was chosen to give optimal discrimination power in relation to the number of bases determined.
Vlahovicek, K; Munteanu, M G; Pongor, S
1999-01-01
Bending is a local conformational micropolymorphism of DNA in which the original B-DNA structure is only distorted but not extensively modified. Bending can be predicted by simple static geometry models as well as by a recently developed elastic model that incorporates sequence-dependent anisotropic bendability (SDAB). The SDAB model qualitatively explains phenomena including affinity of protein binding, kinking, and sequence-dependent vibrational properties of DNA. The vibrational properties of DNA segments can be studied by finite element analysis of a model subjected to an initial bending moment. The frequency spectrum is obtained by applying Fourier analysis to the displacement values in the time domain. This analysis shows that the spectrum of the bending vibrations depends quite sensitively on the sequence; for example, the spectrum of a curved sequence is characteristically different from the spectrum of straight sequence motifs of identical basepair composition. Curvature distributions are genome-specific, and pronounced differences are found between protein-coding and regulatory regions; that is, sites of extreme curvature and/or bendability are less frequent in protein-coding regions. A WWW server is set up for the prediction of curvature and generation of 3D models from DNA sequences (http://www.icgeb.trieste.it/dna).
Ludgate, Jackie L; Wright, James; Stockwell, Peter A; Morison, Ian M; Eccles, Michael R; Chatterjee, Aniruddha
2017-08-31
Formalin-fixed paraffin-embedded (FFPE) tumor samples are a major source of DNA from patients in cancer research. However, FFPE is a challenging material to work with due to macromolecular fragmentation and nucleic acid crosslinking. FFPE tissue poses particular challenges for methylation analysis and for preparing sequencing-based libraries relying on bisulfite conversion. Successful bisulfite conversion is a key requirement for sequencing-based methylation analysis. Here we describe a complete and streamlined workflow for preparing next generation sequencing libraries for methylation analysis from FFPE tissues. This includes counting cells from FFPE blocks and extracting DNA from FFPE slides, testing bisulfite conversion efficiency with a polymerase chain reaction (PCR) based test, preparing reduced representation bisulfite sequencing libraries, and massively parallel sequencing. The main features and advantages of this protocol are: (1) an optimized method for extracting good quality DNA from FFPE tissues; (2) an efficient bisulfite conversion and next generation sequencing library preparation protocol that uses 50 ng DNA from FFPE tissue; and (3) incorporation of a PCR-based test to assess bisulfite conversion efficiency prior to sequencing. We provide a complete workflow and an integrated protocol for performing DNA methylation analysis at the genome scale, and we believe this will facilitate clinical epigenetic research that involves the use of FFPE tissue.
Food Fish Identification from DNA Extraction through Sequence Analysis
ERIC Educational Resources Information Center
Hallen-Adams, Heather E.
2015-01-01
This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…
Lactobacillus heilongjiangensis sp. nov., isolated from Chinese pickle.
Gu, Chun Tao; Li, Chun Yan; Yang, Li Jie; Huo, Gui Cheng
2013-11-01
A Gram-stain-positive bacterial strain, S4-3(T), was isolated from traditional pickle in Heilongjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, pheS gene sequence analysis, rpoA gene sequence analysis, dnaK gene sequence analysis, fatty acid methyl ester (FAME) analysis, determination of DNA G+C content, DNA-DNA hybridization and an analysis of phenotypic features. Strain S4-3(T) showed 97.9-98.7 % 16S rRNA gene sequence similarities, 84.4-94.1 % pheS gene sequence similarities and 94.4-96.9 % rpoA gene sequence similarities to the type strains of Lactobacillus nantensis, Lactobacillus mindensis, Lactobacillus crustorum, Lactobacillus futsaii, Lactobacillus farciminis and Lactobacillus kimchiensis. dnaK gene sequence similarities between S4-3(T) and Lactobacillus nantensis LMG 23510(T), Lactobacillus mindensis LMG 21932(T), Lactobacillus crustorum LMG 23699(T), Lactobacillus futsaii JCM 17355(T) and Lactobacillus farciminis LMG 9200(T) were 95.4, 91.5, 90.4, 91.7 and 93.1 %, respectively. Based upon the data obtained in the present study, a novel species, Lactobacillus heilongjiangensis sp. nov., is proposed and the type strain is S4-3(T) ( = LMG 26166(T) = NCIMB 14701(T)).
Yanagi, Tomohiro; Shirasawa, Kenta; Terachi, Mayuko; Isobe, Sachiko
2017-01-01
Cultivated strawberry (Fragaria × ananassa Duch.) has homoeologous chromosomes because of allo-octoploidy. For example, two homoeologous chromosomes that belong to different sub-genomes of allopolyploids have similar base sequences. Thus, when conducting de novo assembly of DNA sequences, it is difficult to determine whether these sequences are derived from the same chromosome. To avoid the difficulties associated with homoeologous chromosomes and demonstrate the possibility of sequencing allopolyploids using single chromosomes, we conducted sequence analysis using microdissected single somatic chromosomes of cultivated strawberry. Three hundred and ten somatic chromosomes of the Japanese octoploid strawberry 'Reiko' were individually selected under a light microscope using a microdissection system. DNA from 288 of the dissected chromosomes was successfully amplified using a DNA amplification kit. Using next-generation sequencing, we decoded the base sequences of the amplified DNA segments, and on the basis of mapping, we identified DNA sequences from 144 samples that were best matched to the reference genomes of the octoploid strawberry, F. × ananassa, and the diploid strawberry, F. vesca. The 144 samples were classified into seven pseudo-molecules of F. vesca. The coverage rates of the DNA sequences from the single chromosome onto all pseudo-molecular sequences varied from 3 to 29.9%. We demonstrated an efficient method for sequence analysis of allopolyploid plants using microdissected single chromosomes. On the basis of our results, we believe that whole-genome analysis of allopolyploid plants can be enhanced using methodology that employs microdissected single chromosomes.
Oligo Design: a computer program for development of probes for oligonucleotide microarrays.
Herold, Keith E; Rasooly, Avraham
2003-12-01
Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
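The tiling of overlapping, temperature-matched, variable-length oligonucleotides described above can be sketched in Python. This is not Oligo Design's algorithm, which the abstract does not specify; it is a minimal illustration that estimates melting temperature with the Wallace rule (2 °C per A/T plus 4 °C per G/C, a rough approximation intended for short oligos) and grows each tile until a target is reached. The function names and the `target_tm`, `min_len`, `max_len`, and `step` parameters are assumptions.

```python
def wallace_tm(oligo):
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def tile_oligos(seq, target_tm=50, min_len=14, max_len=24, step=5):
    """Return (start, oligo, tm) tiles whose estimated Tm reaches target_tm.

    Each tile starts `step` bases after the previous one and is extended
    one base at a time between min_len and max_len.
    """
    tiles = []
    for start in range(0, len(seq) - min_len + 1, step):
        for length in range(min_len, max_len + 1):
            oligo = seq[start:start + length]
            if len(oligo) < length:
                break  # ran off the end of the sequence
            tm = wallace_tm(oligo)
            if tm >= target_tm:
                tiles.append((start, oligo, tm))
                break
    return tiles
```

Because GC-rich stretches reach the target with fewer bases, tiles over GC-rich regions come out shorter than tiles over AT-rich regions, which is the relationship between melting temperature and GC content that the abstract mentions.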
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
BS-Seq is a recently developed approach for detecting cytosine DNA methylation (mC) and profiling DNA methylation at the genome scale; it is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide insight into the differences in genome-scale DNA methylation among organisms, but also reveal the conservation of DNA methylation in all contexts and the nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful for understanding the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and on maintaining the silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps for DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are converted to thymine (T) and adenine (A), respectively, and cytosine in reads is converted to thymine. We also comprehensively review the main content of DNA methylation analysis at the genomic scale: (1) cytosine methylation in different sequence contexts; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and nucleotide preference; (4) DNA-protein interaction sites of DNA methylation; (5) the degree of cytosine methylation in the different structural elements of genes. DNA methylation analysis provides a powerful tool for epigenome studies in human and other species and for studies of gene-environment interaction, and lays the theoretical basis for further development of disease diagnostics and therapeutics in human.
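The preprocessing step described above (C to T and G to A in the reference, C to T in reads) can be written in a few lines of Python. This sketch only performs the in-silico letter conversion; the function names are assumptions, and alignment of the converted reads to the converted references is left to an external aligner.

```python
def convert_reference(ref):
    """Return (C-to-T converted, G-to-A converted) copies of a reference.

    The C-to-T copy serves reads from the forward strand; the G-to-A copy
    represents converted reverse-strand DNA written in forward coordinates.
    """
    ref = ref.upper()
    return ref.replace("C", "T"), ref.replace("G", "A")


def convert_read(read):
    """Fully convert a read in silico by turning every C into T."""
    return read.upper().replace("C", "T")
```

After alignment, methylation is usually called by going back to the original read: a C that survived bisulfite treatment at a reference C indicates methylation, while a T indicates an unmethylated cytosine.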
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight
NASA Astrophysics Data System (ADS)
Shi, Jinming
In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult rice leaves after seed space flight were detected by the methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutations, were used for AFLP and MSAP analysis. Polymorphisms of both DNA sequence and cytosine methylation were detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9% and 8%, respectively. In total, 27 and 22 polymorphic fragments were cloned and sequenced from the MSAP and AFLP analyses, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequences. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA analysis and from the cytosine methylation analysis were different.
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers
USDA-ARS's Scientific Manuscript database
The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences undergone rearrangements as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, and thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
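The pipeline described in this abstract (numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be illustrated with a minimal sketch. The abstract does not give the 2D mapping or the even scaling formula, so both are assumptions here: the ENCODING table is a hypothetical complex-plane mapping, and stretch() uses plain linear interpolation as a stand-in for the authors' even scaling algorithm. The naive O(n^2) DFT is only meant for short illustrative sequences.

```python
import cmath
import math

# Hypothetical 2D (complex-plane) encoding, chosen only for illustration.
# The mapping actually used by the authors is not stated in the abstract.
ENCODING = {"A": 1 + 0j, "T": -1 + 0j, "C": 0 + 1j, "G": 0 - 1j}


def power_spectrum(seq):
    """Return the DFT power spectrum |X(k)|^2 of an encoded DNA sequence."""
    x = [ENCODING[base] for base in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def stretch(spectrum, m):
    """Linearly interpolate a spectrum of length n onto m points (m >= n).

    Assumed stand-in for the paper's even scaling step.
    """
    n = len(spectrum)
    if m == n:
        return list(spectrum)
    out = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)      # fractional index into the short spectrum
        r = int(q)
        nxt = spectrum[min(r + 1, n - 1)]
        out.append(spectrum[r] + (q - r) * (nxt - spectrum[r]))
    return out


def dft_distance(seq_a, seq_b):
    """Euclidean distance between power spectra scaled to the longer length."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    m = max(len(pa), len(pb))
    pa, pb = stretch(pa, m), stretch(pb, m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

A pairwise matrix of dft_distance values could then be passed to any hierarchical clustering routine, which is how the abstract describes the measure being evaluated.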
Adachi, Noboru; Umetsu, Kazuo; Shojo, Hideki
2014-01-01
Mitochondrial DNA (mtDNA) is widely used for DNA analysis of highly degraded samples because of its polymorphic nature and high number of copies in a cell. However, as endogenous mtDNA in deteriorated samples is scarce and highly fragmented, it is not easy to obtain reliable data. In the current study, we report the risks of direct sequencing mtDNA in highly degraded material, and suggest a strategy to ensure the quality of sequencing data. It was observed that direct sequencing data of the hypervariable segment (HVS) 1 by using primer sets that generate an amplicon of 407 bp (long-primer sets) was different from results obtained by using newly designed primer sets that produce an amplicon of 120-139 bp (mini-primer sets). The data aligned with the results of mini-primer sets analysis in an amplicon length-dependent manner; the shorter the amplicon, the more evident the endogenous sequence became. Coding region analysis using multiplex amplified product-length polymorphisms revealed the incongruence of single nucleotide polymorphisms between the coding region and HVS 1 caused by contamination with exogenous mtDNA. Although the sequencing data obtained using long-primer sets turned out to be erroneous, it was unambiguous and reproducible. These findings suggest that PCR primers that produce amplicons shorter than those currently recognized should be used for mtDNA analysis in highly degraded samples. Haplogroup motif analysis of the coding region and HVS should also be performed to improve the reliability of forensic mtDNA data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas
2009-06-01
The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a precursor of 239 amino acids, an approximately 27 kDa protein. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
ERIC Educational Resources Information Center
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
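The query-versus-reference comparison described above can be sketched as follows. This is not TaxI's code: it assumes each query/reference pair has already been pairwise aligned (gaps written as "-"), and it uses the uncorrected p-distance as the simplest divergence measure, whereas the abstract mentions several models of sequence evolution. The function names p_distance and closest_reference are hypothetical.

```python
def p_distance(aligned_a, aligned_b):
    """Uncorrected divergence: mismatches / compared sites, skipping gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # indel columns are ignored in this simple measure
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def closest_reference(query_alignments):
    """Pick the reference with the smallest divergence from the query.

    query_alignments maps a reference name to a tuple
    (aligned_query, aligned_reference) from a separate pairwise alignment.
    """
    distances = {
        name: p_distance(query, ref) for name, (query, ref) in query_alignments.items()
    }
    best = min(distances, key=distances.get)
    return best, distances
```

Because every pair is aligned separately, sequences with many indels never have to fit into one large multiple alignment, which is the design point the abstract emphasizes.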
FUNGAL-SPECIFIC PCR PRIMERS DEVELOPED FOR ANALYSIS OF THE ITS REGION OF ENVIRONMENTAL DNA EXTRACTS
Background The Internal Transcribed Spacer (ITS) regions of fungal ribosomal DNA (rDNA) are highly variable sequences of great importance in distinguishing fungal species by PCR analysis. Previously published PCR primers available for amplifying these sequences from environmenta...
Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G
1987-12-01
The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., an 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that play a possible regulatory role for virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genome of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
DNA Clutch Probes for Circulating Tumor DNA Analysis.
Das, Jagotamoy; Ivanov, Ivaylo; Sargent, Edward H; Kelley, Shana O
2016-08-31
Progress toward the development of minimally invasive liquid biopsies of disease is being bolstered by breakthroughs in the analysis of circulating tumor DNA (ctDNA): DNA released from cancer cells into the bloodstream. However, robust, sensitive, and specific methods of detecting this emerging analyte are lacking. ctDNA analysis has unique challenges, since it is imperative to distinguish circulating DNA from normal cells vs mutation-bearing sequences originating from tumors. Here we report the electrochemical detection of mutated ctDNA in samples collected from cancer patients. By developing a strategy relying on the use of DNA clutch probes (DCPs) that render specific sequences of ctDNA accessible, we were able to read out the presence of mutated ctDNA. DCPs prevent reassociation of denatured DNA strands: they make one of the two strands of a dsDNA accessible for hybridization to a probe, and they also deactivate other closely related sequences in solution. DCPs thereby ensure that only mutated sequences associate with chip-based sensors detecting hybridization events. The assay exhibits excellent sensitivity and specificity in the detection of mutated ctDNA: it detects 1 fg/μL of a target mutation in the presence of 100 pg/μL of wild-type DNA, corresponding to detecting mutations at a level of 0.01% relative to wild type. This approach allows accurate analysis of samples collected from lung cancer and melanoma patients. This work represents the first detection of ctDNA without enzymatic amplification.
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proposal
1991-11-01
OF THE PROBLEM AND CURRENT TECHNOLOGY 2 3.0 TIME-INTEGRATING CORRELATOR 2 4.0 REPRESENTATIONS OF THE DNA BASES 8 5.0 DNA ANALYSIS STRATEGY 8 6.0... DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 9 Figure 5: The flow of data in a DNA analysis system based on an...logarithmic scale and a linear scale. 15 x LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits
Reed, K M; Dorschner, M O; Todd, T N; Phillips, R B
1998-09-01
Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Fractal landscape analysis of DNA walks
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.
1992-01-01
By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidence to support our findings.
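A "DNA walk" of the kind referred to above can be sketched in a few lines. The abstract does not spell out the construction, so the rule used here is an assumption for illustration: step up (+1) for a pyrimidine (C or T), step down (-1) for a purine (A or G), with the walk being the running sum of steps. The fluctuation() helper is likewise an assumed formulation, the root-mean-square fluctuation of the displacement over windows of length l, whose growth with l is what a power-law analysis would examine.

```python
import math
from itertools import accumulate

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def dna_walk(seq):
    """Cumulative displacement after each base: +1 pyrimidine, -1 purine."""
    steps = []
    for base in seq.upper():
        if base in PYRIMIDINES:
            steps.append(1)
        elif base in PURINES:
            steps.append(-1)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return list(accumulate(steps))


def fluctuation(walk, l):
    """RMS fluctuation of the displacement y(i + l) - y(i) around its mean."""
    deltas = [walk[i + l] - walk[i] for i in range(len(walk) - l)]
    if not deltas:
        raise ValueError("window length must be smaller than the walk length")
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    return math.sqrt(max(0.0, mean_sq - mean * mean))
```

For an uncorrelated sequence the fluctuation would be expected to grow roughly as the square root of l; a steeper power law is the signature of the long-range correlations the authors report.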
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.
Ozsolak, Fatih
2016-01-01
With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.
Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I
2001-08-01
DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy
Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the average nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strains level within Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and it is superior to the phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
Mosaic organization of DNA nucleotides
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Havlin, S.; Simons, M.; Stanley, H. E.; Goldberger, A. L.
1994-01-01
Long-range power-law correlations have been reported recently for DNA sequences containing noncoding regions. We address the question of whether such correlations may be a trivial consequence of the known mosaic structure ("patchiness") of DNA. We analyze two classes of controls consisting of patchy nucleotide sequences generated by different algorithms--one without and one with long-range power-law correlations. Although both types of sequences are highly heterogeneous, they are quantitatively distinguishable by an alternative fluctuation analysis method that differentiates local patchiness from long-range correlations. Application of this analysis to selected DNA sequences demonstrates that patchiness is not sufficient to account for long-range correlation properties.
Wickramaarachchi, W A R T; Shankarappa, K S; Rangaswamy, K T; Maruthi, M N; Rajapakse, R G A S; Ghosh, Saptarshi
2016-06-01
Bunchy top disease of banana caused by Banana bunchy top virus (BBTV, genus Babuvirus, family Nanoviridae) is one of the most important constraints in production of banana in different parts of the world. Six genomic DNA components of a BBTV isolate from Kandy, Sri Lanka (BBTV-K) were amplified by polymerase chain reaction (PCR) with specific primers using total DNA extracted from banana tissues showing typical symptoms of bunchy top disease. The amplicons were of the expected size of 1.0-1.1 kb and were cloned and sequenced. Analysis of sequence data revealed the presence of six DNA components, DNA-R, DNA-U3, DNA-S, DNA-N, DNA-M and DNA-C, for the Sri Lanka isolate. Comparisons of sequence data of DNA components followed by phylogenetic analysis grouped the Sri Lanka (Kandy) isolate in the Pacific Indian Oceans (PIO) group. The Sri Lanka (Kandy) isolate of BBTV is classified as a new member of the PIO group based on analysis of six components of the virus.
2013-01-01
Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. 
Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
Yang, Xiaojun; Wang, Xiaohong; Liang, Zhijuan; Zhang, Xiaoya; Wang, Yanbo; Wang, Zhenhai
2014-05-01
To study the species and amount of bacteria in sputum of patients with ventilator-associated pneumonia (VAP) by using 16S rDNA sequencing analysis, and to explore a new method for etiologic diagnosis of VAP. Bronchoalveolar lavage sputum samples were collected from 31 patients with VAP. Bacterial DNA of the samples was extracted and identified by polymerase chain reaction (PCR). At the same time, sputum specimens were processed for routine bacterial culture. High-throughput sequencing was conducted on PCR-positive samples with 16S rDNA metagenome sequencing technology, the sequencing results were analyzed using bioinformatics, and the results of sequencing and bacterial culture were then compared. (1) 550 bp of specific DNA sequences were amplified in sputum specimens from 27 of the 31 patients with VAP, and they were used for sequencing analysis. 103 856 sequences were obtained from those sputum specimens using 16S rDNA sequencing, yielding approximately 39 Mb of raw data. Tag sequencing was able to inform genus level in all 27 samples. (2) Alpha-diversity analysis showed that sputum samples of patients with VAP had significantly higher variability and richness in bacterial species (Shannon index values 1.20, Simpson index values 0.48). Rarefaction curve analysis showed that there were more species that were not detected by sequencing from some VAP sputum samples. (3) Analysis of 27 sputum samples with VAP by using 16S rDNA sequences yielded four phyla, namely Actinobacteria, Bacteroidetes, Firmicutes, and Proteobacteria. With genus as a classification, it was found that the dominant species included Streptococcus 88.9% (24/27), Limnohabitans 77.8% (21/27), Acinetobacter 70.4% (19/27), Sphingomonas 63.0% (17/27), Prevotella 63.0% (17/27), Klebsiella 55.6% (15/27), Pseudomonas 55.6% (15/27), Aquabacterium 55.6% (15/27), and Corynebacterium 55.6% (15/27). 
(4) Pyrosequencing discovered that Prevotella, Limnohabitans, Aquabacterium, and Sphingomonas might not be detected by routine bacterial culture. Among seven species which were identified by both methods, pyrosequencing yielded a higher positive rate than ordinary bacterial culture [Streptococcus: 88.9% (24/27) vs. 18.5% (5/27), Klebsiella: 55.6% (15/27) vs. 18.5% (5/27), Acinetobacter: 70.4% (19/27) vs. 37.0% (10/27), Corynebacterium: 55.6% (15/27) vs. 7.4% (2/27), P<0.05 or P<0.01]. Sequencing also gave a higher positive rate than culture for Pseudomonas [55.6% (15/27) vs. 25.9% (7/27), P=0.050]. No significant differences were observed between sequencing and ordinary bacterial culture for detection of the genera Staphylococcus [7.4% (2/27) vs. 11.1% (3/27)] and Neisseria [18.5% (5/27) vs. 3.7% (1/27), both P>0.05]. 16S rDNA sequencing analysis confirmed that pathogenic bacteria in sputum of VAP were complicated with multiple drug resistant strains. Compared with routine bacterial culture, pyrosequencing had a higher positive rate in detecting pathogens. 16S rDNA gene sequencing technology may become a new method for etiological diagnosis of VAP.
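The alpha-diversity indices quoted in this abstract can be computed from per-taxon read counts as in the sketch below. The abstract does not say which conventions were used, so two common ones are assumed: the Shannon index as H = -sum(p ln p) with the natural logarithm, and the Simpson index as D = sum(p^2). Other software may report 1 - D or 1/D instead. The function names and any counts passed to them are illustrative, not data from the study.

```python
import math


def proportions(counts):
    """Convert raw per-taxon counts to relative abundances, dropping zeros."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts if c > 0]


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p), assuming natural logarithms."""
    return -sum(p * math.log(p) for p in proportions(counts))


def simpson_index(counts):
    """Simpson dominance D = sum(p^2); lower values mean higher diversity."""
    return sum(p * p for p in proportions(counts))


def positive_rate(positives, total):
    """Detection rate as a percentage rounded to one decimal, as in the abstract."""
    return round(100.0 * positives / total, 1)
```

For example, positive_rate(24, 27) reproduces the 88.9% reported for Streptococcus by sequencing.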
Reed, Kent M.; Dorschner, Michael O.; Todd, Thomas N.; Phillips, Ruth B.
1998-01-01
Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Role of indirect readout mechanism in TATA box binding protein-DNA interaction.
Mondal, Manas; Choudhury, Devapriya; Chakrabarti, Jaydeb; Bhattacharyya, Dhananjay
2015-03-01
Gene expression generally initiates from recognition of TATA-box binding protein (TBP) to the minor groove of DNA of TATA box sequence where the DNA structure is significantly different from B-DNA. We have carried out molecular dynamics simulation studies of TBP-DNA system to understand how the DNA structure alters for efficient binding. We observed rigid nature of the protein while the DNA of TATA box sequence has an inherent flexibility in terms of bending and minor groove widening. The bending analysis of the free DNA and the TBP bound DNA systems indicate presence of some similar structures. Principal coordinate ordination analysis also indicates some structural features of the protein bound and free DNA are similar. Thus we suggest that the DNA of TATA box sequence regularly oscillates between several alternate structures and the one suitable for TBP binding is induced further by the protein for proper complex formation.
Improved multiple displacement amplification (iMDA) and ultraclean reagents.
Motley, S Timothy; Picuri, John M; Crowder, Chris D; Minich, Jeremiah J; Hofstadler, Steven A; Eshoo, Mark W
2014-06-06
Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA). A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols were developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA was compared to that of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. 
The iMDA DNA and genomic DNA sequencing had comparable coverage: 99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage, while maintaining both balance and representation of the genome. The iMDA protocol, in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
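The statement that 10 fg of template is roughly 1-2 bacterial cells' worth of DNA can be checked with a short worked calculation. The sketch below rests on two assumptions not given in the abstract: an average molar mass of about 650 g/mol per base pair of double-stranded DNA, and a hypothetical genome size of 5 Mb for the template organism.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MOLAR_MASS = 650.0         # g/mol per base pair (approximate, assumed)


def genome_mass_fg(genome_bp):
    """Mass of one double-stranded genome copy in femtograms."""
    grams = genome_bp * BP_MOLAR_MASS / AVOGADRO
    return grams * 1e15


def genome_copies(mass_fg, genome_bp):
    """Number of genome copies contained in a given DNA mass."""
    return mass_fg / genome_mass_fg(genome_bp)


# Hypothetical 5 Mb bacterial genome: about 5.4 fg per copy,
# so 10 fg corresponds to a little under 2 genome copies.
one_copy_fg = genome_mass_fg(5_000_000)
copies_in_10_fg = genome_copies(10.0, 5_000_000)
```

Under the same assumptions, the 1 pg input used for the coverage experiment corresponds to roughly 185 genome copies, consistent with the abstract calling it a non-stochastic amount.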
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
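The consensus 5'-RTGT*AY can be located in a sequence by expanding the IUPAC ambiguity codes into a regular expression, as in the minimal sketch below. This only illustrates how such a motif is read; it is not one of the four analysis methods used in the study. The helper names are hypothetical, only the forward strand is scanned, and cut_offset=4 encodes the position of the asterisk (after the fourth base of the motif).

```python
import re

# IUPAC codes needed for this motif: R = purine (A/G), Y = pyrimidine (C/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def motif_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[symbol] for symbol in motif)
    return re.compile("(?=(" + body + "))")   # lookahead keeps matches overlapping


def find_sites(seq, motif="RTGTAY", cut_offset=4):
    """Return (motif_start, cut_position) pairs, 0-based, forward strand only.

    cut_position is the index of the base immediately 3' of the cleavage point.
    """
    pattern = motif_regex(motif)
    return [(m.start(), m.start() + cut_offset) for m in pattern.finditer(seq.upper())]
```

Scanning the reverse complement as well would be needed to cover both strands.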
Direct Detection and Sequencing of Damaged DNA Bases
Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas
2011-12-20
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Genomics dataset on unclassified published organism (patent US 7547531).
Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier
2016-12-01
Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism, and is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which may help open doors for new collaborations. In this dataset, a total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. Quick response (QR) codes of those DNA sequences were constructed with the DNA BarID tool. QR codes are useful for the identification and comparison of isolates with other organisms. The AT/GC content of the DNA sequences, which indicates their stability at different temperatures, was determined using the ENDMEMO GC Content Calculator. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code indicating the 5', 3' and blunt ends, and the enzyme code indicating the methylation site of the DNA sequences is also shown. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
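As a rough illustration of the GC-content figures quoted above, the percentage can be computed directly from a sequence string. This is a minimal sketch, not the ENDMEMO calculator itself; the function name gc_content and the example sequence are hypothetical, and the choice to ignore symbols other than A, C, G and T is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence.

    Assumption: only A, C, G and T are counted; ambiguity codes such as N
    are ignored rather than treated as errors.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 10-base example: 6 of the 10 bases are G or C.
example = "ATGCGCGCTA"
print(f"GC content: {gc_content(example):.2f}%")
```

The AT percentage follows as 100 minus the GC percentage under the same counting convention.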
Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen
2009-06-01
The aim was to obtain the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis by RACE PCR, and then to investigate the character of the gene. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene was obtained by 3'-RACE and 5'-RACE from Dysosma versipellis. This is the first report of the full cDNA sequence of Secoisolariciresinol Dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank nucleotide accession number of the gene is EU573789. Semi-quantitative RT-PCR analysis revealed that the Secoisolariciresinol Dehydrogenase gene was highly expressed in the stem. Alignment of the amino acid sequence of Secoisolariciresinol Dehydrogenase indicated that there may be some significant amino acid sequence differences among different species. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis was obtained.
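The lengths reported above can be cross-checked with simple arithmetic. The sketch below assumes that the 112 bp figure for the 3' untranslated region includes the poly(A) tail and that the ORF length includes the stop codon; neither assumption is stated in the abstract, and the variable names are illustrative.

```python
# Lengths reported in the abstract (base pairs).
full_length_bp = 991
utr5_bp = 42
utr3_bp = 112  # assumed to include the poly(A) tail

# What remains after removing both untranslated regions is the ORF.
orf_bp = full_length_bp - utr5_bp - utr3_bp

# Each codon is 3 bases; the final codon is assumed to be the stop codon.
codons, remainder = divmod(orf_bp, 3)
amino_acids = codons - 1

print(f"ORF length: {orf_bp} bp, codons: {codons}, leftover bases: {remainder}")
print(f"Encoded amino acids (excluding stop): {amino_acids}")
```

Under these assumptions the ORF is a whole number of codons and the residue count agrees with the 278 amino acids reported.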
Recurrence time statistics: versatile tools for genomic DNA sequence analysis.
Cao, Yinhe; Tung, Wen-Wen; Gao, J B
2004-01-01
With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important classes of structure in a DNA sequence is repeat-related. Repeats often have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
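One simple way to picture a recurrence-time statistic is to record, for every word of length k, the distance back to its previous occurrence, and then histogram those distances. The sketch below follows that reading; it is an assumption made for illustration and may differ from the exact definition used by the authors. The function name recurrence_times and the parameter k are hypothetical.

```python
from collections import Counter


def recurrence_times(seq: str, k: int) -> Counter:
    """Histogram of distances between successive occurrences of each k-mer.

    Assumption: the recurrence time of a word at position i is the distance
    to the most recent earlier position holding the same word.
    """
    last_seen = {}
    histogram = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            histogram[i - last_seen[word]] += 1
        last_seen[word] = i
    return histogram


# A perfect tandem repeat of period 3: every recurrence time equals 3.
tandem = "ACG" * 5
print(recurrence_times(tandem, k=3))
```

In this picture a periodic or repeat-rich region shows up as a sharp peak in the histogram at the repeat period, whereas a sequence without repeats spreads its counts over many distances.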
Hendrickson, Edwin R.; Payne, Jo Ann; Young, Roslyn M.; Starr, Mark G.; Perry, Michael P.; Fahnestock, Stephen; Ellis, David E.; Ebersole, Richard C.
2002-01-01
The environmental distribution of Dehalococcoides group organisms and their association with chloroethene-contaminated sites were examined. Samples from 24 chloroethene-dechlorinating sites scattered throughout North America and Europe were tested for the presence of members of the Dehalococcoides group by using a PCR assay developed to detect Dehalococcoides 16S rRNA gene (rDNA) sequences. Sequences identified by sequence analysis as sequences of members of the Dehalococcoides group were detected at 21 sites. Full dechlorination of chloroethenes to ethene occurred at these sites. Dehalococcoides sequences were not detected in samples from three sites at which partial dechlorination of chloroethenes occurred, where dechlorination appeared to stop at 1,2-cis-dichloroethene. Phylogenetic analysis of the 16S rDNA amplicons confirmed that Dehalococcoides sequences formed a unique 16S rDNA group. These 16S rDNA sequences were divided into three subgroups based on specific base substitution patterns in variable regions 2 and 6 of the Dehalococcoides 16S rDNA sequence. Analyses also demonstrated that specific base substitution patterns were signature patterns. The specific base substitutions distinguished the three sequence subgroups phylogenetically. These results demonstrated that members of the Dehalococcoides group are widely distributed in nature and can be found in a variety of geological formations and in different climatic zones. Furthermore, the association of these organisms with full dechlorination of chloroethenes suggests that they are promising candidates for engineered bioremediation and may be important contributors to natural attenuation of chloroethenes. PMID:11823182
Leakey, Tatiana I; Zielinski, Jerzy; Siegfried, Rachel N; Siegel, Eric R; Fan, Chun-Yang; Cooney, Craig A
2008-06-01
DNA methylation at cytosines is a widely studied epigenetic modification. Methylation is commonly detected using bisulfite modification of DNA followed by PCR and additional techniques such as restriction digestion or sequencing. These additional techniques are either laborious, require specialized equipment, or are not quantitative. Here we describe a simple algorithm that yields quantitative results from analysis of conventional four-dye-trace sequencing. We call this method Mquant and we compare it with the established laboratory method of combined bisulfite restriction assay (COBRA). This analysis of sequencing electropherograms provides a simple, easily applied method to quantify DNA methylation at specific CpG sites.
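The abstract does not give the Mquant formula. A common convention for bisulfite-treated DNA, used here purely as an assumption, is that unmethylated cytosines are read as T and methylated cytosines remain C, so the methylated fraction at a CpG site can be estimated from the two trace peak heights as C / (C + T). The function name and the peak values below are hypothetical.

```python
def methylation_fraction(c_peak: float, t_peak: float) -> float:
    """Estimate the methylated fraction at one CpG site from trace peak heights.

    Assumption: after bisulfite conversion the C peak reflects methylated
    templates and the T peak reflects unmethylated templates, with no
    correction for dye-specific signal differences.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return c_peak / total


# Hypothetical peak heights in arbitrary fluorescence units.
print(f"{methylation_fraction(300.0, 100.0):.0%} methylated")
```

A practical implementation would likely also need to normalise peak heights between dye channels, which this sketch leaves out.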
NASA Technical Reports Server (NTRS)
Venkateswaran, Kasthuri; Kempf, Michael; Chen, Fei; Satomi, Masataka; Nicholson, Wayne; Kern, Roger
2003-01-01
One of the spore-formers isolated from a spacecraft-assembly facility, belonging to the genus Bacillus, is described on the basis of phenotypic characterization, 16S rDNA sequence analysis and DNA-DNA hybridization studies. It is a Gram-positive, facultatively anaerobic, rod-shaped eubacterium that produces endospores. The spores of this novel bacterial species exhibited resistance to UV, gamma-radiation, H2O2 and desiccation. The 16S rDNA sequence analysis revealed a clear affiliation between this strain and members of the low G+C Firmicutes. High 16S rDNA sequence similarity values were found with members of the genus Bacillus and this was supported by fatty acid profiles. The 16S rDNA sequence similarity between strain FO-92T and Bacillus benzoevorans DSM 5391T was very high. However, molecular characterizations employing small-subunit 16S rDNA sequences were at the limits of resolution for the differentiation of species in this genus, but DNA-DNA hybridization data support the proposal of FO-92T as Bacillus nealsonii sp. nov. (type strain is FO-92T =ATCC BAAM-519T =DSM 15077T).
Papasotiropoulos, Vasilis; Klossa-Kilia, Elena; Alahiotis, Stamatis N; Kilias, George
2007-08-01
Mitochondrial DNA sequence analysis has been used to explore genetic differentiation and phylogenetic relationships among five species of the Mugilidae family, Mugil cephalus, Chelon labrosus, Liza aurata, Liza ramada, and Liza saliens. DNA was isolated from samples originating from the Messolongi Lagoon in Greece. Three mtDNA segments (12s rRNA, 16s rRNA, and CO I) were PCR amplified and sequenced. Sequencing analysis revealed that the greatest genetic differentiation was observed between M. cephalus and all the other species studied, while C. labrosus and L. aurata were the closest taxa. Dendrograms obtained by the neighbor-joining method and Bayesian inference analysis exhibited the same topology. According to this topology, M. cephalus is the most distinct species and the remaining taxa are clustered together, with C. labrosus and L. aurata forming a single group. The latter result brings into question the monophyletic origin of the genus Liza.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing
Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi
2016-01-01
Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
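The detrended fluctuation analysis mentioned above can be sketched in a few steps: map the sequence to a +1/-1 "DNA walk", cut the walk into windows of length n, remove a least-squares line from each window, and measure the root-mean-square residual F(n); the slope of log F(n) against log n is the scaling exponent. The code below is a minimal sketch under stated assumptions: the purine/pyrimidine step rule, non-overlapping windows, linear detrending and the chosen window sizes are illustrative choices, and all function names are hypothetical.

```python
import math
import random


def dna_walk(seq):
    """Cumulative sum of mean-removed steps (pyrimidine = +1, purine = -1)."""
    steps = [1.0 if b in "CT" else -1.0 for b in seq.upper() if b in "ACGT"]
    mean = sum(steps) / len(steps)
    walk, total = [], 0.0
    for s in steps:
        total += s - mean
        walk.append(total)
    return walk


def fluctuation(walk, n):
    """RMS residual after removing a least-squares line from each window of length n."""
    x_mean = (n - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    squared, count = 0.0, 0
    for start in range(0, len(walk) - n + 1, n):
        window = walk[start:start + n]
        y_mean = sum(window) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window)) / sxx
        for x, y in enumerate(window):
            residual = y - (y_mean + slope * (x - x_mean))
            squared += residual * residual
            count += 1
    return math.sqrt(squared / count)


def dfa_exponent(seq, window_sizes):
    """Slope of log F(n) versus log n, fitted by ordinary least squares."""
    walk = dna_walk(seq)
    xs = [math.log(n) for n in window_sizes]
    ys = [math.log(fluctuation(walk, n)) for n in window_sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic uncorrelated sequence (not real genomic data), fixed seed for repeatability.
rng = random.Random(0)
random_seq = "".join(rng.choice("ACGT") for _ in range(4000))
alpha = dfa_exponent(random_seq, [8, 16, 32, 64, 128])
print(f"alpha for an uncorrelated random sequence: {alpha:.2f}")
```

For an uncorrelated sequence the exponent is expected to lie near 0.5, while long-range correlations of the kind described for non-coding regions would show up as a larger exponent.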
Parallel gene analysis with allele-specific padlock probes and tag microarrays
Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats
2003-01-01
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
van der Kuyl, A C; Kuiken, C L; Dekker, J T; Perizonius, W R; Goudsmit, J
1995-06-01
Monkey mummy bones and teeth originating from the North Saqqara Baboon Galleries (Egypt), soft tissue from a mummified baboon in a museum collection, and nineteenth/twentieth-century skin fragments from mangabeys were used for DNA extraction and PCR amplification of part of the mitochondrial 12S rRNA gene. Sequences aligning with the 12S rRNA gene were recovered but were only distantly related to contemporary monkey mitochondrial 12S rRNA sequences. However, many of these sequences were identical or closely related to human nuclear DNA sequences resembling mitochondrial 12S rRNA (isolated from a cell line depleted in mitochondria) and therefore have to be considered contamination. Subsequently in a separate study we were able to recover genuine mitochondrial 12S rRNA sequences from many extant species of nonhuman Old World primates and sequences closely resembling the human nuclear integrations. Analysis of all sequences by the neighbor-joining (NJ) method indicated that mitochondrial DNA sequences and their nuclear counterparts can be divided into two distinct clusters. One cluster contained all contemporary cytoplasmic mitochondrial DNA sequences and approximately half of the monkey nuclear mitochondrial-like sequences. A second cluster contained most human nuclear sequences and the other half of monkey nuclear sequences with a separate branch leading to human and gorilla mitochondrial and nuclear sequences. Sequences recovered from ancient materials were equally divided between the two clusters. These results constitute a warning when working with ancient DNA or performing phylogenetic analysis using mitochondrial DNA as a target sequence: nuclear counterparts of mitochondrial genes may lead to faulty interpretation of results.
Bacterial identification and subtyping using DNA microarray and DNA sequencing.
Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D
2012-01-01
The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
Yasuno, Rie; Wada, Hajime
1998-01-01
Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A
2017-03-01
Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents of endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Blood samples collected from residents of filariasis-endemic areas in five governorates were subjected to semi-nested PCR targeting a repeated DNA sequence for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed the PCR results, identifying only the W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. The results demonstrated that semi-nested PCR is an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable to samples collected in the daytime as well as at night. Sequencing of PCR products and phylogenetic analysis revealed three different nucleotide sequence variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.
Anwar, R; Booth, A; Churchill, A J; Markham, A F
1996-01-01
The determination of nucleotide sequence is fundamental to the identification and molecular analysis of genes. Direct sequencing of PCR products is now becoming a commonplace procedure for haplotype analysis, and for defining mutations and polymorphism within genes, particularly for diagnostic purposes. A previously unrecognised phenomenon, primer related variability, observed in sequence data generated using Taq cycle sequencing and T7 Sequenase sequencing, is reported. This suggests that caution is necessary when interpreting DNA sequence data. This is particularly important in situations where treatment may be dependent on the accuracy of the molecular diagnosis. PMID:16696096
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing DNA sequences in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies the statistical properties of the DNA sequences of a complete genome using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and the standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of the periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
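The mapping-plus-Fourier idea can be illustrated with one particular mapping. The sketch below uses binary indicator sequences (one per base), which is only one of the possible mappings and is an assumption here, and evaluates the summed Fourier power at a single chosen period; the function name spectral_power and the test sequence are hypothetical.

```python
import cmath


def spectral_power(seq: str, period: float) -> float:
    """Summed Fourier power of the four base-indicator sequences at one period.

    Assumption: each base is mapped to a 0/1 indicator sequence, and the
    power is normalised by the sequence length.
    """
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        coefficient = sum(
            cmath.exp(-2j * cmath.pi * n / period)
            for n, b in enumerate(seq)
            if b == base
        )
        total += abs(coefficient) ** 2
    return total / len(seq)


# Synthetic sequence with an exact period of 3 (not real genomic data).
periodic = "ATG" * 30
print(f"power at period 3: {spectral_power(periodic, 3):.2f}")
print(f"power at period 4: {spectral_power(periodic, 4):.2f}")
```

Protein-coding regions are widely reported to show elevated power at period 3 because of codon structure, so comparing the period-3 power with the power at neighbouring periods is one way such a statistic could separate coding from non-coding segments.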
Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw
2017-01-01
Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096
USDA-ARS's Scientific Manuscript database
Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...
Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.
Froim, D; Hopkins, C E; Belenky, A; Cohen, A S
1997-11-01
The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range which yields a 'clean', well defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation. PMID:9336449
Processing and population genetic analysis of multigenic datasets with ProSeq3 software.
Filatov, Dmitry A
2009-12-01
The current tendency in molecular population genetics is to use increasing numbers of genes in the analysis. Here I describe a program for handling and population genetic analysis of DNA polymorphism data collected from multiple genes. The program includes a sequence/alignment editor and an internal relational database that simplify the preparation and manipulation of multigenic DNA polymorphism datasets. The most commonly used DNA polymorphism analyses are implemented in ProSeq3, facilitating population genetic analysis of large multigenic datasets. Extensive input/output options make ProSeq3 a convenient hub for sequence data processing and analysis. The program is available free of charge from http://dps.plants.ox.ac.uk/sequencing/proseq.htm.
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, driven primarily by the Human Genome Project. Among the newly proposed techniques, nanopore sequencing is considered a suitable candidate for sequencing single DNA molecules at ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopore sequencing has demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has shed light on how to achieve the ultimate goal: sequencing an individual DNA strand at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Analysis of intraspecific patterns in genetic diversity of stream fishes provides a potentially powerful method for assessing the status and trends in the condition of aquatic ecosystems. We analyzed mitochondrial DNA (mtDNA) sequences (590 bases of cytochrome B) and nuclear DNA...
Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi
2013-01-01
Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.
Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca
2015-01-01
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Company profile: Complete Genomics Inc.
Reid, Clifford
2011-02-01
Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.
Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H
2010-01-01
It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features behind its superior value. This could be achieved through genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then completed the full sequencing of the cDNA clone on both strands using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. The multiple sequence alignment (using ClustalW) of E. oleifera Chyb with selected Chyb amino acid sequences from other plant species and algal members, together with its phylogenetic analysis, suggests that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays are the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations: ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster. PMID:21364789
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.
Onaga, Lisa A
2014-06-01
The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work from textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocol for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.
Comprehensive Analysis of DNA Methylation Data with RnBeads
Walter, Jörn; Lengauer, Thomas; Bock, Christoph
2014-01-01
RnBeads is a software tool for large-scale analysis and interpretation of DNA methylation data, providing a user-friendly analysis workflow that yields detailed hypertext reports (http://rnbeads.mpi-inf.mpg.de). Supported assays include whole genome bisulfite sequencing, reduced representation bisulfite sequencing, Infinium microarrays, and any other protocol that produces high-resolution DNA methylation data. Important applications of RnBeads include the analysis of epigenome-wide association studies and epigenetic biomarker discovery in cancer cohorts. PMID:25262207
Alignment of high-throughput sequencing data inside in-memory databases.
Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias
2014-01-01
In times of high-throughput DNA sequencing techniques, high-performance analysis of DNA sequences is of great importance. Computer-supported DNA analysis is still an intensive, time-consuming task. In this paper we explore the potential of a new in-memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL was also run in memory, utilizing its integrated memory engine for database table creation. We implemented stored procedures for exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, the performance of HANA and MySQL was compared using the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which means that there is high potential in the new in-memory concepts, leading to further developments of DNA analysis procedures in the future.
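The exact search step mentioned above can be illustrated with a minimal sketch of backward search over a Burrows-Wheeler transform, the general idea behind BWA-style exact matching. This is not the authors' stored-procedure implementation (which was written for HANA and MySQL); it is plain Python for a toy reference, and the names `build_fm_index` and `exact_search` are hypothetical. The naive suffix sorting is only practical for very short strings.

```python
from collections import Counter

def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for text."""
    text += "$"  # sentinel; sorts before A, C, G and T
    # Naive suffix array: fine for a toy reference, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch] = number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return sa, c_table, occ

def exact_search(read, index):
    """Return the sorted start positions of read in the indexed text."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(read):  # extend the match one base to the left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])

index = build_fm_index("ACGTACGTTAGC")
print(exact_search("ACGT", index))
```

Each base of the read narrows the suffix-array range with two table lookups, so the search cost depends on the read length rather than on the reference length. Inexact matching typically adds a recursive or backtracking exploration of mismatches on top of this loop.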
Chaotic Image Encryption Algorithm Based on Bit Permutation and Dynamic DNA Encoding.
Zhang, Xuncai; Han, Feng; Niu, Ying
2017-01-01
Drawing on the sensitivity of chaos to initial conditions and its pseudorandomness, combined with the spatial configuration of DNA molecules and their inherent and unique information-processing ability, a novel image encryption algorithm based on bit permutation and dynamic DNA encoding is proposed here. The algorithm first uses Keccak to calculate the hash value of a given DNA sequence as the initial value of a chaotic map; second, it uses a chaotic sequence to scramble the image pixel locations, and a butterfly network is used to implement the bit permutation. Then, the image is dynamically coded into a DNA matrix, and an algebraic operation is performed with the DNA sequence to realize the substitution of the pixels, which further improves the security of the encryption. Finally, the confusion and diffusion properties of the algorithm are further enhanced by the operation of the DNA sequence and the ciphertext feedback. The results of the experiment and security analysis show that the algorithm not only has a large key space and strong sensitivity to the key but can also effectively resist attacks such as statistical analysis and exhaustive analysis. PMID:28912802
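To make the idea of dynamic DNA encoding more concrete, here is a minimal sketch under several assumptions that the abstract does not confirm: a logistic map stands in for the chaotic map, `hashlib.sha3_256` (the standardized SHA-3 variant of Keccak) stands in for the hash, and the per-pixel rule selection is illustrative. The bit permutation, pixel scrambling and ciphertext feedback stages are not shown, and all function names are hypothetical.

```python
import hashlib

# The eight DNA coding rules that respect complementarity: each string lists
# the bases assigned to the 2-bit values 00, 01, 10, 11, so that A/T and C/G
# always receive bitwise-complementary codes.
RULES = ["ACGT", "AGCT", "CATG", "CTAG", "GATC", "GTAC", "TCGA", "TGCA"]

def seed_from_dna(key_seq):
    """Map a DNA key string to a logistic-map seed strictly inside (0, 1)."""
    digest = hashlib.sha3_256(key_seq.encode("ascii")).digest()
    value = int.from_bytes(digest[:6], "big")  # 48 bits keeps float precision
    return (value + 1) / (2**48 + 2)

def logistic_stream(x0, n, r=3.99):
    """Return n successive iterates of the logistic map x -> r*x*(1-x)."""
    out, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        out.append(x)
    return out

def pick_rule(x):
    """Choose one of the eight coding rules from a chaotic value in [0, 1)."""
    return RULES[min(int(x * 8), 7)]

def encode_pixels(pixels, key_seq):
    """Encode each 8-bit pixel as four bases, with a rule chosen per pixel."""
    chaos = logistic_stream(seed_from_dna(key_seq), len(pixels))
    return ["".join(pick_rule(x)[(p >> s) & 3] for s in (6, 4, 2, 0))
            for p, x in zip(pixels, chaos)]

def decode_pixels(encoded, key_seq):
    """Invert encode_pixels given the same DNA key sequence."""
    chaos = logistic_stream(seed_from_dna(key_seq), len(encoded))
    pixels = []
    for bases, x in zip(encoded, chaos):
        rule, value = pick_rule(x), 0
        for b in bases:
            value = (value << 2) | rule.index(b)
        pixels.append(value)
    return pixels

encoded = encode_pixels([0, 17, 128, 255], "ACGTTGCA")
print(encoded, decode_pixels(encoded, "ACGTTGCA"))
```

Because the rule changes from pixel to pixel according to the key-dependent chaotic stream, the same pixel value can map to different base strings at different positions, which is the sense in which the encoding is dynamic in this sketch.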
Van Kreijl, C F; Bos, J L
1977-01-01
The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
High-resolution biophysical analysis of the dynamics of nucleosome formation
Hatakeyama, Akiko; Hartmann, Brigitte; Travers, Andrew; Nogues, Claude; Buckle, Malcolm
2016-01-01
We describe a biophysical approach that enables changes in the structure of DNA to be followed during nucleosome formation in in vitro reconstitution with either the canonical “Widom” sequence or a judiciously mutated sequence. The rapid non-perturbing photochemical analysis presented here provides ‘snapshots’ of the DNA configuration at any given moment in time during nucleosome formation under a very broad range of reaction conditions. Changes in DNA photochemical reactivity upon protein binding are interpreted as being mainly induced by alterations in individual base pair roll angles. The results strengthen the importance of the role of an initial (H3/H4)2 histone tetramer-DNA interaction and highlight the modulation of this early event by the DNA sequence. (H3/H4)2 binding precedes and dictates subsequent H2A/H2B-DNA interactions, which are less affected by the DNA sequence, leading to the final octameric nucleosome. Overall, our results provide a novel, exciting way to investigate those biophysical properties of DNA that constitute a crucial component in nucleosome formation and stabilization. PMID:27263658
Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.
2015-01-01
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX
2011-07-05
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM™, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA™, on the 5' end.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA
2006-08-01
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
Zackay, Arie; Steinhoff, Christine
2010-12-15
Background: Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. At the same time, there is a growing need for tools to process and analyse the data together with statistical investigation and visualisation. Findings: MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets, is presented in this paper. Conclusions: The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data, available in the R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org. PMID:21159174
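The binary coding referred to above can be illustrated with a small sketch: each bisulfite-sequenced clone becomes a row of 0/1 values (1 = methylated) over the CpG sites, from which per-site summary statistics and pairwise co-occurrence follow directly. This is not the MethVisual API (which is an R package); the function names and the toy data are hypothetical, and Python is used only for illustration.

```python
def site_methylation(matrix):
    """Fraction of clones methylated at each CpG site (one value per column)."""
    n_clones = len(matrix)
    n_sites = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_clones for j in range(n_sites)]

def cooccurrence(matrix, i, j):
    """Fraction of clones in which CpG sites i and j are both methylated."""
    return sum(1 for row in matrix if row[i] and row[j]) / len(matrix)

# Toy data: four clones (rows) scored at three CpG sites (columns).
clones = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(site_methylation(clones))
print(cooccurrence(clones, 0, 1))
```

Any other data type that can be reduced to a clones-by-sites binary matrix could be summarized the same way, which is consistent with the abstract's remark that the package is not limited to methylation data.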
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment
Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain
2011-01-01
cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L
2010-07-01
Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087
Church, George M.; Kieffer-Higgins, Stephen
1992-01-01
This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lapidus, Alla L.
From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
Havert, Michael B.; Ji, Lin; Loeb, Daniel D.
2002-01-01
The synthesis of the hepadnavirus relaxed circular DNA genome requires two template switches, primer translocation and circularization, during plus-strand DNA synthesis. Repeated sequences serve as donor and acceptor templates for these template switches, with direct repeat 1 (DR1) and DR2 for primer translocation and 5′r and 3′r for circularization. These donor and acceptor sequences are at, or near, the ends of the minus-strand DNA. Analysis of plus-strand DNA synthesis of duck hepatitis B virus (DHBV) has indicated that there are at least three other cis-acting sequences that make contributions during the synthesis of relaxed circular DNA. These sequences, 5E, M, and 3E, are located near the 5′ end, the middle, and the 3′ end of minus-strand DNA, respectively. The mechanism by which these sequences contribute to the synthesis of plus-strand DNA was unclear. Our aim was to better understand the mechanism by which 5E and M act. We localized the DHBV 5E element to a short sequence of approximately 30 nucleotides that is 100 nucleotides 3′ of DR2 on minus-strand DNA. We found that the new 5E mutants were partially defective for primer translocation/utilization at DR2. They were also invariably defective for circularization. In addition, examination of several new DHBV M variants indicated that they too were defective for primer translocation/utilization and circularization. Thus, this analysis indicated that 5E and M play roles in both primer translocation/utilization and circularization. In conjunction with earlier findings that 3E functions in both template switches, our findings indicate that the processes of primer translocation and circularization share a common underlying mechanism. PMID:11861843
Escaping introns in COI through cDNA barcoding of mushrooms: Pleurotus as a test case.
Avin, Farhat A; Subha, Bhassu; Tan, Yee-Shin; Braukmann, Thomas W A; Vikineswary, Sabaratnam; Hebert, Paul D N
2017-09-01
DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5' terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus and compared these results with those for ITS. The ability of the COI gene to discriminate six species of Pleurotus, the commonly cultivated oyster mushroom, was examined by analysis of cDNA. The amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5' region was obtained. The COI sequences permitted the resolution of all species when partial data were excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.
Young, Brian; King, Jonathan L; Budowle, Bruce; Armogida, Luigi
2017-01-01
Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described. PMID:28542338
Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang
2012-05-01
This study attempted, for the first time, to amplify and analyze complete 18S ribosomal DNA (rDNA) sequences for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated with DNA Release Additive and Hot Start II DNA Polymerase to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield was highest at 1 mite and tended to decrease as the number of mites increased. The individual mite gDNA was successfully used to amplify an 18S rDNA fragment (about 900 bp). The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥1000 mites/sample) showed over 97% identity for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion ratio and phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that individual Demodex mite gDNA is sufficient for molecular study of Demodex. The 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the types of Demodex mites parasitizing dogs have been ascertained. Copyright © 2012 Elsevier Inc. All rights reserved.
Tamori, Akihiro; Yamanishi, Yoshihiro; Kawashima, Shuichi; Kanehisa, Minoru; Enomoto, Masaru; Tanaka, Hiromu; Kubo, Shoji; Shiomi, Susumu; Nishiguchi, Shuhei
2005-08-15
Integration of hepatitis B virus (HBV) DNA into the human genome is one of the most important steps in HBV-related carcinogenesis. This study attempted to find the link between HBV DNA, the adjoining cellular sequence, and altered gene expression in hepatocellular carcinoma (HCC) with integrated HBV DNA. We examined 15 cases of HCC infected with HBV by cassette ligation-mediated PCR. The human DNA adjacent to the integrated HBV DNA was sequenced. Protein coding sequences were searched for in the human sequence. In five cases with HBV DNA integration, from which good quality RNA was extracted, gene expression was examined by cDNA microarray analysis. The human DNA sequence successive to integrated HBV DNA was determined in the 15 HCCs. Eight protein-coding regions were involved: ras-responsive element binding protein 1, calmodulin 1, mixed lineage leukemia 2 (MLL2), FLJ333655, LOC220272, LOC255345, LOC220220, and LOC168991. The MLL2 gene was expressed in three cases with HBV DNA integrated into exon 3 of MLL2 and in one case with HBV DNA integrated into intron 3 of MLL2. Gene expression analysis suggested that two HCCs with HBV integrated into MLL2 had similar patterns of gene expression compared with three HCCs with HBV integrated into other loci of human chromosomes. HBV DNA was integrated at random sites of human DNA, and the MLL2 gene was one of the targets for integration. Our results suggest that HBV DNA might modulate human genes near integration sites, followed by integration site-specific expression of such genes during hepatocarcinogenesis.
HLA genotyping by next-generation sequencing of complementary DNA.
Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya
2017-11-28
Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. 
Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
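The candidate-construction principle described in this abstract (build every combination of the variable bases, then order candidates by read count) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: reads are assumed to be pre-aligned to one target region and of equal length, and the function name `rank_candidates` and the `min_fraction` cutoff are hypothetical.

```python
from collections import Counter
from itertools import product


def rank_candidates(reads, min_fraction=0.2):
    """Rank candidate sequences for one target region by supporting reads.

    Assumption: reads are already aligned and share the same length.
    """
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must be pre-aligned to equal length")

    # Collect the bases seen at each position above a frequency cutoff
    # (hypothetical cutoff; falls back to the most common base).
    options = []
    for i in range(length):
        counts = Counter(r[i] for r in reads)
        kept = sorted(b for b, n in counts.items()
                      if n / len(reads) >= min_fraction)
        options.append(kept or [counts.most_common(1)[0][0]])

    # Every combination of the variable bases is a candidate sequence.
    read_counts = Counter(reads)
    ranked = [("".join(c), read_counts["".join(c)]) for c in product(*options)]

    # Decreasing number of exactly matching reads; ties broken alphabetically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


reads = ["ACGT"] * 5 + ["ATGA"] * 4 + ["ACGA"]
for candidate, support in rank_candidates(reads):
    print(candidate, support)
```

In a diploid sample the two best-supported candidates would be the natural haplotype guesses, while combinations with little or no read support correspond to the artificial haplotypes that the abstract says are removed in a later step.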
NASA Astrophysics Data System (ADS)
Yang, Hong
Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny, conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
[Replication of Streptomyces plasmids: the DNA nucleotide sequence of plasmid pSB 24.2].
Bolotin, A P; Sorokin, A V; Aleksandrov, N N; Danilenko, V N; Kozlov, Iu I
1985-11-01
The nucleotide sequence of plasmid pSB 24.2, a natural deletion derivative of plasmid pSB 24.1 isolated from S. cyanogenus, was studied. The plasmid is 3706 nucleotide pairs in size, and its G-C content is 73 per cent. Analysis of the DNA structure of plasmid pSB 24.2 revealed a protein-encoding DNA sequence, the continuity of which was significant for replication of the plasmid containing more than 1300 nucleotide pairs. The analysis also revealed two A-T-rich regions of DNA, with a G-C content of less than 55 per cent, and a DNA region with a branched hairpin structure. The results may be of value in investigating plasmid replication in actinomycetes and in experimental cloning of DNA with this plasmid as a vector.
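The G-C figures quoted in this abstract (73 per cent overall, with A-T-rich regions below 55 per cent) are the kind of result a simple sliding-window scan produces. The sketch below is a minimal illustration, not the method used in the paper: the window size, step, and the function names `gc_fraction` and `low_gc_windows` are assumptions chosen for the example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def low_gc_windows(seq, window=100, step=50, threshold=0.55):
    """Return (start, gc_fraction) for windows whose G-C content is below threshold."""
    hits = []
    for start in range(0, max(len(seq) - window + 1, 1), step):
        frac = gc_fraction(seq[start:start + window])
        if frac < threshold:
            hits.append((start, round(frac, 3)))
    return hits


# Toy sequence: a G-C-only half followed by an A-T-only half.
toy = "GC" * 50 + "AT" * 50
print(gc_fraction(toy))      # overall G-C fraction
print(low_gc_windows(toy))   # windows falling below the 55 per cent threshold
```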
Wang, Yongming; Lin, Xiuyun; Dong, Bo; Wang, Yingdian; Liu, Bao
2004-01-01
RAPD (randomly amplified polymorphic DNA) and ISSR (inter-simple sequence repeat) fingerprinting on HpaII/MspI-digested genomic DNA of nine elite japonica rice cultivars implies inter-cultivar DNA methylation polymorphism. Using both DNA fragments isolated from RAPD or ISSR gels and selected low-copy sequences as probes, methylation-sensitive Southern blot analysis confirms the existence of extensive DNA methylation polymorphism in both genes and DNA repeats among the rice cultivars. The cultivar-specific methylation patterns are stably maintained, and can be used as reliable molecular markers. Transcriptional analysis of four selected sequences (RdRP, AC9, HSP90 and MMR) on leaves and roots from normal and 5-azacytidine-treated seedlings of three representative cultivars shows an association between the transcriptional activity of one of the genes, the mismatch repair (MMR) gene, and its CG methylation patterns.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.
Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P
1986-02-01
Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations. PMID:3003377
Harper, B; McClain, S; Ganko, E W
2012-08-01
Global regulatory agencies require bioinformatic sequence analysis as part of their safety evaluation for transgenic crops. Analysis typically focuses on encoded proteins and adjacent endogenous flanking sequences. Recently, regulatory expectations have expanded to include all reading frames of the inserted DNA. The intent is to provide biologically relevant results that can be used in the overall assessment of safety. This paper evaluates the relevance of assessing the allergenic potential of all DNA reading frames found in common food genes using methods considered for the analysis of T-DNA sequences used in transgenic crops. FASTA and BLASTX algorithms were used to compare genes from maize, rice, soybean, cucumber, melon, watermelon, and tomato using international regulatory guidance. Results show that BLASTX for maize yielded 7254 alignments that exceeded allergen similarity thresholds and 210,772 alignments that matched eight or more consecutive amino acids with an allergen; other crops produced similar results. This analysis suggests that each nontransgenic crop has a much greater potential for allergenic risk than what has been observed clinically. We demonstrate that a meaningful safety assessment is unlikely to be provided by using methods with inherently high frequencies of false positive alignments when broadly applied to all reading frames of DNA sequence. Copyright © 2012 Elsevier Inc. All rights reserved.
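To make the "all reading frames" and "eight or more consecutive amino acids" criteria concrete, the sketch below translates a DNA string in six frames and looks for exact 8-residue windows shared with a reference protein. It is a simplified stand-in under stated assumptions, not the FASTA or BLASTX procedure used in the paper: the standard genetic code is assumed, matching is exact, and the function names (`translate`, `six_frames`, `shared_windows`) and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate one frame; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate three forward and three reverse-complement frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse) for offset in range(3)]


def shared_windows(query, reference, k=8):
    """Return k-residue windows of `query` that occur exactly in `reference`."""
    ref_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return [query[i:i + k] for i in range(len(query) - k + 1)
            if query[i:i + k] in ref_windows]


print(six_frames("ATGGCCTAAGGC"))
print(shared_windows("MKTAYIAKQRQISFVK", "XXTAYIAKQRYY"))
```

Because every inserted sequence yields six translations and each translation yields many overlapping windows, the number of comparisons grows quickly, which is consistent with the abstract's point that applying such screens to all reading frames produces large numbers of chance matches.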
Murray, V
1999-01-01
This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Noncoding sequence classification based on wavelet transform analysis: part I
NASA Astrophysics Data System (ADS)
Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.
2017-09-01
DNA sequences in the human genome can be divided into coding and noncoding ones. Coding sequences are those that are read during transcription. The identification of coding sequences has been widely reported in the literature due to its much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA Elements) and Epigenomic Roadmap projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.
Eernisse, D J
1992-04-01
DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data are compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H; Proukakis, Christos
2017-01-01
Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array “waves”, and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance. PMID:28683077
Bandelt, Hans-Jürgen; Yao, Yong-Gang; Bravi, Claudio M; Salas, Antonio; Kivisild, Toomas
2009-03-01
Sequence analysis of the mitochondrial genome has become a routine method in the study of mitochondrial diseases. Quite often, the sequencing efforts in the search of pathogenic or disease-associated mutations are affected by technical and interpretive problems, caused by sample mix-up, contamination, biochemical problems, incomplete sequencing, misdocumentation and insufficient reference to previously published data. To assess data quality in case studies of mitochondrial diseases, it is recommended to compare any mtDNA sequence under consideration to their phylogenetically closest lineages available in the Web. The median network method has proven useful for visualizing potential problems with the data. We contrast some early reports of complete mtDNA sequences to more recent total mtDNA sequencing efforts in studies of various mitochondrial diseases. We conclude that the quality of complete mtDNA sequences generated in the medical field in the past few years is somewhat unsatisfactory and may even fall behind that of pioneer manual sequencing in the early nineties. Our study provides a paradigm for an a posteriori evaluation of sequence quality and for detection of potential problems with inferring a pathogenic status of a particular mutation.
USDA-ARS's Scientific Manuscript database
To confirm a hybrid swarm population of Pinus densiflora × P. sylvestris in Jilin, China and to study whether shoot apex morphology of 4-year old seedlings can be correlated with the sequence of a chloroplast DNA simple sequence repeat marker (cpDNA SSR), needles and seeds from P. densiflora, P. syl...
Gu, Chun Tao; Li, Chun Yan; Yang, Li Jie; Huo, Gui Cheng
2014-08-01
A Gram-stain-negative bacterial strain, 10-17(T), was isolated from traditional sourdough in Heilongjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, RNA polymerase β subunit (rpoB) gene sequence analysis, DNA gyrase (gyrB) gene sequence analysis, initiation translation factor 2 (infB) gene sequence analysis, ATP synthase β subunit (atpD) gene sequence analysis, fatty acid methyl ester analysis, determination of DNA G+C content, DNA-DNA hybridization and an analysis of phenotypic features. Strain 10-17(T) was phylogenetically related to Enterobacter hormaechei CIP 103441(T), Enterobacter cancerogenus LMG 2693(T), Enterobacter asburiae JCM 6051(T), Enterobacter mori LMG 25706(T), Enterobacter ludwigii EN-119(T) and Leclercia adecarboxylata LMG 2803(T), having 99.5%, 99.3%, 98.7%, 98.5%, 98.4% and 98.4% 16S rRNA gene sequence similarity, respectively. On the basis of polyphasic characterization data obtained in the present study, a novel species, Enterobacter xiangfangensis sp. nov., is proposed and the type strain is 10-17(T) ( = LMG 27195(T) = NCIMB 14836(T) = CCUG 62994(T)). Enterobacter sacchari Zhu et al. 2013 was reclassified as Kosakonia sacchari comb. nov. on the basis of 16S rRNA, rpoB, gyrB, infB and atpD gene sequence analysis and the type strain is strain SP1(T)( = CGMCC 1.12102(T) = LMG 26783(T)). © 2014 IUMS.
DeBoy, Robert T; Mongodin, Emmanuel F; Emerson, Joanne B; Nelson, Karen E
2006-04-01
In the present study, the chromosomes of two members of the Thermotogales were compared. A whole-genome alignment of Thermotoga maritima MSB8 and Thermotoga neapolitana NS-E has revealed numerous large-scale DNA rearrangements, most of which are associated with CRISPR DNA repeats and/or tRNA genes. These DNA rearrangements do not include the putative origin of DNA replication but move within the same replichore, i.e., the same replicating half of the chromosome (delimited by the replication origin and terminus). Based on cumulative GC skew analysis, both the T. maritima and T. neapolitana lineages contain one or two major inverted DNA segments. Also, based on PCR amplification and sequence analysis of the DNA joints that are associated with the major rearrangements, the overall chromosome architecture was found to be conserved at most DNA joints for other strains of T. neapolitana. Taken together, the results from this analysis suggest that the observed chromosomal rearrangements in the Thermotogales likely occurred by successive inversions after their divergence from a common ancestor and before strain diversification. Finally, sequence analysis shows that size polymorphisms in the DNA joints associated with CRISPRs can be explained by expansion and possibly contraction of the DNA repeat and spacer unit, providing a tool for discerning the relatedness of strains from different geographic locations.
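The cumulative GC skew analysis mentioned in the abstract above can be illustrated with a minimal Python sketch. The window size, the function name `cumulative_gc_skew`, and the toy sequence are assumptions made for illustration; the abstract does not state the parameters the authors used. Per-window skew is taken as (G - C) / (G + C) and summed along the sequence. Extremes of the resulting curve are commonly read as approximate positions of the replication origin and terminus, and a local reversal of the trend may hint at an inverted segment.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the cumulative GC skew, one value per non-overlapping window."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Skew is (G - C) / (G + C); a window without G or C contributes 0.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        curve.append(total)
    return curve


# Toy "chromosome" (hypothetical): a G-rich half followed by a C-rich half.
toy = "GGGA" * 50 + "CCCA" * 50
curve = cumulative_gc_skew(toy, window=40)
peak = curve.index(max(curve))
print(peak, curve[peak])
```

On the toy sequence the curve rises across the G-rich half and falls across the C-rich half, so the maximum marks the switch point between the two halves.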
First isolation of Actinobacillus genomospecies 2 in Japan.
Murakami, Miyuki; Shimonishi, Yoshimasa; Hobo, Seiji; Niwa, Hidekazu; Ito, Hiroya
2016-05-03
We describe here the first isolation of Actinobacillus genomospecies 2 in Japan. The isolate was found in a septicemic foal and characterized by phenotypic and genetic analyses, with the latter consisting of 16S rDNA nucleotide sequence analysis plus multilocus sequence analysis using three housekeeping genes, recN, rpoA and thdF, that have been proposed for use as a genomic tool in place of DNA-DNA hybridization.
Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L
1986-01-01
Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. PMID:3458221
2013-01-01
Background The revolution in DNA sequencing technology continues unabated, and is affecting all aspects of the biological and medical sciences. The training and recruitment of the next generation of researchers who are able to use and exploit the new technology is severely lacking and potentially negatively influencing research and development efforts to advance genome biology. Here we present a cross-disciplinary course that provides undergraduate students with practical experience in running a next generation sequencing instrument through to the analysis and annotation of the generated DNA sequences. Results Many labs across the world are installing next generation sequencing technology, and we show that the undergraduate students produced quality sequence data and were excited to participate in cutting edge research. The students conducted the workflow from DNA extraction, library preparation, and running the sequencing instrument, to the extraction and analysis of the data. They sequenced microbes, metagenomes, and a marine mammal, the Californian sea lion, Zalophus californianus. The students met sequencing quality controls, had no detectable contamination in the targeted DNA sequences, provided publication quality data, and became part of an international collaboration to investigate carcinomas in carnivores. Conclusions Students learned important skills for their future education and career opportunities, and a perceived increase in students’ ability to conduct independent scientific research was measured. DNA sequencing is rapidly expanding in the life sciences. Teaching undergraduates to use the latest technology to sequence genomic DNA ensures they are ready to meet the challenges of the genomic era and allows them to participate in annotating the tree of life. PMID:24007365
Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon
2015-01-01
A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the combined use of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis of coding region SNPs is also applicable to ancient samples from East Asia, as a complement to sequence analysis of the mtDNA control region. In Joseon skeleton samples, the mtDNA haplogroup determined by coding region SNP markers fell within the same haplogroup assigned by sequence analysis of the control region. Considering that control region mtDNA sequencing of ancient samples in previous studies produced a considerable number of errors, coding region SNP analysis can serve as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.
Álvarez-Martos, Isabel; Ferapontova, Elena E
2017-08-05
The unique specificity of aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to the discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by replacing the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
BATTLE: Biomarker-Based Approaches of Targeted Therapy for Lung Cancer Elimination
2008-04-01
although a grade 3 neutropenia was dose-limiting in one importance. Th th ubstrate of the CYP3A4 isoenzyme and P-gp. Its metabolism is sensitive to...tratification in clinis Molecular Pathway Biomarkers Type of Analysis EGFR EGFR Mutation (exons 18 to 21) DNA sequencing EGFR Increased Copy Number...polysomy/amplification) DNA FISH K-Ras/B-Raf K-RAS Mutation (codons 12, 13, 61) DNA sequencing B-RAF Mutations (exons 11 and 15) DNA sequencing
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T
1990-01-05
We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
High-throughput analysis of T-DNA location and structure using sequence capture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...
2015-10-07
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
NASA Astrophysics Data System (ADS)
Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.
2016-03-01
Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity. PMID:26965911
PCR Conditions for 16S Primers for Analysis of Microbes in the Colon of Rats.
Guillen, I A; Camacho, H; Tuero, A D; Bacardí, D; Palenzuela, D O; Aguilera, A; Silva, J A; Estrada, R; Gell, O; Suárez, J; Ancizar, J; Brown, E; Colarte, A B; Castro, J; Novoa, L I
2016-09-01
The study of the composition of the intestinal flora is important to the health of the host, playing a key role in maintaining intestinal homeostasis and the evolution of the immune system. For these studies, various universal primers of the 16S rDNA gene are used in microbial taxonomy. Here, we report an evaluation of 5 universal primers to explore the presence of microbial DNA in colon biopsies preserved in RNAlater solution. The DNA extracted was used for the amplification of PCR products containing the variable (V) regions of the microbial 16S rDNA gene. The PCR products were studied by restriction fragment length polymorphism (RFLP) analysis and DNA sequence, whose percent of homology with microbial sequences reported in GenBank was verified using bioinformatics tools. The presence of microbes in the colon of rats was quantified by the quantitative PCR (qPCR) technique. We obtained microbial DNA from rat, useful for PCR analysis with the universal primers for the bacteria 16S rDNA. The sequences of PCR products obtained from a colon biopsy of the animal showed homology with the classes bacilli (Lactobacillus spp) and proteobacteria, normally represented in the colon of rats. The proposed methodology allowed the attainment of DNA of bacteria with the quality and integrity for use in qPCR, sequencing, and PCR-RFLP analysis. The selected universal primers provided knowledge of the abundance of microorganisms and the formation of a preliminary test of bacterial diversity in rat colon biopsies.
Massively parallel sequencing-enabled mixture analysis of mitochondrial DNA samples.
Churchill, Jennifer D; Stoljarova, Monika; King, Jonathan L; Budowle, Bruce
2018-02-22
The mitochondrial genome has a number of characteristics that provide useful information to forensic investigations. Massively parallel sequencing (MPS) technologies offer improvements to the quantitative analysis of the mitochondrial genome, specifically the interpretation of mixed mitochondrial samples. Two-person mixtures with nuclear DNA ratios of 1:1, 5:1, 10:1, and 20:1 of individuals from different and similar phylogenetic backgrounds and three-person mixtures with nuclear DNA ratios of 1:1:1 and 5:1:1 were prepared using the Precision ID mtDNA Whole Genome Panel and Ion Chef, and sequenced on the Ion PGM or Ion S5 sequencer (Thermo Fisher Scientific, Waltham, MA, USA). These data were used to evaluate whether and to what degree MPS mixtures could be deconvolved. Analysis was effective in identifying the major contributor in each instance, while SNPs from the minor contributor's haplotype only were identified in the 1:1, 5:1, and 10:1 two-person mixtures. While the major contributor was identified from the 5:1:1 mixture, analysis of the three-person mixtures was more complex, and the mixed haplotypes could not be completely parsed. These results indicate that mixed mitochondrial DNA samples may be interpreted with the use of MPS technologies.
Bharti, Sanjay Kumar; Sommers, Joshua A.; Zhou, Jun; Kaplan, Daniel L.; Spelbrink, Johannes N.; Mergny, Jean-Louis; Brosh, Robert M.
2014-01-01
Mitochondrial DNA deletions are prominent in human genetic disorders, cancer, and aging. It is thought that stalling of the mitochondrial replication machinery during DNA synthesis is a prominent source of mitochondrial genome instability; however, the precise molecular determinants of defective mitochondrial replication are not well understood. In this work, we performed a computational analysis of the human mitochondrial genome using the “Pattern Finder” G-quadruplex (G4) predictor algorithm to assess whether G4-forming sequences reside in close proximity (within 20 base pairs) to known mitochondrial DNA deletion breakpoints. We then used this information to map G4P sequences with deletions characteristic of representative mitochondrial genetic disorders and also those identified in various cancers and aging. Circular dichroism and UV spectral analysis demonstrated that mitochondrial G-rich sequences near deletion breakpoints prevalent in human disease form G-quadruplex DNA structures. A biochemical analysis of purified recombinant human Twinkle protein (gene product of c10orf2) showed that the mitochondrial replicative helicase inefficiently unwinds well characterized intermolecular and intramolecular G-quadruplex DNA substrates, as well as a unimolecular G4 substrate derived from a mitochondrial sequence that nests a deletion breakpoint described in human renal cell carcinoma. Although G4 has been implicated in the initiation of mitochondrial DNA replication, our current findings suggest that mitochondrial G-quadruplexes are also likely to be a source of instability for the mitochondrial genome by perturbing the normal progression of the mitochondrial replication machinery, including DNA unwinding by Twinkle helicase. PMID:25193669
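The proximity screen described in the abstract above (G4-forming sequences within 20 base pairs of a deletion breakpoint) can be sketched in Python. This is not the "Pattern Finder" algorithm itself, whose rules are not given in the abstract. The sketch assumes the commonly used motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides, scans only the strand it is given, and uses hypothetical names (`G4_PATTERN`, `g4_near_breakpoints`) and a made-up test sequence.

```python
import re

# Assumed motif: four G-runs (>= 3 G each) separated by loops of 1-7 nt.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def g4_near_breakpoints(seq, breakpoints, max_dist=20):
    """List (breakpoint, motif_start, motif_end, distance) for nearby motifs.

    Positions are 0-based; motif_end is exclusive. The distance is 0 when
    the breakpoint falls inside the motif.
    """
    hits = []
    motifs = [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq.upper())]
    for bp in breakpoints:
        for start, end in motifs:
            if start <= bp < end:
                dist = 0
            else:
                dist = min(abs(bp - start), abs(bp - (end - 1)))
            if dist <= max_dist:
                hits.append((bp, start, end, dist))
    return hits


# Hypothetical sequence: one G4 motif flanked by AT repeats.
seq = "AT" * 15 + "GGGTTAGGGTTAGGGTTAGGG" + "AT" * 15
hits = g4_near_breakpoints(seq, breakpoints=[5, 45, 65])
print(hits)
```

A fuller screen would also scan the reverse complement, since a C-rich run on one strand implies a potential G-quadruplex on the other.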
Inaugural Genomics Automation Congress and the coming deluge of sequencing data.
Creighton, Chad J
2010-10-01
Presentations at Select Biosciences's first 'Genomics Automation Congress' (Boston, MA, USA) in 2010 focused on next-generation sequencing and the platforms and methodology around them. The meeting provided an overview of sequencing technologies, both new and emerging. Speakers shared their recent work on applying sequencing to profile cells for various levels of biomolecular complexity, including DNA sequences, DNA copy, DNA methylation, mRNA and microRNA. With sequencing time and costs continuing to drop dramatically, a virtual explosion of very large sequencing datasets is at hand, which will probably present challenges and opportunities for high-level data analysis and interpretation, as well as for information technology infrastructure.
Advances in high throughput DNA sequence data compression.
Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz
2016-06-01
Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
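As a point of reference for the reference-free category mentioned in the abstract above, the simplest baseline is fixed 2-bit packing of the four bases. The Python sketch below is an illustration only and is not a method taken from the review; the names `pack` and `unpack` are hypothetical, and the sketch assumes an alphabet of exactly A, C, G and T (it raises `KeyError` on N or other symbols).

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, 4 bases per byte; also return its length."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, n):
    """Recover the first n bases from data produced by pack()."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n])


packed, length = pack("ACGTAC")
print(list(packed), unpack(packed, length))
```

This gives a fixed fourfold reduction over one byte per base. Dedicated reference-free and referential compressors aim to do better by modelling the redundancy in the data.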
Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun
2016-12-01
Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. To reduce analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to show analysis progress in real time. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data supplied as raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway, or network level, and multi-perspective exploration can be done from the gene expression, DNA methylation, sequence variation, or relationship perspective. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines.
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan
2009-01-01
The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...
Sequencing of adenine in DNA by scanning tunneling microscopy
NASA Astrophysics Data System (ADS)
Tanaka, Hiroyuki; Taniguchi, Masateru
2017-08-01
The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V_s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
Pediatric Glioblastoma Therapies Based on Patient-Derived Stem Cell Resources
2014-11-01
genomic DNA and then subjected to Illumina high-throughput sequencing. In this analysis, shRNAs lost in the GSC population represent candidate gene...PRISM 7900 Sequence Detection System (Genomics Resource, FHCRC). Relative transcript abundance was analyzed using the 2^−ΔΔCt method. TRIzol (Invitrogen
NASA Technical Reports Server (NTRS)
La Duc, Myron T.; Satomi, Masataka; Agata, Norio; Venkateswaran, Kasthuri
2004-01-01
Bacillus anthracis, the causative agent of the human disease anthrax, Bacillus cereus, a food-borne pathogen capable of causing human illness, and Bacillus thuringiensis, a well-characterized insecticidal toxin producer, all cluster together within a very tight clade (B. cereus group) phylogenetically and are indistinguishable from one another via 16S rDNA sequence analysis. As new pathogens are continually emerging, it is imperative to devise a system capable of rapidly and accurately differentiating closely related, yet phenotypically distinct species. Although the gyrB gene has proven useful in discriminating closely related species, its sequence analysis has not yet been validated by DNA:DNA hybridization, the taxonomically accepted "gold standard". We phylogenetically characterized the gyrB sequences of various species and serotypes encompassed in the "B. cereus group," including lab strains and environmental isolates. Results were compared to those obtained from analyses of phenotypic characteristics, 16S rDNA sequence, DNA:DNA hybridization, and virulence factors. The gyrB gene proved more highly differential than 16S, while, at the same time, as analytical as costly and laborious DNA:DNA hybridization techniques in differentiating species within the B. cereus group.
Oligonucleotide fingerprinting of rRNA genes for analysis of fungal community composition.
Valinsky, Lea; Della Vedova, Gianluca; Jiang, Tao; Borneman, James
2002-12-01
Thorough assessments of fungal diversity are currently hindered by technological limitations. Here we describe a new method for identifying fungi, oligonucleotide fingerprinting of rRNA genes (OFRG). OFRG sorts arrayed rRNA gene (ribosomal DNA [rDNA]) clones into taxonomic clusters through a series of hybridization experiments, each using a single oligonucleotide probe. A simulated annealing algorithm was used to design an OFRG probe set for fungal rDNA. Analysis of 1,536 fungal rDNA clones derived from soil generated 455 clusters. A pairwise sequence analysis showed that clones with average sequence identities of 99.2% were grouped into the same cluster. To examine the accuracy of the taxonomic identities produced by this OFRG experiment, we determined the nucleotide sequences for 117 clones distributed throughout the tree. For all but two of these clones, the taxonomic identities generated by this OFRG experiment were consistent with those generated by a nucleotide sequence analysis. Eighty-eight percent of the clones were affiliated with Ascomycota, while 12% belonged to Basidiomycota. A large fraction of the clones were affiliated with the genera Fusarium (404 clones) and Raciborskiomyces (176 clones). Smaller assemblages of clones had high sequence identities to the Alternaria, Ascobolus, Chaetomium, Cryptococcus, and Rhizoctonia clades.
Zhao, A; Guo, A; Liu, Z; Pape, L
1997-01-01
The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645
DNA Nucleotide Sequence Restricted by the RI Endonuclease
Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.
1972-01-01
The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974
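The symmetry noted in this abstract can be checked directly on the four-nucleotide cohesive terminus it reports (pApApTpT, i.e. AATT). The sketch below tests whether a sequence equals its own reverse complement; the helper names are illustrative, and the full recognition sequence elided as "[Formula: see text]" is deliberately not reproduced here.

```python
# Sketch: test whether a DNA sequence is identical to its reverse complement,
# which is the twofold rotational symmetry described for the RI cleavage site.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]


def is_self_complementary(seq):
    """True when the sequence reads the same on both strands (5'->3')."""
    return seq == reverse_complement(seq)


print(is_self_complementary("AATT"))  # cohesive terminus reported in the abstract
```

Because the terminus is self-complementary, any two fragments produced by the enzyme can anneal to each other.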
DNA mimic proteins: functions, structures, and bioinformatic analysis.
Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J
2014-05-13
DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.
Gomes, S L; Gober, J W; Shapiro, L
1990-01-01
Caulobacter crescentus has a single dnaK gene that is highly homologous to the hsp70 family of heat shock genes. Analysis of the cloned and sequenced dnaK gene has shown that the deduced amino acid sequence could encode a protein of 67.6 kilodaltons that is 68% identical to the DnaK protein of Escherichia coli and 49% identical to the Drosophila and human hsp70 protein family. A partial open reading frame 165 base pairs 3' to the end of dnaK encodes a peptide of 190 amino acids that is 59% identical to DnaJ of E. coli. Northern blot analysis revealed a single 4.0-kilobase mRNA homologous to the cloned fragment. Since the dnaK coding region is 1.89 kilobases, dnaK and dnaJ may be transcribed as a polycistronic message. S1 mapping and primer extension experiments showed that transcription initiated at two sites 5' to the dnaK coding sequence. A single start site of transcription was identified during heat shock at 42 degrees C, and the predicted promoter sequence conformed to the consensus heat shock promoters of E. coli. At normal growth temperature (30 degrees C), a different start site was identified 3' to the heat shock start site that conformed to the E. coli sigma 70 promoter consensus sequence. S1 protection assays and analysis of expression of the dnaK gene fused to the lux transcription reporter gene showed that expression of dnaK is temporally controlled under normal physiological conditions and that transcription occurs just before the initiation of DNA replication. Thus, in both human cells (I. K. L. Milarski and R. I. Morimoto, Proc. Natl. Acad. Sci. USA 83:9517-9521, 1986) and in a simple bacterium, the transcription of a hsp70 gene is temporally controlled as a function of the cell cycle under normal growth conditions. PMID:2345134
Kim, Suk Kyeong; Kim, Dong-Lim; Han, Hye Seung; Kim, Wan Seop; Kim, Seung Ja; Moon, Won Jin; Oh, Seo Young; Hwang, Tae Sook
2008-06-01
Fine-needle aspiration biopsy (FNAB) is the primary means of distinguishing benign from malignant and of guiding therapeutic intervention in thyroid nodules. However, 10% to 30% of cases with indeterminate cytology in FNAB need other diagnostic tools to refine diagnosis. We compared the pyrosequencing method with the conventional direct DNA sequencing analysis and investigated the usefulness of preoperative BRAF mutation analysis as an adjunct diagnostic tool with routine FNAB. A total of 103 surgically confirmed patients' FNA slides were recruited and DNA was extracted after atypical cells were scraped from the slides. BRAF mutation was analyzed by pyrosequencing and direct DNA sequencing. Sixty-three (77.8%) of 81 histopathologically diagnosed malignant nodules revealed positive BRAF mutation on pyrosequencing analysis. In detail, 63 (84.0%) of 75 papillary thyroid carcinoma (PTC) samples showed positive BRAF mutation, whereas 3 follicular thyroid carcinomas, 1 anaplastic carcinoma, 1 medullary thyroid carcinoma, and 1 metastatic lung carcinoma did not show BRAF mutation. None of 22 benign nodules had BRAF mutation in both pyrosequencing and direct DNA sequencing. Out of 27 thyroid nodules classified as 'indeterminate' on cytologic examination preoperatively, 21 (77.8%) cases turned out to be malignant: 18 PTCs (including 2 follicular variant types) and 3 follicular thyroid carcinomas. Among these, 13 (61.9%) classic PTCs had BRAF mutation. None of 6 benign nodules, including 3 follicular adenomas and 3 nodular hyperplasias, had BRAF mutation. Among 63 PTCs with positive BRAF mutation detected by pyrosequencing analysis, 3 cases did not show BRAF mutation by direct DNA sequencing. Although it was not statistically significant, pyrosequencing was superior to direct DNA sequencing in detecting the BRAF mutation of thyroid nodules (P=0.25). 
Detecting BRAF mutation by pyrosequencing is more sensitive, faster, and less expensive than direct DNA sequencing and is proposed as an adjunct diagnostic tool in evaluating thyroid nodules of indeterminate cytology.
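The reported percentages and P value can be sanity-checked with a few lines of arithmetic. The abstract does not name the statistical test; the sketch below assumes an exact (binomial) McNemar test on the discordant pairs, with 3 cases positive by pyrosequencing only and, as a further assumption, none positive by direct sequencing only. Under those assumptions the result agrees with the reported P=0.25.

```python
from math import comb


def exact_mcnemar_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of a split at least this lopsided under a fair 50/50 null.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Percentages quoted in the abstract.
print(round(100 * 63 / 81, 1))  # malignant nodules positive by pyrosequencing
print(round(100 * 63 / 75, 1))  # papillary carcinomas positive by pyrosequencing

# Assumed discordant counts: 3 pyrosequencing-only, 0 direct-sequencing-only.
print(exact_mcnemar_p(3, 0))
```

With only three discordant cases, no paired test can reach conventional significance, which is consistent with the abstract's "not statistically significant" wording.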
Pardo, Carolina E; Carr, Ian M; Hoffman, Christopher J; Darst, Russell P; Markham, Alexander F; Bonthron, David T; Kladde, Michael P
2011-01-01
Bisulfite sequencing is a widely-used technique for examining cytosine DNA methylation at nucleotide resolution along single DNA strands. Probing with cytosine DNA methyltransferases followed by bisulfite sequencing (MAPit) is an effective technique for mapping protein-DNA interactions. Here, MAPit methylation footprinting with M.CviPI, a GC methyltransferase we previously cloned and characterized, was used to probe hMLH1 chromatin in HCT116 and RKO colorectal cancer cells. Because M.CviPI-probed samples contain both CG and GC methylation, we developed a versatile, visually-intuitive program, called MethylViewer, for evaluating the bisulfite sequencing results. Uniquely, MethylViewer can simultaneously query cytosine methylation status in bisulfite-converted sequences at as many as four different user-defined motifs, e.g. CG, GC, etc., including motifs with degenerate bases. Data can also be exported for statistical analysis and as publication-quality images. Analysis of hMLH1 MAPit data with MethylViewer showed that endogenous CG methylation and accessible GC sites were both mapped on single molecules at high resolution. Disruption of positioned nucleosomes on single molecules of the PHO5 promoter was detected in budding yeast using M.CviPII, increasing the number of enzymes available for probing protein-DNA interactions. MethylViewer provides an integrated solution for primer design and rapid, accurate and detailed analysis of bisulfite sequencing or MAPit datasets from virtually any biological or biochemical system.
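As an illustration of querying user-defined motifs that include degenerate bases, the sketch below locates motif occurrences in a reference sequence using the standard IUPAC nucleotide codes. It is not MethylViewer's implementation; the function name is hypothetical, and bisulfite conversion and methylation calling are not modelled.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_positions(sequence, motif):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets overlapping occurrences all be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


reference = "ACGTGCGC"
for motif in ("CG", "GC", "RC"):
    print(motif, motif_positions(reference, motif))
```

Reporting CG and GC sites separately matters for M.CviPI-probed samples, where the two motifs carry different information (endogenous methylation versus accessibility).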
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fields, C.A.
1996-06-01
The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project, including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
Computational and experimental analysis of DNA shuffling
Maheshri, Narendra; Schaffer, David V.
2003-01-01
We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764
El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R
2013-07-01
Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
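The identity and divergence figures in this abstract are complementary: 97.4% identity between types corresponds to roughly 2.6% divergence. A minimal sketch of the underlying calculation follows, assuming two already-aligned sequences of equal length; the toy sequences and the function name are illustrative and are not data from the study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which the two sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b)
    return 100.0 * matches / len(aligned_a)


# Toy example: one mismatch in ten aligned positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, 100.0 - identity)

# Divergence implied by the identity reported in the abstract.
print(round(100.0 - 97.4, 1))
```

How gap columns are counted varies between tools; this sketch simply treats a gap character like any other symbol.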
First isolation of Actinobacillus genomospecies 2 in Japan
MURAKAMI, Miyuki; SHIMONISHI, Yoshimasa; HOBO, Seiji; NIWA, Hidekazu; ITO, Hiroya
2015-01-01
We describe here the first isolation of Actinobacillus genomospecies 2 in Japan. The isolate was found in a septicemic foal and characterized by phenotypic and genetic analyses, with the latter consisting of 16S rDNA nucleotide sequence analysis plus multilocus sequence analysis using three housekeeping genes, recN, rpoA and thdF, that have been proposed for use as a genomic tool in place of DNA-DNA hybridization. PMID:26668165
Genomic sequencing of Pleistocene cave bears
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noonan, James P.; Hofreiter, Michael; Smith, Doug
2005-04-01
Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Phylogenetic Analysis of Ruminant Theileria spp. from China Based on 28S Ribosomal RNA Gene
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze
2013-01-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the ribosomal large-subunit RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode. PMID:24327775
Phylogenetic analysis of ruminant Theileria spp. from China based on 28S ribosomal RNA gene.
Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze; Yin, Hong; Luo, Jianxun
2013-10-01
Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the ribosomal large-subunit RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode.
Wang, Yongjie; Kleespies, Regina G; Ramle, Moslim B; Jehle, Johannes A
2008-09-01
The genomic sequence analysis of many large dsDNA viruses is hampered by the lack of sufficient sample material. Here, we report a whole genome amplification of the Oryctes rhinoceros nudivirus (OrNV) isolate Ma07 starting from as little as about 10 ng of purified viral DNA by application of the phi29 DNA polymerase- and exonuclease-resistant random hexamer-based multiple displacement amplification (MDA) method. About 60 µg of high molecular weight DNA with fragment sizes of up to 25 kbp was amplified. A genomic DNA clone library was generated using the product DNA. After 8-fold sequencing coverage, the 127,615 bp OrNV whole genome was sequenced successfully. The results demonstrate that MDA-based whole genome amplification enables rapid access to genomic information from exiguous virus samples.
Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.
Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří
2016-11-01
Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5% of the A. thaliana genome. IGS are rapidly evolving sequences, and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which allows otherwise highly conserved rRNA genes to be distinguished. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.
Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D
2010-04-01
We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.
Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.
2010-01-01
Objectives: We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods: Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results: Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions: Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028
Palzkill, T G; Oliver, S G; Newlon, C S
1986-01-01
Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
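The "at least 10 out of 11 bases" criterion in this abstract amounts to a sliding-window near-match search. The sketch below shows one way to run such a scan. The consensus string used in the example is a non-degenerate placeholder chosen for illustration, not the published core consensus, and the function name and test sequence are likewise hypothetical.

```python
def near_matches(sequence, consensus, min_matches):
    """Return (start, matches) for every window agreeing with the consensus
    at min_matches or more positions."""
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        matches = sum(1 for a, b in zip(window, consensus) if a == b)
        if matches >= min_matches:
            hits.append((start, matches))
    return hits


# Placeholder 11-mer standing in for the core consensus (illustrative only).
placeholder_consensus = "ATTTATATTTA"
fragment = "GGATTTATGTTTAGG"
print(near_matches(fragment, placeholder_consensus, min_matches=10))
```

A complete scan would also search the reverse complement of each fragment, since a match may lie on either strand.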
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.
Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V
1985-09-01
The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been absolutely quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
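The general idea behind barcode-based error suppression can be sketched as follows: reads sharing a barcode are assumed to derive from one original molecule, so a per-position majority vote removes isolated read errors, and the number of surviving barcodes counts molecules. This is a generic illustration, not the authors' pipeline; the function name, the `min_reads` threshold and the toy reads are all assumptions, and the reads are assumed to be pre-aligned and of equal length.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(tagged_reads, min_reads=2):
    """Collapse (barcode, read) pairs into one majority-vote sequence per barcode.

    Barcodes supported by fewer than min_reads reads are discarded as unreliable.
    """
    groups = defaultdict(list)
    for barcode, read in tagged_reads:
        groups[barcode].append(read)

    consensus = {}
    for barcode, reads in groups.items():
        if len(reads) < min_reads:
            continue
        # Vote column by column across all reads carrying this barcode.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*reads)
        )
    return consensus


toy_reads = [
    ("AAA", "ACGT"),
    ("AAA", "ACGT"),
    ("AAA", "ACTT"),  # single-read error at the third base
    ("CCC", "ACGA"),  # barcode seen only once
]
molecules = consensus_by_barcode(toy_reads)
print(molecules, len(molecules))
```

Counting consensus molecules rather than raw reads is what makes the quantitation absolute, because amplification duplicates collapse into a single entry.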
A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing
Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante
2008-01-01
Summary: A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages, suggesting that the effective population size of Neandertals was small. PMID:18692465
Phylogenetic analysis of mtDNA lineages in South American mummies.
Monsalve, M V; Cardenas, F; Guhl, F; Delaney, A D; Devine, D V
1996-07-01
Some studies of mtDNA propose that contemporary Amerindians have descended from four haplotype groups, each defined by specific sets of polymorphisms. One recent study also found evidence of other potential founder haplotypes. We wanted to determine whether the four haplotypes in modern populations were also present in ancient South American aboriginals. We subjected mtDNA from Colombian mummies (470 to 1849 AD) to PCR amplification and restriction endonuclease analysis. The mtDNA D-loop region was surveyed for sequence variation by restriction analysis and a segment of this region was sequenced for each mummy to characterize the haplotypes. Our mummies exhibited three of the four major characteristic haplotypes of Amerindian populations defined by four markers. With sequence data obtained in the ancient samples and published data on contemporary Amerindians it was possible to infer the origin of these six mummies.
Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés
2011-10-17
The 5S ribosomal DNA (5S rDNA) of higher eukaryotes is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.
2011-01-01
Background The 5S ribosomal DNA (5S rDNA) of higher eukaryotes is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Berger, C; Berger, B; Parson, W
2012-01-01
In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.
Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C
2016-08-05
This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
Liao, Ai-Jun; Su, Qi; Wang, Xun; Zeng, Bin; Shi, Wei
2008-01-01
AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS: Three differentially methylated DNA sequences were obtained, two of which have been accepted by GenBank. The accession numbers are AY887106 and AY887107. AY887107 was highly similar to the 11th exon of LOC440683 (98%), 3’ end of LOC440887 (99%), and promoter and exon regions of DRD5 (94%). AY887106 was consistent (98%) with a CpG island in ribosomal RNA isolated from colorectal cancer by Minoru Toyota in 1999. CONCLUSION: The methylation degree is different between gastric cancer and normal gastric mucosa. The differentially methylated DNA sequences can be isolated effectively by MS-RDA. PMID:18322944
DNA sequence analysis with droplet-based microfluidics
Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.
2014-01-01
Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation in the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed us to define 150 haplotypes, which were linked in a common minimum spanning tree where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants of this satDNA showed large differences in abundance between species, from highly abundant to merely relictual variants. PMID:28855916
NASA Astrophysics Data System (ADS)
Meyer, Sam; Everaers, Ralf
2015-02-01
The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
[The future of forensic DNA analysis for criminal justice].
Laurent, François-Xavier; Vibrac, Geoffrey; Rubio, Aurélien; Thévenot, Marie-Thérèse; Pène, Laurent
2017-11-01
In the criminal framework, the analysis of approximately 20 DNA microsatellites enables the establishment of a genetic profile with a high statistical power of discrimination. This technique makes it possible to establish or exclude a match between a biological trace detected at a crime scene and a suspect whose DNA was collected via an oral swab. However, conventional techniques tend to complicate the interpretation of complex DNA samples, such as degraded DNA and DNA mixtures. The aim of this review is to highlight the power of new forensic DNA methods (including high-throughput sequencing or single-cell sequencing) to facilitate the expert's interpretation in full compliance with existing French legislation. © 2017 médecine/sciences – Inserm.
Nucleotide Sequence Analysis of RNA Synthesized from Rabbit Globin Complementary DNA
Poon, Raymond; Paddock, Gary V.; Heindell, Howard; Whitcome, Philip; Salser, Winston; Kacian, Dan; Bank, Arthur; Gambino, Roberto; Ramirez, Francesco
1974-01-01
Rabbit globin complementary DNA made with RNA-dependent DNA polymerase (reverse transcriptase) was used as template for in vitro synthesis of 32P-labeled RNA. The sequences of the nucleotides in most of the fragments resulting from combined ribonuclease T1 and alkaline phosphatase digestion have been determined. Several fragments were long enough to fit uniquely with the α or β globin amino-acid sequences. These data demonstrate that the cDNA was copied from globin mRNA and contained no detectable contaminants. PMID:4139714
Kim, Na Young; Lee, Hwan Young; Park, Sun Joo; Yang, Woo Ick; Shin, Kyoung-Jin
2013-05-01
Two multiplex polymerase chain reaction (PCR) systems (Midiplex and Miniplex) were developed for the amplification of the mitochondrial DNA (mtDNA) control region, and the efficiencies of the multiplexes for amplifying degraded DNA were validated using old skeletal remains. The Midiplex system consisted of two multiplex PCRs to amplify six overlapping amplicons ranging in length from 227 to 267 bp. The Miniplex system consisted of three multiplex PCRs to amplify 10 overlapping short amplicons ranging in length from 142 to 185 bp. Most mtDNA control region sequences of several 60-year-old and 400-500-year-old skeletal remains were successfully obtained using both PCR systems and consistent with those previously obtained by monoplex amplification. The multiplex system consisting of smaller amplicons is effective for mtDNA sequence analyses of ancient and forensic degraded samples, saving time, cost, and the amount of DNA sample consumed during analysis. © 2013 American Academy of Forensic Sciences.
Pastor, N; Pardo, L; Weinstein, H
1997-01-01
The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. PMID:9251783
Ozga, Andrew T; Nieves-Colón, Maria A; Honap, Tanvi P; Sankaranarayanan, Krithivasan; Hofman, Courtney A; Milner, George R; Lewis, Cecil M; Stone, Anne C; Warinner, Christina
2016-06-01
Archaeological dental calculus is a rich source of host-associated biomolecules. Importantly, however, dental calculus is more accurately described as a calcified microbial biofilm than a host tissue. As such, concerns regarding destructive analysis of human remains may not apply as strongly to dental calculus, opening the possibility of obtaining human health and ancestry information from dental calculus in cases where destructive analysis of conventional skeletal remains is not permitted. Here we investigate the preservation of human mitochondrial DNA (mtDNA) in archaeological dental calculus and its potential for full mitochondrial genome (mitogenome) reconstruction in maternal lineage ancestry analysis. Extracted DNA from six individuals at the 700-year-old Norris Farms #36 cemetery in Illinois was enriched for mtDNA using in-solution capture techniques, followed by Illumina high-throughput sequencing. Full mitogenomes (7-34×) were successfully reconstructed from dental calculus for all six individuals, including three individuals who had previously tested negative for DNA preservation in bone using conventional PCR techniques. Mitochondrial haplogroup assignments were consistent with previously published findings, and additional comparative analysis of paired dental calculus and dentine from two individuals yielded equivalent haplotype results. All dental calculus samples exhibited damage patterns consistent with ancient DNA, and mitochondrial sequences were estimated to be 92-100% endogenous. DNA polymerase choice was found to impact error rates in downstream sequence analysis, but these effects can be mitigated by greater sequencing depth. Dental calculus is a viable alternative source of human DNA that can be used to reconstruct full mitogenomes from archaeological remains. Am J Phys Anthropol 160:220-228, 2016. © 2016 The Authors American Journal of Physical Anthropology Published by Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Successful enrichment and recovery of whole mitochondrial genomes from ancient human dental calculus
Ozga, Andrew T.; Nieves‐Colón, Maria A.; Honap, Tanvi P.; Sankaranarayanan, Krithivasan; Hofman, Courtney A.; Milner, George R.; Lewis, Cecil M.; Stone, Anne C.
2016-01-01
ABSTRACT Objectives Archaeological dental calculus is a rich source of host‐associated biomolecules. Importantly, however, dental calculus is more accurately described as a calcified microbial biofilm than a host tissue. As such, concerns regarding destructive analysis of human remains may not apply as strongly to dental calculus, opening the possibility of obtaining human health and ancestry information from dental calculus in cases where destructive analysis of conventional skeletal remains is not permitted. Here we investigate the preservation of human mitochondrial DNA (mtDNA) in archaeological dental calculus and its potential for full mitochondrial genome (mitogenome) reconstruction in maternal lineage ancestry analysis. Materials and Methods Extracted DNA from six individuals at the 700‐year‐old Norris Farms #36 cemetery in Illinois was enriched for mtDNA using in‐solution capture techniques, followed by Illumina high‐throughput sequencing. Results Full mitogenomes (7–34×) were successfully reconstructed from dental calculus for all six individuals, including three individuals who had previously tested negative for DNA preservation in bone using conventional PCR techniques. Mitochondrial haplogroup assignments were consistent with previously published findings, and additional comparative analysis of paired dental calculus and dentine from two individuals yielded equivalent haplotype results. All dental calculus samples exhibited damage patterns consistent with ancient DNA, and mitochondrial sequences were estimated to be 92–100% endogenous. DNA polymerase choice was found to impact error rates in downstream sequence analysis, but these effects can be mitigated by greater sequencing depth. Discussion Dental calculus is a viable alternative source of human DNA that can be used to reconstruct full mitogenomes from archaeological remains. Am J Phys Anthropol 160:220–228, 2016. 
© 2016 The Authors American Journal of Physical Anthropology Published by Wiley Periodicals, Inc. PMID:26989998
Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V
2014-08-07
DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads as well as suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples. 
Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on the false discovery rate statistics that allows a proper trade-off between sensitivity and precision to be chosen.
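The abstract above refers to a maximum acceptable distance between a read and a barcode sequence. As a rough illustration of that thresholding idea only, the Python sketch below assigns a read to a barcode by Hamming distance. It is not the authors' FDR procedure: the function names, the example barcodes, the `max_distance` cut-off, and the assumption that the barcode sits at the very start of the read are all hypothetical simplifications.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read: str, barcodes: dict, max_distance: int = 1):
    """Return the name of the closest barcode, or None if the read is
    treated as unbarcoded (too distant, or tied between barcodes).

    Assumption: the barcode occupies the first bases of the read.
    """
    best_name, best_dist, tie = None, None, False
    for name, barcode in barcodes.items():
        if len(read) < len(barcode):
            continue  # read too short to contain this barcode
        dist = hamming(read[:len(barcode)], barcode)
        if best_dist is None or dist < best_dist:
            best_name, best_dist, tie = name, dist, False
        elif dist == best_dist:
            tie = True  # ambiguous: two barcodes are equally close
    if best_dist is None or best_dist > max_distance or tie:
        return None
    return best_name


# Hypothetical barcode set, for illustration only.
barcodes = {"bc1": "ACGTAC", "bc2": "TGCATG"}
print(assign_barcode("ACGAACGGGG", barcodes))  # one mismatch to bc1
print(assign_barcode("GGGGGGGGGG", barcodes))  # too distant from both
```

Lowering `max_distance` trades sensitivity for precision; the paper's contribution, as described in the abstract, is choosing that cut-off by controlling the false discovery rate rather than fixing it by hand.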
Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.
2011-01-01
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. 
Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao
2005-01-01
AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences between tumor tissue DNA and the corresponding normal tissue DNA were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and from 17.9% to 50% for each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. An amplified fragment of approximately 480 bp with higher band intensity in tumor DNA relative to normal DNA was seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. Sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). The BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The predicted camel HSPA6 protein structure, obtained using Protein 3D structural analysis, showed high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
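The open reading frame length and protein length quoted above are mutually consistent if the 1932 bp figure includes the stop codon (an assumption; the abstract does not say so explicitly): 643 codons plus one stop codon is 644 codons, or 1932 bp. A minimal Python check, with an illustrative function name that does not come from the paper:

```python
def orf_length_bp(num_amino_acids: int, include_stop: bool = True) -> int:
    """Length in base pairs of an ORF encoding the given number of residues.

    Each residue is encoded by one 3-base codon; the stop codon adds
    three more bases when it is counted as part of the ORF.
    """
    codons = num_amino_acids + (1 if include_stop else 0)
    return 3 * codons


# 643 residues plus a stop codon, assuming the stop codon is counted.
print(orf_length_bp(643))
```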
Dragan, Anatoliy I; Golberg, Karina; Elbaz, Amit; Marks, Robert; Zhang, Yongxia; Geddes, Chris D
2011-03-07
For analyses of DNA fragment sequences in solution we introduce a 2-color DNA assay, utilizing a combination of the Metal-Enhanced Fluorescence (MEF) effect and microwave-accelerated DNA hybridization. The assay is based on a new "Catch and Signal" technology, i.e. the simultaneous specific recognition of two target DNA sequences in one well by complementary anchor-ssDNAs, attached to silver island films (SiFs). It is shown that fluorescent labels (Alexa 488 and Alexa 594), covalently attached to ssDNA fragments, play the role of biosensor recognition probes, demonstrating strong response upon DNA hybridization, locating fluorophores in close proximity to silver NPs, which is ideal for MEF. Subsequently the emission dramatically increases, while the excited state lifetime decreases. It is also shown that 30s microwave irradiation of wells, containing DNA molecules, considerably (~1000-fold) speeds up the highly selective hybridization of DNA fragments at ambient temperature. The 2-color "Catch and Signal" DNA assay platform can radically expedite quantitative analysis of genome DNA sequences, creating a simple and fast bio-medical platform for nucleic acid analysis. Copyright © 2010 Elsevier B.V. All rights reserved.
Flynn, Theodore M.; Koval, Jason C.; Greenwald, Stephanie M.; Owens, Sarah M.; Kemner, Kenneth M.; Antonopoulos, Dionysios A.
2017-01-01
We present DNA sequence data in FASTA-formatted files from aerobic environmental microcosms inoculated with a sole carbon source. DNA sequences are of 16S rRNA genes present in DNA extracted from each microcosm along with the environmental samples (soil, water) used to inoculate them. These samples were sequenced using the Illumina MiSeq platform at the Environmental Sample Preparation and Sequencing Facility at Argonne National Laboratory. This data is compatible with standard microbiome analysis pipelines (e.g., QIIME, mothur, etc.).
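For readers unfamiliar with the FASTA layout mentioned above (a header line starting with ">" followed by one or more lines of sequence), the following is a minimal Python sketch of a reader. The function name and the example records are hypothetical, and the pipelines named in the abstract (QIIME, mothur) ship their own, more robust parsers.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle.

    Sequence lines belonging to one record are concatenated; blank
    lines and any text before the first header are ignored.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records, for illustration only.
example = io.StringIO(">seq1 sample\nACGT\nTTGA\n>seq2\nGGCC\n")
for name, seq in read_fasta(example):
    print(name, len(seq))
```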
Adenine specific DNA chemical sequencing reaction.
Iverson, B L; Dervan, P B
1987-01-01
Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067
NASA Astrophysics Data System (ADS)
Holden, Todd; Marchese, P.; Tremberger, G., Jr.; Cheung, E.; Subramaniam, R.; Sullivan, R.; Schneider, P.; Flamholz, A.; Lieberman, D.; Cheung, T.
2008-08-01
We have characterized function related DNA sequences of various organisms using informatics techniques, including fractal dimension calculation, nucleotide and multi-nucleotide statistics, and sequence fluctuation analysis. Our analysis shows trends which differentiate extremophile from non-extremophile organisms, which could be reproduced in extraterrestrial life. Among the systems studied are radiation repair genes, genes involved in thermal shocks, and genes involved in drug resistance. We also evaluate sequence level changes that have occurred during short term evolution (several thousand generations) under extreme conditions.
King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach
2014-01-01
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
The phylogenetic relationship of Alexandrium monilatum to other Alexandrium spp. was explored using 18S rDNA sequences. Maximum likelihood phylogenetic analysis of the combined rDNA sequences established that A. monilatum paired with Alexandrium taylori and that the pair was the ...
Quantitative analysis and prediction of G-quadruplex forming sequences in double-stranded DNA
Kim, Minji; Kreig, Alex; Lee, Chun-Ying; Rube, H. Tomas; Calvert, Jacob; Song, Jun S.; Myong, Sua
2016-01-01
G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome. We present a systematic and quantitative analysis of GQ folding propensity on a large set of 438 GQ forming sequences in double-stranded DNA by integrating fluorescence measurement, single-molecule imaging and computational modeling. We find that short minimum loop length and the thymine base are two main factors that lead to high GQ folding propensity. Linear and Gaussian process regression models further validate that the GQ folding potential can be predicted with high accuracy based on the loop length distribution and the nucleotide content of the loop sequences. Our study provides important new parameters that can inform the evaluation and classification of putative GQ sequences in the human genome. PMID:27095201
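The two predictors named in this abstract (minimum loop length and thymine content of the loops) can be illustrated with a minimal sketch. It assumes the widely used putative-quadruplex motif of four G-runs of at least three guanines separated by three loops of 1 to 7 nucleotides; that motif, the function name loop_features and the returned field names are illustrative assumptions, not the regression model or feature set used by the authors.

```python
import re

# Assumed motif (not taken from the paper): G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
# The three capture groups are the loop sequences between the G-runs.
PQS = re.compile(
    r"G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}"
)

def loop_features(seq):
    """Return loop sequences, minimum loop length and thymine fraction
    for the first putative quadruplex motif in seq, or None if absent."""
    match = PQS.search(seq.upper())
    if match is None:
        return None
    loops = match.groups()
    joined = "".join(loops)
    return {
        "loops": loops,
        "min_loop": min(len(loop) for loop in loops),
        "t_fraction": joined.count("T") / len(joined),
    }

# Human telomeric repeat as a toy input.
features = loop_features("GGGTTAGGGTTAGGGTTAGGG")
print(features)
```

Features of this kind could then be fed to a regression model; how the authors actually encoded the loop length distribution is not stated in the abstract.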
Transcriptome analysis by strand-specific sequencing of complementary DNA
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-01-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-10-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
Global DNA methylation analysis using methyl-sensitive amplification polymorphism (MSAP).
Yaish, Mahmoud W; Peng, Mingsheng; Rothstein, Steven J
2014-01-01
DNA methylation is a crucial epigenetic process which helps control gene transcription activity in eukaryotes. Information regarding the methylation status of a regulatory sequence of a particular gene provides important knowledge of this transcriptional control. DNA methylation can be detected using several methods, including sodium bisulfite sequencing and restriction digestion using methylation-sensitive endonucleases. Methyl-Sensitive Amplification Polymorphism (MSAP) is a technique used to study the global DNA methylation status of an organism and hence to distinguish between two individuals based on the DNA methylation status determined by the differential digestion pattern. Therefore, this technique is a useful method for DNA methylation mapping and positional cloning of differentially methylated genes. In this technique, genomic DNA is first digested with a methylation-sensitive restriction enzyme such as HpaII, and then the DNA fragments are ligated to adaptors in order to facilitate their amplification. Digestion using a methylation-insensitive isoschizomer of HpaII, MspI is used in a parallel digestion reaction as a loading control in the experiment. Subsequently, these fragments are selectively amplified by fluorescently labeled primers. PCR products from different individuals are compared, and once an interesting polymorphic locus is recognized, the desired DNA fragment can be isolated from a denaturing polyacrylamide gel, sequenced and identified based on DNA sequence similarity to other sequences available in the database. We will use analysis of met1, ddm1, and atmbd9 mutants and wild-type plants treated with a cytidine analogue, 5-azaC, or zebularine to demonstrate how to assess the genetic modulation of DNA methylation in Arabidopsis. 
It should be noted that despite the fact that MSAP is a reliable technique used to fish for polymorphic methylated loci, its power is limited to the restriction recognition sites of the enzymes used in the genomic DNA digestion.
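The parallel-digestion logic described above can be sketched in silico. Both HpaII and MspI recognize CCGG and cut after the first C; the sketch below follows the abstract's simplification that MspI cuts every site while HpaII is blocked at methylated sites. The function name digest and the blocked parameter are hypothetical, and real MSAP scoring also involves adaptor ligation and selective amplification, which are not modeled here.

```python
def digest(seq, blocked=frozenset()):
    """Return fragment lengths after cutting C^CGG at every CCGG site
    whose start index is not in blocked (blocked = methylated sites)."""
    seq = seq.upper()
    cuts = [
        i + 1  # the cut falls between the first and second C
        for i in range(len(seq) - 3)
        if seq[i:i + 4] == "CCGG" and i not in blocked
    ]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

toy = "AACCGGTTTTCCGGAA"                 # CCGG sites start at indices 2 and 10
mspi_fragments = digest(toy)              # methylation-insensitive control
hpaii_fragments = digest(toy, blocked={10})  # second site assumed methylated
print(mspi_fragments, hpaii_fragments)
```

A difference between the two fragment patterns at a locus is what marks it as a candidate methylation polymorphism; sites outside CCGG are invisible to this comparison, which is the limitation noted in the abstract.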
NASA Astrophysics Data System (ADS)
Serra, Martin J. (reviewer)
2000-01-01
Genomics is one of the most rapidly expanding areas of science. This book is an outgrowth of a series of lectures given by one of the former heads (CRC) of the Human Genome Initiative. The book is designed to reach a wide audience, from biologists with little chemical or physical science background through engineers, computer scientists, and physicists with little current exposure to the chemical or biological principles of genetics. The text starts with a basic review of the chemical and biological properties of DNA. However, without either a biochemistry background or a supplemental biochemistry text, this chapter and much of the rest of the text would be difficult to digest. The second chapter is designed to put DNA into the context of the larger chromosomal unit. Specialized chromosomal structures and sequences (centromeres, telomeres) are introduced, leading to a section on chromosome organization and purification. The next 4 chapters cover the physical (hybridization, electrophoresis), chemical (polymerase chain reaction), and biological (genetic) techniques that provide the backbone of genomic analysis. These chapters cover in significant detail the fundamental principles underlying each technique and provide a firm background for the remainder of the text. Chapters 7-9 consider the need and methods for the development of physical maps. Chapter 7 primarily discusses chromosomal localization techniques, including in situ hybridization, FISH, and chromosome paintings. The next two chapters focus on the development of libraries and clones. In particular, Chapter 9 considers the limitations of current mapping and clone production. The current state and future of DNA sequencing is covered in the next three chapters. The first considers the current methods of DNA sequencing - especially gel-based methods of analysis, although other possible approaches (mass spectrometry) are introduced. 
Much of the chapter addresses the limitations of current methods, including analysis of error in sequencing and current bottlenecks in the sequencing effort. The next chapter describes the steps necessary to scale current technologies for the sequencing of entire genomes. Chapter 12 examines alternate methods for DNA sequencing. Initially, methods of single-molecule sequencing and sequencing by microscopy are introduced; the majority of the chapter is devoted to the development of DNA sequencing methods using chip microarrays and hybridization. The remaining chapters (13-15) consider the uses and analysis of DNA sequence information. The initial focus is on the identification of genes. Several examples are given of the use of DNA sequence information for diagnosis of inherited or infectious diseases. The sequence-specific manipulation of DNA is discussed in Chapter 14. The final chapter deals with the implications of large-scale sequencing, including methods for identifying genes and finding errors in DNA sequences, to the development of computer algorithms for the interpretation of DNA sequence information. The text figures are black and white line drawings that, although clearly done, seem a bit primitive for 1999. While I appreciated the simplicity of the drawings, many students accustomed to more colorful presentations will find them wanting. The four color figures in the center of the text seem an afterthought and add little to the text's clarity. Each chapter has a set of additional reading sources, mostly primary sources. Often, specialized topics are offset into boxes that provide clarification and amplification without cluttering the text. An appendix includes a list of the Web-based database resources. As an undergraduate instructor who has previously taught biochemistry, molecular biology, and a course on the human genome, I found many interesting tidbits and amplifications throughout the text. 
I would recommend this book as a text for an advanced undergraduate or beginning graduate course in genomics. Although the text works through several examples of genetic and genome analysis, additional problem/homework sets would need to be developed to ensure student comprehension. The text steers clear of the ethical implications of the Human Genome Initiative and remains true to its subtitle, The Science and Technology.
Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1
Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi
2014-01-01
Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850
Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.
Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi
2014-01-01
Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.
Zhu, X Q; Gasser, R B
1998-06-01
In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that ddF also allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.
Sequence-dependent DNA deformability studied using molecular dynamics simulations.
Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori
2007-01-01
Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
Classification of European Mtdnas from an Analysis of Three European Populations
Torroni, A.; Huoponen, K.; Francalacci, P.; Petrozzi, M.; Morelli, L.; Scozzari, R.; Obinu, D.; Savontaus, M. L.; Wallace, D. C.
1996-01-01
Mitochondrial DNA (mtDNA) sequence variation was examined in Finns, Swedes and Tuscans by PCR amplification and restriction analysis. About 99% of the mtDNAs were subsumed within 10 mtDNA haplogroups (H, I, J, K, M, T, U, V, W, and X) suggesting that the identified haplogroups could encompass virtually all European mtDNAs. Because both hypervariable segments of the mtDNA control region were previously sequenced in the Tuscan samples, the mtDNA haplogroups and control region sequences could be compared. Using a combination of haplogroup-specific restriction site changes and control region nucleotide substitutions, the distribution of the haplogroups was surveyed through the published restriction site polymorphism and control region sequence data of Caucasoids. This supported the conclusion that most haplogroups observed in Europe are Caucasoid-specific, and that at least some of them occur at varying frequencies in different Caucasoid populations. The classification of almost all European mtDNA variation in a number of well defined haplogroups could provide additional insights about the origin and relationships of Caucasoid populations and the process of human colonization of Europe, and is valuable for the definition of the role played by mtDNA backgrounds in the expression of pathological mtDNA mutations. PMID:8978068
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics
NASA Technical Reports Server (NTRS)
Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
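The two tests mentioned in this abstract can be illustrated with a short sketch: a rank-ordered n-tuple frequency list (the input to a Zipf plot) and the Shannon entropy of overlapping n-grams. The function names are illustrative, and the redundancy is normalized here as 1 - H(n)/(2n) bits for a four-letter alphabet, which is an assumption; the abstract does not give the authors' exact definition.

```python
from collections import Counter
from math import log2

def ngram_counts(seq, n):
    """Count overlapping n-grams (n-tuples) in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def zipf_ranking(seq, n):
    """Counts sorted from most to least frequent, i.e. frequency vs. rank."""
    return [count for _, count in ngram_counts(seq, n).most_common()]

def ngram_entropy(seq, n):
    """Shannon entropy (bits) of the empirical n-gram distribution."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def ngram_redundancy(seq, n):
    """Assumed normalization: 1 - H(n) / (2n), since 4 letters give
    at most 2 bits per position."""
    return 1 - ngram_entropy(seq, n) / (2 * n)

toy = "ACGT" * 10
print(zipf_ranking(toy, 2), ngram_entropy(toy, 1), ngram_redundancy(toy, 1))
```

On real data the ranking would be plotted on log-log axes to look for a power-law regime; short toy strings such as the one above only demonstrate the bookkeeping.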
Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido
2008-01-01
Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000-year-old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960
Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao
2016-12-01
Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively paralleled sequencing (touch imprint cytology [TIC]-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA was higher than that of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data using TIC DNA. Most (92%) somatic mutations in lung primary tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
Saito, T; Ochiai, H
1999-10-01
cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also the gain of function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
Applications of statistical physics and information theory to the analysis of DNA sequences
NASA Astrophysics Data System (ADS)
Grosse, Ivo
2000-10-01
DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search of such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
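The mutual information function studied in this thesis can be sketched as follows, assuming the standard definition I(k) = sum over nucleotide pairs (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)), where p_ab(k) is the probability of finding a and b separated by k positions. The function name and the choice to estimate the marginals from the same set of pairs are assumptions made for this illustration; the thesis's estimator and its AMI statistic are not specified in the abstract.

```python
from collections import Counter
from math import log2

def mutual_information(seq, k):
    """Estimate I(k) in bits between symbols k positions apart in seq."""
    pairs = Counter(zip(seq, seq[k:]))   # counts of (seq[i], seq[i + k])
    total = sum(pairs.values())
    left, right = Counter(), Counter()   # marginals from the same pairs
    for (a, b), count in pairs.items():
        left[a] += count
        right[b] += count
    return sum(
        count / total * log2(count * total / (left[a] * right[b]))
        for (a, b), count in pairs.items()
    )

# Profile over distances 1..9 for a toy periodic sequence.
toy = "ACG" * 50
profile = [mutual_information(toy, k) for k in range(1, 10)]
print(profile)
```

For genomic input, plotting such a profile against k is how a period-three oscillation would become visible; the toy sequence above is perfectly periodic, so it only checks the arithmetic.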
Marinospirillum insulare sp. nov., a novel halophilic helical bacterium isolated from kusaya gravy.
Satomi, M; Kimura, B; Hayashi, M; Okuzumi, M; Fujii, T
2004-01-01
A novel species that belongs to the genus Marinospirillum is described on the basis of phenotypic characteristics, phylogenetic analysis of 16S rRNA and gyrB gene sequences and DNA-DNA hybridization. Four strains of helical, halophilic, Gram-negative, heterotrophic bacteria were isolated from kusaya gravy, which is fermented brine that is used for the production of traditional dried fish in the Izu Islands of Japan. All of the new isolates were motile by means of bipolar tuft flagella, of small cell size, coccoid-body-forming and aerophilic; it was concluded that they belong to the same bacterial species, based on DNA-DNA hybridization values (>70% DNA relatedness). DNA G+C contents of the new strains were 42-43 mol% and they had isoprenoid quinone Q-8 as the major component. Phylogenetic analysis of 16S rRNA gene sequences indicated that the new isolates were members of the genus Marinospirillum; sequence similarity of the new isolates to Marinospirillum minutulum, Marinospirillum megaterium and Marinospirillum alkaliphilum was 98.5, 98.2 and 95.2%, respectively. Phylogenetic analysis based on the gyrB gene indicated that the new isolates had enough phylogenetic distance from M. minutulum and M. megaterium to be regarded as different species, with 84.7 and 78.7% sequence similarity, respectively. DNA-DNA hybridization showed that the new isolates had <36% DNA relatedness to M. minutulum and M. megaterium, supporting the phylogenetic conclusion. Thus, a novel species is proposed: Marinospirillum insulare sp. nov. (type strain, KT=LMG 21802T=NBRC 100033T).
Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S; Singh, Rajesh R; Roy-Chowdhuri, Sinchita
2015-08-28
Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), a NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.
Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin
2015-01-01
We report complete sequences of chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genomes sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphism in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692
Crainey, James Lee; Marín, Michel Abanto; Silva, Túllio Romão Ribeiro da; de Medeiros, Jansen Fernandes; Pessoa, Felipe Arley Costa; Santos, Yago Vinícius; Vicente, Ana Carolina Paulo; Luz, Sérgio Luiz Bessa
2018-04-18
Despite the broad distribution of M. ozzardi in Latin America and the Caribbean, there is still very little DNA sequence data available to study this neglected parasite's epidemiology. Mitochondrial DNA (mtDNA) sequences, especially the cytochrome oxidase (CO1) gene's barcoding region, have been targeted successfully for filarial diagnostics and for epidemiological, ecological and evolutionary studies. MtDNA-based studies can, however, be compromised by unrecognised mitochondrial pseudogenes, such as Numts. Here, we have used shot-gun Illumina-HiSeq sequencing to recover the first complete Mansonella genus mitogenome and to identify several mitochondrial-origin pseudogenes. Mitogenome phylogenetic analysis placed M. ozzardi in the Onchocercidae "ONC5" clade and suggested that Mansonella parasites are more closely related to Wuchereria and Brugia genera parasites than they are to Loa genus parasites. DNA sequence alignments, BLAST searches and conceptual translations have been used to complement phylogenetic analysis showing that M. ozzardi from the Amazon and Caribbean regions are near-identical and that previously reported Peruvian M. ozzardi CO1 reference sequences are probably of pseudogene origin. In addition to adding a much-needed resource to the Mansonella genus's molecular tool-kit and providing evidence that some M. ozzardi CO1 sequence deposits are pseudogenes, our results suggest that all Neotropical M. ozzardi parasites are closely related.
2012-01-01
Background: Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings: Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion: To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants. PMID:22883984
Zuiter, Afnan Saeid; Sawwan, Jammal; Al Abdallat, Ayed
2012-08-10
Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.
Cadmium sulfide nanocluster-based electrochemical stripping detection of DNA hybridization.
Zhu, Ningning; Zhang, Aiping; He, Pingang; Fang, Yuzhi
2003-03-01
A novel, sensitive electrochemical DNA hybridization detection assay, using cadmium sulfide (CdS) nanoclusters as the oligonucleotide labeling tag, is described. The assay relies on the hybridization of the target DNA with the CdS nanocluster oligonucleotide DNA probe, followed by the dissolution of the CdS nanoclusters anchored on the hybrids and the indirect determination of the dissolved cadmium ions by sensitive anodic stripping voltammetry (ASV) at a mercury-coated glassy carbon electrode (GCE). The results showed that only a complementary sequence could form a double-stranded dsDNA-CdS with the DNA probe and give an obvious electrochemical response. A three-base mismatch sequence and non-complementary sequence had negligible response. The combination of the large number of cadmium ions released from each dsDNA hybrid with the remarkable sensitivity of the electrochemical stripping analysis for cadmium at mercury-film GCE allows detection at levels as low as 0.2 pmol L(-1) of the complementary sequence of DNA.
Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal
Skoglund, Pontus; Northoff, Bernd H.; Shunkov, Michael V.; Derevianko, Anatoli P.; Pääbo, Svante; Krause, Johannes; Jakobsson, Mattias
2014-01-01
One of the main impediments for obtaining DNA sequences from ancient human skeletons is the presence of contaminating modern human DNA molecules in many fossil samples and laboratory reagents. However, DNA fragments isolated from ancient specimens show a characteristic DNA damage pattern caused by miscoding lesions that differs from present day DNA sequences. Here, we develop a framework for evaluating the likelihood of a sequence originating from a model with postmortem degradation—summarized in a postmortem degradation score—which allows the identification of DNA fragments that are unlikely to originate from present day sources. We apply this approach to a contaminated Neandertal specimen from Okladnikov Cave in Siberia to isolate its endogenous DNA from modern human contaminants and show that the reconstructed mitochondrial genome sequence is more closely related to the variation of Western Neandertals than what was discernible from previous analyses. Our method opens up the potential for genomic analysis of contaminated fossil material. PMID:24469802
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.
Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V
1985-01-01
The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. Images PMID:3016521
Detection of Bacterial Pathogens from Broncho-Alveolar Lavage by Next-Generation Sequencing.
Leo, Stefano; Gaïa, Nadia; Ruppé, Etienne; Emonet, Stephane; Girard, Myriam; Lazarevic, Vladimir; Schrenzel, Jacques
2017-09-20
The applications of whole-metagenome shotgun sequencing (WMGS) in routine clinical analysis are still limited. A combination of a DNA extraction procedure, sequencing, and bioinformatics tools is essential for the removal of human DNA and for improving bacterial species identification in a timely manner. We tackled these issues with a broncho-alveolar lavage (BAL) sample from an immunocompromised patient who had developed severe chronic pneumonia. We extracted DNA from the BAL sample with protocols based either on sequential lysis of human and bacterial cells or on the mechanical disruption of all cells. Metagenomic libraries were sequenced on Illumina HiSeq platforms. Microbial community composition was determined by k-mer analysis or by mapping to taxonomic markers. Results were compared to those obtained by conventional clinical culture and molecular methods. Compared to mechanical cell disruption, a sequential lysis protocol resulted in a significantly increased proportion of bacterial DNA over human DNA and higher sequence coverage of Mycobacterium abscessus, Corynebacterium jeikeium and Rothia dentocariosa, the bacteria reported by clinical microbiology tests. In addition, we identified anaerobic bacteria not searched for by the clinical laboratory. Our results further support the implementation of WMGS in clinical routine diagnosis for bacterial identification.
Vartanian, Jean-Pierre; Wain-Hobson, Simon
2002-05-28
Nuclear mtDNA sequences (numts) are a widespread family of paralogs evolving as pseudogenes in chromosomal DNA [Zhang, D. E. & Hewitt, G. M. (1996) TREE 11, 247-251 and Bensasson, D., Zhang, D., Hartl, D. L. & Hewitt, G. M. (2001) TREE 16, 314-321]. When trying to identify the species origin of an unknown DNA sample by way of an mtDNA locus, PCR may amplify both mtDNA and numts. Indeed, occasionally numts dominate confounding attempts at species identification [Bensasson, D., Zhang, D. X. & Hewitt, G. M. (2000) Mol. Biol. Evol. 17, 406-415; Wallace, D. C., et al. (1997) Proc. Natl. Acad. Sci. USA 94, 14900-14905]. Rhesus and cynomolgus macaque mtDNA haplotypes were identified in a study of oral polio vaccine samples dating from the late 1950s [Blancou, P., et al. (2001) Nature (London) 410, 1045-1046]. They were accompanied by a number of putative numts. To confirm that these putative numts were of macaque origin, a library of numts corresponding to a small segment of 12S rDNA locus has been made by using DNA from a Chinese rhesus macaque. A broad distribution was found with up to 30% sequence variation. Phylogenetic analysis showed that the evolutionary trajectories of numts and bona fide mtDNA haplotypes do not overlap with the signal exception of the host species; mtDNA fragments are continually crossing over into the germ line. In the case of divergent mtDNA sequences from old oral polio vaccine samples [Blancou, P., et al. (2001) Nature (London) 410, 1045-1046], all were closely related to numts in the Chinese macaque library.
Schmidt-Chanasit, Jonas; Bialonski, Alexandra; Heinemann, Patrick; Ulrich, Rainer G; Günther, Stephan; Rabenau, Holger F; Doerr, Hans Wilhelm
2010-07-01
Recently two different herpes simplex virus type 2 (HSV-2) clades (A and B) were described based on DNA sequence data of the glycoprotein E (gE), G (gG) and I (gI) genes. The aim of this study was to type the circulating HSV-2 wild-type strains in Germany by a novel approach and to monitor potential changes in the molecular epidemiology between 1997 and 2008. A total of 64 clinical HSV-2 isolates were analyzed by a novel approach using the DNA sequences of the complete open reading frames of glycoprotein B (gB) and gG. Recombination analysis of the gB and gG gene sequences was performed to reveal intragenic recombinants. Based on the phylogenetic analysis of the gB coding DNA sequence 8 of 64 (12%) isolates were classified as clade A strains and 56 of 64 (88%) isolates were classified as clade B strains. Analysis of the gG coding DNA sequence classified 4 (6%) isolates as clade A strains and 60 (94%) isolates as clade B strains. In comparison, the 8 isolates classified as clade A strains using the gB sequence data were classified as clade B strains when using the gG coding DNA sequence, suggesting intergenic recombination events. Intragenic recombination events were not detected. The first molecular survey of clinical HSV-2 isolates from Germany demonstrated the circulation of clade A and B strains and of intergenic recombinants over a period of 12 years. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Maggi, Elaine C; Gravina, Silvia; Cheng, Haiying; Piperdi, Bilal; Yuan, Ziqiang; Dong, Xiao; Libutti, Steven K; Vijg, Jan; Montagna, Cristina
2018-01-01
The goal of this study was to develop a method for whole genome cell-free DNA (cfDNA) methylation analysis in humans and mice with the ultimate goal to facilitate the identification of tumor derived DNA methylation changes in the blood. Plasma or serum from patients with pancreatic neuroendocrine tumors or lung cancer, and plasma from a murine model of pancreatic adenocarcinoma was used to develop a protocol for cfDNA isolation, library preparation and whole-genome bisulfite sequencing of ultra low quantities of cfDNA, including tumor-specific DNA. The protocol developed produced high quality libraries consistently generating a conversion rate >98% that will be applicable for the analysis of human and mouse plasma or serum to detect tumor-derived changes in DNA methylation.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome
2013-01-01
Background: Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results: Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions: Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
Ancient DNA Reveals Late Pleistocene Existence of Ostriches in Indian Sub-Continent.
Jain, Sonal; Rai, Niraj; Kumar, Giriraj; Pruthi, Parul Aggarwal; Thangaraj, Kumarasamy; Bajpai, Sunil; Pruthi, Vikas
2017-01-01
Ancient DNA (aDNA) analysis of extinct ratite species is of considerable interest as it provides important insights into their origin, evolution, paleogeographical distribution and vicariant speciation in congruence with continental drift theory. In this study, DNA hotspots were detected in fossilized eggshell fragments of ratites (dated ≥25000 years B.P. by radiocarbon dating) using confocal laser scanning microscopy (CLSM). DNA was isolated from five eggshell fragments and a 43 base pair (bp) sequence of a 16S rRNA mitochondrial-conserved region was successfully amplified and sequenced from one of the samples. Phylogenetic analysis of the DNA sequence revealed a 92% identity of the fossil eggshells to Struthio camelus and their position basal to other palaeognaths, consistent with the vicariant speciation model. Our study provides the first molecular evidence for the presence of ostriches in India, complementing the continental drift theory of biogeographical movement of ostriches in India, and opening up a new window into the evolutionary history of ratites.
NASA Astrophysics Data System (ADS)
Walker, David Lee
1999-12-01
This study uses dynamical analysis to examine in a quantitative fashion the information coding mechanism in DNA sequences. This exceeds the simple dichotomy of either modeling the mechanism by comparing DNA sequence walks as Fractal Brownian Motion (fbm) processes. The 2-D mappings of the DNA sequences for this research are from Iterated Function System (IFS) (also known as the "Chaos Game Representation" (CGR)) mappings of the DNA sequences. This technique converts a 1-D sequence into a 2-D representation that preserves subsequence structure and provides a visual representation. The second step of this analysis involves the application of Wavelet Packet Transforms, a recently developed technique from the field of signal processing. A multi-fractal model is built by using wavelet transforms to estimate the Hurst exponent, H. The Hurst exponent is a non-parametric measurement of the dynamism of a system. This procedure is used to evaluate gene-coding events in the DNA sequence of cystic fibrosis mutations. The H exponent is calculated for various mutation sites in this gene. The results of this study indicate the presence of anti-persistent, random walks and persistent "sub-periods" in the sequence. This indicates the hypothesis of a multi-fractal model of DNA information encoding warrants further consideration. This work examines the model's behavior in both pathological (mutations) and non-pathological (healthy) base pair sequences of the cystic fibrosis gene. These mutations both natural and synthetic were introduced by computer manipulation of the original base pair text files. The results show that disease severity and system "information dynamics" correlate. These results have implications for genetic engineering as well as in mathematical biology. They suggest that there is scope for more multi-fractal models to be developed.
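The Chaos Game Representation step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the corner assignment in `CORNERS` is one common convention rather than necessarily the one used in the study, and `chaos_game_points` is a hypothetical helper name. Each base moves the current point halfway toward that base's corner of the unit square, so points that share a suffix of bases land in the same sub-square, which is how subsequence structure is preserved.

```python
# Assumed corner convention for the unit square; other assignments are in use.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game_points(seq, start=(0.5, 0.5)):
    """Map a 1-D DNA string to a list of 2-D points (hypothetical helper)."""
    x, y = start
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```

The resulting points can be binned into a 2-D histogram or scatter-plotted to give the visual representation the abstract refers to; the wavelet and Hurst-exponent steps are not covered by this sketch.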
Impact of cultivation on characterisation of species composition of soil bacterial communities.
McCaig, A E.; Grayston, S J.; Prosser, J I.; Glover, L A.
2001-03-01
The species composition of culturable bacteria in Scottish grassland soils was investigated using a combination of Biolog and 16S rDNA analysis for characterisation of isolates. The inclusion of a molecular approach allowed direct comparison of sequences from culturable bacteria with sequences obtained during analysis of DNA extracted directly from the same soil samples. Bacterial strains were isolated on Pseudomonas isolation agar (PIA), a selective medium, and on tryptone soya agar (TSA), a general laboratory medium. In total, 12 and 21 morphologically different bacterial cultures were isolated on PIA and TSA, respectively. Biolog and sequencing placed PIA isolates in the same taxonomic groups, the majority of cultures belonging to the Pseudomonas (sensu stricto) group. However, analysis of 16S rDNA sequences proved more efficient than Biolog for characterising TSA isolates due to limitations of the Microlog database for identifying environmental bacteria. In general, 16S rDNA sequences from TSA isolates showed high similarities to cultured species represented in sequence databases, although TSA-8 showed only 92.5% similarity to the nearest relative, Bacillus insolitus. In general, there was very little overlap between the culturable and uncultured bacterial communities, although two sequences, PIA-2 and TSA-13, showed >99% similarity to soil clones. A cloning step was included prior to sequence analysis of two isolates, TSA-5 and TSA-14, and analysis of several clones confirmed that these cultures comprised at least four and three sequence types, respectively. All isolate clones were most closely related to uncultured bacteria, with clone TSA-5.1 showing 99.8% similarity to a sequence amplified directly from the same soil sample. Interestingly, one clone, TSA-5.4, clustered within a novel group comprising only uncultured sequences. 
This group, which is associated with the novel, deep-branching Acidobacterium capsulatum lineage, also included clones isolated during direct analysis of the same soil and from a wide range of other sample types studied elsewhere. The study demonstrates the value of fine-scale molecular analysis for identification of laboratory isolates and indicates the culturability of approximately 1% of the total population but under a restricted range of media and cultivation conditions.
ChIP-chip versus ChIP-seq: Lessons for experimental design and data analysis
2011-01-01
Background: Chromatin immunoprecipitation (ChIP) followed by microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) allows genome-wide discovery of protein-DNA interactions such as transcription factor bindings and histone modifications. Previous reports only compared a small number of profiles, and little has been done to compare histone modification profiles generated by the two technologies or to assess the impact of input DNA libraries in ChIP-seq analysis. Here, we performed a systematic analysis of a modENCODE dataset consisting of 31 pairs of ChIP-chip/ChIP-seq profiles of the coactivator CBP, RNA polymerase II (RNA PolII), and six histone modifications across four developmental stages of Drosophila melanogaster. Results: Both technologies produce highly reproducible profiles within each platform; ChIP-seq generally produces profiles with a better signal-to-noise ratio and allows detection of more peaks and narrower peaks. The set of peaks identified by the two technologies can be significantly different, but the extent to which they differ varies depending on the factor and the analysis algorithm. Importantly, we found that there is a significant variation among multiple sequencing profiles of input DNA libraries and that this variation most likely arises from both differences in experimental condition and sequencing depth. We further show that using an inappropriate input DNA profile can impact the average signal profiles around genomic features and peak calling results, highlighting the importance of having high quality input DNA data for normalization in ChIP-seq analysis. Conclusions: Our findings highlight the biases present in each of the platforms, show the variability that can arise from both technology and analysis methods, and emphasize the importance of obtaining high quality and deeply sequenced input DNA libraries for ChIP-seq analysis. PMID:21356108
Hoy, Marshal S.; Rodriguez, Rusty J.
2013-01-01
Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.
Eduardoff, Mayra; Xavier, Catarina; Strobl, Christina; Casas-Vargas, Andrea; Parson, Walther
2017-01-01
The analysis of mitochondrial DNA (mtDNA) has proven useful in forensic genetics and ancient DNA (aDNA) studies, where specimens are often highly compromised and DNA quality and quantity are low. In forensic genetics, the mtDNA control region (CR) is commonly sequenced using established Sanger-type Sequencing (STS) protocols involving fragment sizes down to approximately 150 base pairs (bp). Recent developments include Massively Parallel Sequencing (MPS) of (multiplex) PCR-generated libraries using the same amplicon sizes. Molecular genetic studies on archaeological remains that harbor more degraded aDNA have pioneered alternative approaches to target mtDNA, such as capture hybridization and primer extension capture (PEC) methods followed by MPS. These assays target smaller mtDNA fragment sizes (down to 50 bp or less), and have proven to be substantially more successful in obtaining useful mtDNA sequences from these samples compared to electrophoretic methods. Here, we present the modification and optimization of a PEC method, earlier developed for sequencing the Neanderthal mitochondrial genome, with forensic applications in mind. Our approach was designed for a more sensitive enrichment of the mtDNA CR in a single tube assay and short laboratory turnaround times, thus complying with forensic practices. We characterized the method using sheared, high quantity mtDNA (six samples), and tested challenging forensic samples (n = 2) as well as compromised solid tissue samples (n = 15) up to 8 kyrs of age. The PEC MPS method produced reliable and plausible mtDNA haplotypes that were useful in the forensic context. It yielded plausible data in samples that did not provide results with STS and other MPS techniques. We addressed the issue of contamination by including four generations of negative controls, and discuss the results in the forensic context. 
We finally offer perspectives for future research to enable the validation and accreditation of the PEC MPS method for final implementation in forensic genetic laboratories. PMID:28934125
Albayrak, Levent; Khanipov, Kamil; Pimenova, Maria; Golovko, George; Rojas, Mark; Pavlidis, Ioannis; Chumakov, Sergei; Aguilar, Gerardo; Chávez, Arturo; Widger, William R; Fofanov, Yuriy
2016-12-12
Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The size of the longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome was found to be as low as 11 bases, which not only allows using these regions to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into the human nuclear DNA. Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere was found to be 44.5 and 67.9%, respectively, whereas low-abundance mutations at 100% of locations can be identified using 417-base-long single reads. This observation suggests that the analysis of low-abundance variations in mitochondrial populations can be extended to a variety of large data collections such as NCBI Sequence Read Archive, European Nucleotide Archive, The Cancer Genome Atlas, and International Cancer Genome Consortium.
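The idea of asking which read lengths are free of nDNA interference can be illustrated with a simplified sketch. It assumes exact matching only, ignoring the one-mismatch case, reverse complements and the circularity of the mitochondrial genome that a real analysis would need to handle. The function names `kmers` and `fraction_unambiguous` are hypothetical and do not come from the paper.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fraction_unambiguous(mt_seq, nuc_seq, read_len):
    """Fraction of mtDNA read windows with no exact copy in the nuclear sequence.

    Hypothetical helper: a window counts as 'unambiguous' when its exact
    sequence does not occur anywhere in nuc_seq (forward strand only).
    """
    nuc_kmers = kmers(nuc_seq, read_len)
    windows = [mt_seq[i:i + read_len]
               for i in range(len(mt_seq) - read_len + 1)]
    if not windows:
        raise ValueError("read_len exceeds the mitochondrial sequence length")
    clean = sum(1 for w in windows if w not in nuc_kmers)
    return clean / len(windows)
```

Calling `fraction_unambiguous` with increasing `read_len` values mirrors, in a simplified way, the comparison across 36, 76 and 150 base reads reported above: longer windows are less likely to have an exact nuclear copy, so the fraction rises toward 1.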
Nanopore sequencing in microgravity
McIntyre, Alexa B R; Rizzardi, Lindsay; Yu, Angela M; Alexander, Noah; Rosen, Gail L; Botkin, Douglas J; Stahl, Sarah E; John, Kristen K; Castro-Wallace, Sarah L; McGrath, Ken; Burton, Aaron S; Feinberg, Andrew P; Mason, Christopher E
2016-01-01
Rapid DNA sequencing and analysis has been a long-sought goal in remote research and point-of-care medicine. In microgravity, DNA sequencing can facilitate novel astrobiological research and close monitoring of crew health, but spaceflight places stringent restrictions on the mass and volume of instruments, crew operation time, and instrument functionality. The recent emergence of portable, nanopore-based tools with streamlined sample preparation protocols finally enables DNA sequencing on missions in microgravity. As a first step toward sequencing in space and aboard the International Space Station (ISS), we tested the Oxford Nanopore Technologies MinION during a parabolic flight to understand the effects of variable gravity on the instrument and data. In a successful proof-of-principle experiment, we found that the instrument generated DNA reads over the course of the flight, including the first ever sequenced in microgravity, and additional reads measured after the flight concluded its parabolas. Here we detail modifications to the sample-loading procedures to facilitate nanopore sequencing aboard the ISS and in other microgravity environments. We also evaluate existing analysis methods and outline two new approaches, the first based on a wave-fingerprint method and the second on entropy signal mapping. Computationally light analysis methods offer the potential for in situ species identification, but are limited by the error profiles (stays, skips, and mismatches) of older nanopore data. Higher accuracies attainable with modified sample processing methods and the latest version of flow cells will further enable the use of nanopore sequencers for diagnostics and research in space. PMID:28725742
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benasutti, M.; Ejadi, S.; Whitlow, M.D.
The mutagenic and carcinogenic chemical aflatoxin B1 (AFB1) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB1 oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB1 oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB1 oxide prefers to react with guanines in some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB1 oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determined by high-pressure liquid chromatography. These results reveal that for AFB1 oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine that the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB1 oxide in its reaction with DNA.
Chan, K. C. Allen; Jiang, Peiyong; Sun, Kun; Cheng, Yvonne K. Y.; Tong, Yu K.; Cheng, Suk Hang; Wong, Ada I. C.; Hudecova, Irena; Leung, Tak Y.; Chiu, Rossa W. K.; Lo, Yuk Ming Dennis
2016-01-01
Plasma DNA obtained from a pregnant woman was sequenced to a depth of 270× haploid genome coverage. Comparing the maternal plasma DNA sequencing data with the parental genomic DNA data and using a series of bioinformatics filters, fetal de novo mutations were detected at a sensitivity of 85% and a positive predictive value of 74%. These results represent a 169-fold improvement in the positive predictive value over previous attempts. Improvements in the interpretation of the sequence information of every base position in the genome allowed us to interrogate the maternal inheritance of the fetus for 618,271 of 656,676 (94.2%) heterozygous SNPs within the maternal genome. The fetal genotype at each of these sites was deduced individually, unlike previously, where the inheritance was determined for a collection of sites within a haplotype. These results represent a 90-fold enhancement in the resolution in determining the fetus’s maternal inheritance. Selected genomic locations were more likely to be found at the ends of plasma DNA molecules. We found that a subset of such preferred ends exhibited selectivity for fetal- or maternal-derived DNA in maternal plasma. The ratio of the number of maternal plasma DNA molecules with fetal preferred ends to those with maternal preferred ends showed a correlation with the fetal DNA fraction. Finally, this second generation approach for noninvasive fetal whole-genome analysis was validated in a pregnancy diagnosed with cardiofaciocutaneous syndrome with maternal plasma DNA sequenced to 195× coverage. The causative de novo BRAF mutation was successfully detected through the maternal plasma DNA analysis. PMID:27799561
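The performance figures quoted above follow from standard definitions, which the sketch below spells out. The function names are illustrative, and the true-positive, false-negative, and false-positive counts passed in are placeholders chosen to reproduce the stated rates, not counts from the study; only the two SNP totals are taken from the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real mutations that were detected."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of reported mutations that are real."""
    return true_pos / (true_pos + false_pos)


# SNP counts quoted in the abstract.
resolved_snps = 618_271
heterozygous_snps = 656_676
resolved_percent = round(100 * resolved_snps / heterozygous_snps, 1)

# Placeholder counts (assumed, per 100 events) matching the stated 85% and 74%.
example_sensitivity = sensitivity(true_pos=85, false_neg=15)
example_ppv = positive_predictive_value(true_pos=74, false_pos=26)
```

The rounded ratio of the two SNP counts agrees with the 94.2% stated in the abstract.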
Uchoi, Ajit; Malik, Surendra Kumar; Choudhary, Ravish; Kumar, Susheel; Rohini, M R; Pal, Digvender; Ercisli, Sezai; Chaudhury, Rekha
2016-06-01
Phylogenetic relationships of Indian Citron (Citrus medica L.) with other important Citrus species have been inferred through sequence analyses of the rbcL and matK gene regions of chloroplast DNA. The study was based on 23 accessions of Citrus genotypes representing 15 taxa of Indian Citrus, collected from wild, semi-wild, and domesticated stocks. The phylogeny was inferred using the maximum parsimony (MP) and neighbor-joining (NJ) methods. Both MP and NJ trees separated all the 23 accessions of Citrus into five distinct clusters. The chloroplast DNA (cpDNA) analysis based on rbcL and matK sequence data carried out in Indian taxa of Citrus was useful in differentiating all the true species and species/varieties of probable hybrid origin in distinct clusters or groups. Sequence analysis based on the rbcL and matK genes provided unambiguous identification and disposition of true species like C. maxima, C. medica, C. reticulata, and related hybrids/cultivars. The separation of C. maxima, C. medica, and C. reticulata in distinct clusters or sub-clusters supports their distinctiveness as the basic species of edible Citrus. However, the cpDNA sequence analysis of the rbcL and matK genes could not find any clear-cut differentiation between subgenera Citrus and Papeda as proposed in Swingle's system of classification.
Frye, Mark A; Ryu, Euijung; Nassan, Malik; Jenkins, Gregory D; Andreazza, Ana C; Evans, Jared M; McElroy, Susan L; Oglesbee, Devin; Highsmith, W Edward; Biernacka, Joanna M
2017-01-01
Converging genetic, postmortem gene-expression, cellular, and neuroimaging data implicate mitochondrial dysfunction in bipolar disorder. This study was conducted to investigate whether mitochondrial DNA (mtDNA) haplogroups and single nucleotide variants (SNVs) are associated with sub-phenotypes of bipolar disorder. MtDNA from 224 patients with Bipolar I disorder (BPI) was sequenced, and association of sequence variations with 3 sub-phenotypes (psychosis, rapid cycling, and adolescent illness onset) was evaluated. Gene-level tests were performed to evaluate overall burden of minor alleles for each phenotype. The haplogroup U was associated with a higher risk of psychosis. Secondary analyses of SNVs provided nominal evidence for association of psychosis with variants in the tRNA, ND4 and ND5 genes. The association of psychosis with ND4 (gene that encodes NADH dehydrogenase 4) was further supported by gene-level analysis. Preliminary analysis of mtDNA sequence data suggests a higher risk of psychosis with the U haplogroup and variation in the ND4 gene implicated in electron transport chain energy regulation. Further investigation of the functional consequences of this mtDNA variation is encouraged. Copyright © 2016. Published by Elsevier Ltd.
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
[Development of laboratory sequence analysis software based on WWW and UNIX].
Huang, Y; Gu, J R
2001-01-01
Sequence analysis tools based on WWW and UNIX were developed in our laboratory to meet the needs of molecular genetics research in our laboratory. General principles of computer analysis of DNA and protein sequences were also briefly discussed in this paper.
Bergman, C M; Kreitman, M
2001-08-01
Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
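The notion of ungapped conserved blocks and their length distribution can be sketched as follows. This assumes a simple definition, a maximal run of identical, gap-free columns in a pairwise alignment, which may differ from the criteria used in the study; the function name, the `min_len` parameter, and the toy alignment are hypothetical.

```python
from statistics import median


def conserved_block_lengths(a: str, b: str, min_len: int = 1) -> list[int]:
    """Lengths of maximal runs of identical, gap-free columns in a
    pairwise alignment given as two equal-length strings ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            run += 1  # column is conserved, extend the current block
        else:
            if run >= min_len:
                lengths.append(run)
            run = 0  # mismatch or gap ends the block
    if run >= min_len:
        lengths.append(run)  # close a block that reaches the alignment end
    return lengths


# Toy alignment (hypothetical).
seq_a = "ACGT-ACGGTTA"
seq_b = "ACGTTAC-GTAA"
blocks = conserved_block_lengths(seq_a, seq_b)
median_block = median(blocks)
conserved_fraction = sum(blocks) / len(seq_a)
```

Raising `min_len` would restrict the summary to longer blocks, which is one way such an analysis could filter out chance matches.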
Barcode extension for analysis and reconstruction of structures
NASA Astrophysics Data System (ADS)
Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng
2017-03-01
Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Barcode extension for analysis and reconstruction of structures
Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng
2017-01-01
Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. PMID:28287117
Forlano, M D; Teixeira, K R S; Scofield, A; Elisei, C; Yotoko, K S C; Fernandes, K R; Linhares, G F C; Ewing, S A; Massard, C L
2007-04-10
To characterize phylogenetically the species that causes canine hepatozoonosis in two rural areas of Rio de Janeiro State, Brazil, we used universal or Hepatozoon spp. primer sets for the 18S SSU rRNA coding region. DNA extracts were obtained from blood samples of thirteen naturally infected dogs, four experimentally infected dogs, and five puppies infected by vertical transmission from an experimentally infected dam. DNA of sporozoites of Hepatozoon americanum was used as positive control. The amplification of DNA extracts from blood of dogs infected with sporozoites of Hepatozoon spp. was observed in the presence of primers to the 18S SSU rRNA gene of Hepatozoon spp., whereas DNA of H. americanum sporozoites was amplified in the presence of either universal or Hepatozoon spp.-specific primer sets; the amplified products were approximately 600 bp in size. Cloned PCR products obtained from DNA extracts of blood from two dogs experimentally infected with Hepatozoon sp. were sequenced. The consensus sequence, derived from six sequence data sets, was compared by BLAST against sequences of 18S SSU rRNA of Hepatozoon spp. available at GenBank and aligned to homologous sequences to perform the phylogenetic analysis. This analysis clearly showed that our sequence clustered, independently of H. americanum sequences, within a group comprising other Hepatozoon canis sequences. Our results confirmed the hypothesis that the agent causing hepatozoonosis in the areas studied in Brazil is H. canis, supporting previous reports that were based on morphological and morphometric analyses.
DNA methylation assessment from human slow- and fast-twitch skeletal muscle fibers
Begue, Gwénaëlle; Raue, Ulrika; Jemiolo, Bozena
2017-01-01
A new application of the reduced representation bisulfite sequencing method was developed using low-DNA input to investigate the epigenetic profile of human slow- and fast-twitch skeletal muscle fibers. Successful library construction was completed with as little as 15 ng of DNA, and high-quality sequencing data were obtained with 32 ng of DNA. Analysis identified 143,160 differentially methylated CpG sites across 14,046 genes. In both fiber types, selected genes predominantly expressed in slow or fast fibers were hypomethylated, which was supported by the RNA-sequencing analysis. These are the first fiber type-specific methylation data from human skeletal muscle and provide a unique platform for future research. NEW & NOTEWORTHY This study validates a low-DNA input reduced representation bisulfite sequencing method for human muscle biopsy samples to investigate the methylation patterns at a fiber type-specific level. These are the first fiber type-specific methylation data reported from human skeletal muscle and thus provide initial insight into basal state differences in myosin heavy chain I and IIa muscle fibers among young, healthy men. PMID:28057818
Complexity and Entropy Analysis of DNMT1 Gene
USDA-ARS's Scientific Manuscript database
Background: The application of complexity information on DNA sequence and protein in biological processes is well established in this study. Available sequences for the DNMT1 gene, which is a maintenance methyltransferase responsible for copying DNA methylation patterns to the daughter strands durin...
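As a generic illustration of an entropy measure on a DNA sequence, the sketch below computes the Shannon entropy of base composition in bits per base. The truncated abstract does not say which complexity or entropy measures were applied to DNMT1, so this is an assumed example, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    # Sum -p * log2(p) over the observed symbols only.
    return -sum((c / total) * log2(c / total) for c in counts.values())


uniform_entropy = shannon_entropy("ACGT")    # four equally frequent bases
two_base_entropy = shannon_entropy("AACC")   # two equally frequent bases
```

A sequence using all four bases equally often reaches the maximum of 2 bits per base, while a homopolymer has zero entropy.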
Saladino, R; Crestini, C; Mincione, E; Costanzo, G; Di Mauro, E; Negri, R
1997-11-01
We describe the reaction of formamide with 2'-deoxycytidine to give pyrimidine ring opening by nucleophilic addition on the electrophilic C(6) and C(4) positions. This information is confirmed by the analysis of the products of formamide attack on 2'-deoxycytidine, 5-methyl-2'-deoxycytidine, and 5-bromo-2'-deoxycytidine residues when the latter are incorporated into oligonucleotides by DNA polymerase-driven polymerization and the solid-phase phosphoramidite procedure. The increased sensitivity of 5-bromo-2'-deoxycytidine relative to that of 2'-deoxycytidine is pivotal for the improvement of the one-lane chemical DNA sequencing procedure based on the base-selective reaction of formamide with DNA. In many DNA sequencing cases it will in fact be possible to incorporate this base analogue into the DNA to be sequenced, thus providing a complete discrimination between its UV absorption signal and that of the thymidine residues. The wide spectrum of different sensitivities to formamide displayed by the 2'-deoxycytidine analogues solves, in the DNA single-lane chemical sequencing procedure, the possible source of errors due to low discrimination between C and T residues.
FastID: Extremely Fast Forensic DNA Comparisons
2017-05-19
FastID: Extremely Fast Forensic DNA Comparisons Darrell O. Ricke, PhD Bioengineering Systems & Technologies Massachusetts Institute of...Technology Lincoln Laboratory Lexington, MA USA Darrell.Ricke@ll.mit.edu Abstract—Rapid analysis of DNA forensic samples can have a critical impact on...time sensitive investigations. Analysis of forensic DNA samples by massively parallel sequencing is creating the next gold standard for DNA
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.
Wimmer, Katharina; Wernstedt, Annekatrin
2014-01-01
The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.
Tooley, Paul W; Bandyopadhyay, Ranajit; Carras, Marie M; Pazoutová, Sylvie
2006-04-01
Isolates of Claviceps causing ergot on sorghum in India were analysed by AFLP analysis, and by analysis of DNA sequences of the EF-1alpha gene intron 4 and beta-tubulin gene intron 3 region. Of 89 isolates assayed from six states in India, four were determined to be C. sorghi, and the rest C. africana. A relatively low level of genetic diversity was observed within the Indian C. africana population. No evidence of genetic exchange between C. africana and C. sorghi was observed in either AFLP or DNA sequence analysis. Phylogenetic analysis was conducted using DNA sequences from 14 different Claviceps species. A multigene phylogeny based on the EF-1alpha gene intron 4, the beta-tubulin gene intron 3 region, and rDNA showed that C. sorghi grouped most closely with C. gigantea and C. africana. Although the Claviceps species we analysed were closely related, they colonize hosts that are taxonomically very distinct, suggesting that there is no direct coevolution of Claviceps with its hosts.
Vera-Rodriguez, M; Diez-Juan, A; Jimenez-Almazan, J; Martinez, S; Navarro, R; Peinado, V; Mercader, A; Meseguer, M; Blesa, D; Moreno, I; Valbuena, D; Rubio, C; Simon, C
2018-04-01
What is the origin and composition of cell-free DNA in human embryo spent culture media? Cell-free DNA from human embryo spent culture media represents a mix of maternal and embryonic DNA, and the mixture can be more complex for mosaic embryos. In 2016, ~300 000 human embryos were chromosomally and/or genetically analyzed using preimplantation genetic testing for aneuploidies (PGT-A) or monogenic disorders (PGT-M) before transfer into the uterus. While progress in genetic techniques has enabled analysis of the full karyotype in a single cell with high sensitivity and specificity, these approaches still require an embryo biopsy. Thus, non-invasive techniques are sought as an alternative. This study was based on a total of 113 human embryos undergoing trophectoderm biopsy as part of PGT-A analysis. For each embryo, the spent culture media used between Day 3 and Day 5 of development were collected for cell-free DNA analysis. In addition to the 113 spent culture media samples, 28 media drops without embryo contact were cultured in parallel under the same conditions to use as controls. In total, 141 media samples were collected and divided into two groups: one for direct DNA quantification (53 spent culture media and 17 controls), the other for whole-genome amplification (60 spent culture media and 11 controls) and subsequent quantification. Some samples with amplified DNA (N = 56) were used for aneuploidy testing by next-generation sequencing; of those, 35 samples underwent single-nucleotide polymorphism (SNP) sequencing to detect maternal contamination. Finally, from the 35 spent culture media analyzed by SNP sequencing, 12 whole blastocysts were analyzed by fluorescence in situ hybridization (FISH) to determine the level of mosaicism in each embryo, as a possible origin for discordance between sample types. 
Trophectoderm biopsies and culture media samples (20 μl) underwent whole-genome amplification, then libraries were generated and sequenced for an aneuploidy study. For SNP sequencing, triads including trophectoderm DNA, cell-free DNA, and follicular fluid DNA were analyzed. In total, 124 SNPs were included with 90 SNPs distributed among all autosomes and 34 SNPs located on chromosome Y. Finally, 12 whole blastocysts were fixed and individual cells were analyzed by FISH using telomeric/centromeric probes for the affected chromosomes. We found a higher quantity of cell-free DNA in spent culture media co-cultured with embryos versus control media samples (P ≤ 0.001). The presence of cell-free DNA in the spent culture media enabled a chromosomal diagnosis, although results differed from those of trophectoderm biopsy analysis in most cases (67%). Discordant results were mainly attributable to a high percentage of maternal DNA in the spent culture media, with a median percentage of embryonic DNA estimated at 8%. Finally, from the discordant cases, 91.7% of whole blastocysts analyzed by FISH were mosaic and 75% of the analyzed chromosomes were concordant with the trophectoderm DNA diagnosis instead of the cell-free DNA result. This study was limited by the sample size and the number of cells analyzed by FISH. This is the first study to combine chromosomal analysis of cell-free DNA, SNP sequencing to identify maternal contamination, and whole-blastocyst analysis for detecting mosaicism. Our results provide a better understanding of the origin of cell-free DNA in spent culture media, offering an important step toward developing future non-invasive karyotyping that must rely on the specific identification of DNA released from human embryos. This work was funded by Igenomix S.L. There are no competing interests.
A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.
Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John
2013-01-01
DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode-compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.
DNA-Encoded Solid-Phase Synthesis: Encoding Language Design and Complex Oligomer Library Synthesis.
MacConnell, Andrew B; McEnaney, Patrick J; Cavett, Valerie J; Paegel, Brian M
2015-09-14
The promise of exploiting combinatorial synthesis for small molecule discovery remains unfulfilled due primarily to the "structure elucidation problem": the back-end mass spectrometric analysis that significantly restricts one-bead-one-compound (OBOC) library complexity. The very molecular features that confer binding potency and specificity, such as stereochemistry, regiochemistry, and scaffold rigidity, are conspicuously absent from most libraries because isomerism introduces mass redundancy and diverse scaffolds yield uninterpretable MS fragmentation. Here we present DNA-encoded solid-phase synthesis (DESPS), comprising parallel compound synthesis in organic solvent and aqueous enzymatic ligation of unprotected encoding dsDNA oligonucleotides. Computational encoding language design yielded 148 thermodynamically optimized sequences with Hamming string distance ≥ 3 and total read length <100 bases for facile sequencing. Ligation is efficient (70% yield), specific, and directional over 6 encoding positions. A series of isomers served as a testbed for DESPS's utility in split-and-pool diversification. Single-bead quantitative PCR detected 9 × 10^4 molecules/bead and sequencing allowed for elucidation of each compound's synthetic history. We applied DESPS to the combinatorial synthesis of a 75,645-member OBOC library containing scaffold, stereochemical and regiochemical diversity using mixed-scale resin (160-μm quality control beads and 10-μm screening beads). Tandem DNA sequencing/MALDI-TOF MS analysis of 19 quality control beads showed excellent agreement (<1 ppt) between DNA sequence-predicted mass and the observed mass. DESPS synergistically unites the advantages of solid-phase synthesis and DNA encoding, enabling single-bead structural elucidation of complex compounds and synthesis using reactions normally considered incompatible with unprotected DNA. 
The widespread availability of inexpensive oligonucleotide synthesis, enzymes, DNA sequencing, and PCR make implementation of DESPS straightforward, and may prompt the chemistry community to revisit the synthesis of more complex and diverse libraries.
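The Hamming-distance constraint on the encoding language (pairwise distance ≥ 3) can be illustrated with a minimal sketch. This is not the authors' design procedure: it uses a plain greedy filter, ignores the thermodynamic optimization entirely, and the word length of 5 and the function names are assumptions chosen only for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def greedy_code(length: int, min_dist: int) -> list[str]:
    """Greedily keep every candidate word that is at least `min_dist`
    away from all words kept so far (hypothetical helper, not DESPS code)."""
    kept: list[str] = []
    for letters in product("ACGT", repeat=length):
        word = "".join(letters)
        if all(hamming(word, k) >= min_dist for k in kept):
            kept.append(word)
    return kept


# Illustrative parameters only: 5-base words, minimum distance 3.
codebook = greedy_code(length=5, min_dist=3)
```

With a minimum distance of 3, a read carrying a single substitution is still closer to its original code word than to any other, which is one plausible reason to choose such a threshold for sequencing-based decoding.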
Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N
2016-04-01
Wild ginseng, Panax ginseng Meyer, is an endangered medicinal plant species. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed by one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440-640 bp and was distributed across variable regions, expansion segments, and conserved elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.
2015-01-01
Background: Wild ginseng, Panax ginseng Meyer, is an endangered medicinal plant species. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. Methods: The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results: In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed by one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440–640 bp and was distributed across variable regions, expansion segments, and conserved elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion: This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.
Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C
2018-01-10
Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and the methyltransferase enzyme DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies suggested that, when DNMT1 binds to this consensus DNA motif and methylates it (C residues of CpG islands), mutation is likely to occur.
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.
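Scanning a sequence for the reported consensus can be sketched as below. The abstract writes the motif as "T/AGC/GAGGA/TG"; the code assumes each slash marks two alternatives at a single position, i.e. (T/A)G(C/G)AGG(A/T)G, which is an interpretation rather than something the abstract states. The function name and the test sequence are hypothetical.

```python
import re

# ASSUMPTION: "T/AGC/GAGGA/TG" is read as (T/A) G (C/G) A G G (A/T) G.
# The lookahead makes the scan report overlapping occurrences as well.
MOTIF = re.compile(r"(?=([TA]G[CG]AGG[AT]G))")


def find_motif(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 8-mer) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Made-up sequence containing two variants of the assumed motif.
hits = find_motif("ccTGCAGGAGttAGGAGGTGaa")
```

Only the forward strand is scanned here; a fuller search would also scan the reverse complement.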
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.
Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi
2017-07-01
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays
Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko
2014-01-01
The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that the DNA strand breakage induced by low-energy electrons (18 eV) also depends on the nucleotide sequence. To determine the absolute cross sections for electron-induced single strand breaks in specific 13-mer oligonucleotides, we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10⁻¹⁴ cm² and 7.06 × 10⁻¹⁴ cm². The highest cross sections were found for 5′-TT(ATA)₃TT and 5′-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer that has been discussed for use in cancer radiation therapy. The replacement of T by BrU in the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections, resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
Polymenakou, Paraskevi N; Bertilsson, Stefan; Tselepides, Anastasios; Stephanou, Euripides G
2005-10-01
The regional variability of sediment bacterial community composition and diversity was studied by comparative analysis of four large 16S ribosomal DNA (rDNA) clone libraries from sediments in different regions of the Eastern Mediterranean Sea (Thermaikos Gulf, Cretan Sea, and South Ionian Sea). Amplified rDNA restriction analysis of 664 clones from the libraries indicated that rDNA richness and evenness were high: for example, there was a near-1:1 relationship between the number of screened clones and the number of unique restriction patterns when up to 190 clones were screened for each library. Phylogenetic analysis of 207 bacterial 16S rDNA sequences from the sediment libraries demonstrated that Gamma-, Delta-, and Alphaproteobacteria, Holophaga/Acidobacteria, Planctomycetales, Actinobacteria, Bacteroidetes, and Verrucomicrobia were represented in all four libraries. A few clones also grouped with the Betaproteobacteria, Nitrospirae, Spirochaetales, Chlamydiae, Firmicutes, and candidate division OP11. The abundance of sequences affiliated with Gammaproteobacteria was higher in libraries from shallow sediments in the Thermaikos Gulf (30 m) and the Cretan Sea (100 m) compared to the deeper South Ionian station (2790 m). Most sequences in the four sediment libraries clustered with uncultured 16S rDNA phylotypes from marine habitats, and many of the closest matches were clones from hydrocarbon seeps, benzene-mineralizing consortia, sulfate reducers, sulfur oxidizers, and ammonia oxidizers. LIBSHUFF statistics of 16S rDNA gene sequences from the four libraries revealed major differences, indicating either a very high richness in the sediment bacterial communities or considerable variability in bacterial community composition among regions, or both.
Marshall, Charla; Sturk-Andreaggi, Kimberly; Daniels-Higginbotham, Jennifer; Oliver, Robert Sean; Barritt-Ross, Suzanne; McMahon, Timothy P
2017-11-01
Next-generation ancient DNA technologies have the potential to assist in the analysis of degraded DNA extracted from forensic specimens. Mitochondrial genome (mitogenome) sequencing, specifically, may be of benefit to samples that fail to yield forensically relevant genetic information using conventional PCR-based techniques. This report summarizes the Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory's (AFMES-AFDIL) performance evaluation of a Next-Generation Sequencing protocol for degraded and chemically treated past accounting samples. The procedure involves hybridization capture for targeted enrichment of mitochondrial DNA, massively parallel sequencing using Illumina chemistry, and an automated bioinformatic pipeline for forensic mtDNA profile generation. A total of 22 non-probative samples and associated controls were processed in the present study, spanning a range of DNA quantity and quality. Data were generated from over 100 DNA libraries by ten DNA analysts over the course of five months. The results show that the mitogenome sequencing procedure is reliable and robust, sensitive to low template (one ng control DNA) as well as degraded DNA, and specific to the analysis of the human mitogenome. Haplotypes were overall concordant between NGS replicates and with previously generated Sanger control region data. Due to the inherent risk for contamination when working with low-template, degraded DNA, a contamination assessment was performed. The consumables were shown to be void of human DNA contaminants and suitable for forensic use. Reagent blanks and negative controls were analyzed to determine the background signal of the procedure. This background signal was then used to set analytical and reporting thresholds, which were designated at 4.0X (limit of detection) and 10.0X (limit of quantitation) average coverage across the mitogenome, respectively.
Nearly all human samples exceeded the reporting threshold, although coverage was reduced in chemically treated samples resulting in a ∼58% passing rate for these poor-quality samples. A concordance assessment demonstrated the reliability of the NGS data when compared to known Sanger profiles. One case sample was shown to be mixed with a co-processed sample and two reagent blanks indicated the presence of DNA above the analytical threshold. This contamination was attributed to sequencing crosstalk from simultaneously sequenced high-quality samples to include the positive control. Overall this study demonstrated that hybridization capture and Illumina sequencing provide a viable method for mitogenome sequencing of degraded and chemically treated skeletal DNA samples, yet may require alternative measures of quality control. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
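How the two coverage thresholds partition samples can be sketched as follows. The 4.0X and 10.0X values come from the abstract; treating both boundaries as inclusive, the label strings, and the function name are assumptions made for illustration, and this is not the laboratory's pipeline.

```python
from statistics import mean

ANALYTICAL_THRESHOLD = 4.0   # average coverage, limit of detection (from the abstract)
REPORTING_THRESHOLD = 10.0   # average coverage, limit of quantitation (from the abstract)


def classify_sample(per_base_depth: list[int]) -> tuple[float, str]:
    """Average the per-position read depth and compare it with both thresholds.

    ASSUMPTION: a sample exactly at a threshold counts as meeting it.
    """
    avg = float(mean(per_base_depth))
    if avg >= REPORTING_THRESHOLD:
        return avg, "reportable"
    if avg >= ANALYTICAL_THRESHOLD:
        return avg, "detected, below reporting threshold"
    return avg, "below analytical threshold"
```

Under this reading, a reagent blank with average coverage between 4.0X and 10.0X would be flagged as showing DNA above the analytical threshold without reaching the reporting threshold.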
Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.
Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T
1993-01-01
A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829
DNA-encoded chemistry: enabling the deeper sampling of chemical space.
Goodnow, Robert A; Dumelin, Christoph E; Keefe, Anthony D
2017-02-01
DNA-encoded chemical library technologies are increasingly being adopted in drug discovery for hit and lead generation. DNA-encoded chemistry enables the exploration of chemical spaces four to five orders of magnitude more deeply than is achievable by traditional high-throughput screening methods. Operation of this technology requires developing a range of capabilities including aqueous synthetic chemistry, building block acquisition, oligonucleotide conjugation, large-scale molecular biological transformations, selection methodologies, PCR, sequencing, sequence data analysis and the analysis of large chemistry spaces. This Review provides an overview of the development and applications of DNA-encoded chemistry, highlighting the challenges and future directions for the use of this technology.
Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas
2013-07-01
The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5·8S-25S rRNA genes, which enable unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.
Peck, Michelle A; Sturk-Andreaggi, Kimberly; Thomas, Jacqueline T; Oliver, Robert S; Barritt-Ross, Suzanne; Marshall, Charla
2018-05-01
Generating mitochondrial genome (mitogenome) data from reference samples in a rapid and efficient manner is critical to harnessing the greater power of discrimination of the entire mitochondrial DNA (mtDNA) marker. The method of long-range target enrichment, Nextera XT library preparation, and Illumina sequencing on the MiSeq is a well-established technique for generating mitogenome data from high-quality samples. To this end, a validation was conducted for this mitogenome method processing up to 24 samples simultaneously along with analysis in the CLC Genomics Workbench and utilizing the AQME (AFDIL-QIAGEN mtDNA Expert) tool to generate forensic profiles. This validation followed the Federal Bureau of Investigation's Quality Assurance Standards (QAS) for forensic DNA testing laboratories and the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines. The evaluation of control DNA, non-probative samples, blank controls, mixtures, and nonhuman samples demonstrated the validity of this method. Specifically, the sensitivity was established at ≥25 pg of nuclear DNA input for accurate mitogenome profile generation. Unreproducible low-level variants were observed in samples with low amplicon yields. Further, variant quality was shown to be a useful metric for identifying sequencing error and crosstalk. Success of this method was demonstrated with a variety of reference sample substrates and extract types. These studies further demonstrate the advantages of using NGS techniques by highlighting the quantitative nature of heteroplasmy detection. The results presented herein from more than 175 samples processed in ten sequencing runs, show this mitogenome sequencing method and analysis strategy to be valid for the generation of reference data. Copyright © 2018 Elsevier B.V. All rights reserved.
Ancient DNA studies: new perspectives on old samples
2012-01-01
In spite of past controversies, the field of ancient DNA is now a reliable research area due to recent methodological improvements. A series of recent large-scale studies have revealed the true potential of ancient DNA samples to study the processes of evolution and to test models and assumptions commonly used to reconstruct patterns of evolution and to analyze population genetics and palaeoecological changes. Recent advances in DNA technologies, such as next-generation sequencing, make it possible to recover DNA information from archaeological and paleontological remains, allowing us to go back in time and study the genetic relationships between extinct organisms and their contemporary relatives. With next-generation sequencing methodologies, DNA sequences can be retrieved even from samples (for example human remains) for which the technical pitfalls of classical methodologies required stringent criteria to guarantee the reliability of the results. In this paper, we review the methodologies applied to ancient DNA analysis and the perspectives that next-generation sequencing applications provide in this field. PMID:22697611
[Structural organization of 5S ribosomal DNA of Rosa rugosa].
Tynkevych, Iu O; Volkov, R A
2014-01-01
In order to clarify molecular organization of the genomic region encoding 5S rRNA in diploid species Rosa rugosa several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found indicating comparatively recent divergence of these species.
A DNA sequence element that advances replication origin activation time in Saccharomyces cerevisiae.
Pohl, Thomas J; Kolor, Katherine; Fangman, Walton L; Brewer, Bonita J; Raghuraman, M K
2013-11-06
Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time.
Distinctive archaebacterial species associated with anaerobic rumen protozoan Entodinium caudatum.
Tóthová, T; Piknová, M; Kisidayová, S; Javorský, P; Pristas, P
2008-01-01
The diversity of archaebacteria associated with the anaerobic rumen protozoan Entodinium caudatum in long-term in vitro culture was investigated by denaturing gradient gel electrophoresis (DGGE) analysis of the hypervariable V3 region of the archaebacterial 16S rRNA gene. PCR was performed directly on DNA extracted from a single protozoal cell and on total community genomic DNA, and the resulting fingerprints were compared. The analysis indicated the presence of a single intense band in Entodinium caudatum single-cell DNA, which had no counterpart in the profile from total DNA. The identity of the archaebacterium represented by this band was determined by sequence analysis, which showed that the sequence fell into the cluster of ciliate symbiotic methanogens identified recently by a 16S gene library approach.
ERIC Educational Resources Information Center
Galewsky, Samuel
2000-01-01
Introduces a series of molecular genetics laboratories where students pick a single colony from a Drosophila melanogaster embryo cDNA library and purify the plasmid, then analyze the insert through restriction digests and gel electrophoresis. (Author/YDS)
Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András
2003-01-01
A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device, the transcript representatives were separated by size and fractions were automatically collected every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure were validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.
Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie
2015-06-17
High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. By correlating the mapping locations of cell-free fragments with ENCODE annotation of chromatin organization, we show that these fragments occur in clusters along the genome, localize to nucleosomal arrays and are preferentially cleaved at linker regions. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments.
This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
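The kind of per-position nucleotide preference described for the 10 positions on either side of a cleavage site can be tabulated with a short sketch. This is a generic position-frequency calculation, not the authors' pipeline; the function name, the convention that a cut at index `site` lies between `genome[site - 1]` and `genome[site]`, and the toy data are all assumptions.

```python
from collections import Counter


def cleavage_site_frequencies(genome: str, cut_sites: list[int], flank: int = 10):
    """Nucleotide frequencies at each offset in a window of `flank` bases
    on either side of every cut site (hypothetical helper).

    Returns one {base: frequency} dict per window position, or [] if no
    window fits inside the sequence.
    """
    counts = [Counter() for _ in range(2 * flank)]
    used = 0
    for site in cut_sites:
        start, end = site - flank, site + flank
        if start < 0 or end > len(genome):
            continue  # skip windows that run off either end of the sequence
        for i, base in enumerate(genome[start:end].upper()):
            counts[i][base] += 1
        used += 1
    if used == 0:
        return []
    return [{b: c[b] / used for b in "ACGT"} for c in counts]


# Toy example: both windows read "GTAC", so every position is fully biased.
freqs = cleavage_site_frequencies("ACGTACGTAC", cut_sites=[4, 8], flank=2)
```

On real data the frequencies at each offset would be compared with the background base composition to decide whether a preference is present.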
Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.
Hua, Wei; Wang, Jiasong; Zhao, Jian
2014-01-01
Based on the study of the Ramanujan sum and Ramanujan coefficients, this paper introduces the concepts of the discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence and derives the discrete Ramanujan spectrum of that numerical sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has the important feature of 3-base periodicity, which is widely used for DNA sequence analysis by means of the discrete Fourier transform. That analysis uses the signal-to-noise ratio at frequency N/3 as its criterion, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be identified only as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for the discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested, and the histograms and tables from the computational results illustrate the reliability of our method. In addition, a theoretical analysis leads to the conclusion that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that classifying different DNA sequences with the discrete Ramanujan spectrum is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
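The Fourier baseline mentioned in the abstract (Voss indicator sequences, power spectrum, signal-to-noise ratio at frequency N/3) can be sketched in a few lines. This illustrates only the conventional DFT criterion, not the discrete Ramanujan transform the paper proposes. Defining the noise level as the mean power over frequencies 1 to N-1 is an assumption, as are the function names.

```python
import cmath


def voss(seq: str) -> dict[str, list[int]]:
    """Voss representation: one 0/1 indicator sequence per nucleotide."""
    return {b: [1 if s == b else 0 for s in seq] for b in "ACGT"}


def power_at(indicator: list[int], k: int) -> float:
    """Squared magnitude of the k-th DFT coefficient (direct O(N) sum)."""
    n = len(indicator)
    total = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(indicator))
    return abs(total) ** 2


def snr_at_period3(seq: str) -> float:
    """Total power at frequency N/3 divided by the mean power over k = 1..N-1.

    ASSUMPTION: this mean-power normalization is one common convention;
    the paper's exact definition is not given in the abstract.
    """
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    tracks = voss(seq.upper())
    spectrum = [sum(power_at(tr, k) for tr in tracks.values())
                for k in range(1, n)]
    mean_power = sum(spectrum) / len(spectrum)
    if mean_power < 1e-12:
        return 0.0  # flat spectrum (e.g. a homopolymer): no periodicity signal
    return spectrum[n // 3 - 1] / mean_power


periodic = snr_at_period3("ATG" * 20)     # strong 3-base periodicity
aperiodic = snr_at_period3("ACGT" * 15)   # period 4, no peak at N/3
```

The direct sum costs O(N²) per sequence, which is adequate for a toy example; an FFT would be the usual choice for long sequences.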
Googling DNA sequences on the World Wide Web.
Hajibabaei, Mehrdad; Singer, Gregory A C
2009-11-10
New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm that divides a sequence library (DNA barcodes) and the query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
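The word-based idea described above can be sketched in a few lines: split every library sequence into words, index the words, and rank library records by the number of words they share with the query. This is only an illustration of the principle. The word length of 8, the non-overlapping split, and the shared-word score are assumptions, and a plain dictionary stands in for the text search engine used in the paper.

```python
from collections import Counter, defaultdict

def to_words(seq, size=8):
    """Split a sequence into consecutive non-overlapping words of length `size`."""
    seq = seq.upper()
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]

def build_index(library, size=8):
    """Inverted index: word -> set of library record names that contain it."""
    index = defaultdict(set)
    for name, seq in library.items():
        for word in to_words(seq, size):
            index[word].add(name)
    return index

def search(query, index, size=8):
    """Rank library records by how many distinct query words they share."""
    hits = Counter()
    for word in set(to_words(query, size)):
        for name in index.get(word, ()):
            hits[name] += 1
    return hits.most_common()

if __name__ == "__main__":
    # Hypothetical toy library; real barcodes are several hundred bases long.
    library = {"sp1": "ACGTACGTTTGGCCAA", "sp2": "GGGGCCCCAAAATTTT"}
    index = build_index(library)
    print(search("ACGTACGTTTGGCCAT", index))
```

Because the words are cut at fixed offsets, this sketch only works when query and library sequences start at the same position, which is a simplification.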
On site DNA barcoding by nanopore sequencing
Menegon, Michele; Cantaloni, Chiara; Rodriguez-Prieto, Ana; Centomo, Cesare; Abdelfattah, Ahmed; Rossato, Marzia; Bernardi, Massimo; Xumerle, Luciano; Loader, Simon; Delledonne, Massimo
2017-01-01
Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet’s biological heritage. The use of genetic markers i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities. PMID:28977016
Linear and Nonlinear Statistical Characterization of DNA
NASA Astrophysics Data System (ADS)
Norio Oiwa, Nestor; Goldman, Carla; Glazier, James
2002-03-01
We find spatial order in the distribution of protein-coding (including RNAs) and control segments of GenBank genomic sequences, irrespective of ATCG content. This is achieved by correlations, histograms, fractal dimensions and singularity spectra. Estimates of these quantities in complete nuclear genomes indicate that coding sequences are long-range correlated and that their disposition is self-similar (multifractal) for eukaryotes. These characteristics are absent in prokaryotes, where there are few noncoding sequences, suggesting that 'junk' DNA plays a relevant role in genome structure and function. Concerning the genetic message of ATCG sequences, we build a random walk (Levy flight), using DNA symmetry arguments, where we associate A, T, C and G with left, right, down and up steps, respectively. Nonlinear analysis of mitochondrial DNA walks reveals a multifractal pattern based on palindromic sequences, which fold in hairpins and loops.
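The walk construction stated in the abstract (A left, T right, C down, G up) can be written as a short sketch. The unit step length and the choice to skip ambiguous symbols such as N are assumptions made for illustration; the multifractal analysis itself is not shown.

```python
# Step directions taken from the abstract: A left, T right, C down, G up.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}

def dna_walk(seq):
    """Return the list of (x, y) positions visited, starting at the origin."""
    x = y = 0
    path = [(0, 0)]
    for base in seq.upper():
        if base not in STEPS:
            continue  # assumption: ambiguous symbols contribute no step
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

if __name__ == "__main__":
    print(dna_walk("ATCG"))
    print(dna_walk("AAGG")[-1])
```

With this mapping, complementary bases (A/T and C/G) cancel each other, so a sequence with balanced complementary content returns to the origin.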
Breathing dynamics based parameter sensitivity analysis of hetero-polymeric DNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Talukder, Srijeeta; Sen, Shrabani; Chaudhury, Pinaki, E-mail: pinakc@rediffmail.com
We study the parameter sensitivity of hetero-polymeric DNA within the purview of DNA breathing dynamics. The degree of correlation between the mean bubble size and the model parameters is estimated for this purpose for three different DNA sequences. The analysis leads us to a better understanding of the sequence dependent nature of the breathing dynamics of hetero-polymeric DNA. Out of the 14 model parameters for DNA stability in the statistical Poland-Scheraga approach, the hydrogen bond interaction ε_hb(AT) for an AT base pair and the ring factor ξ turn out to be the most sensitive parameters. In addition, the stacking interaction ε_st(TA-TA) for a TA-TA nearest neighbor pair of base-pairs is found to be the most sensitive one among all stacking interactions. Moreover, we also establish that the nature of stacking interaction has a deciding effect on the DNA breathing dynamics, not the number of times a particular stacking interaction appears in a sequence. We show that the sensitivity analysis can be used as an effective measure to guide a stochastic optimization technique to find the kinetic rate constants related to the dynamics as opposed to the case where the rate constants are measured using the conventional unbiased way of optimization.
Chen, Zhen-Yong; Guo, Xiao-Jiang; Chen, Zhong-Xu; Chen, Wei-Ying; Wang, Ji-Rui
2017-06-01
The binding sites of transcription factors (TFs) in upstream DNA regions are called transcription factor binding sites (TFBSs). TFBSs are important elements for regulating gene expression. To date, there have been few studies on the profiles of TFBSs in plants. In total, 4873 sequences with 5' upstream regions from 8530 wheat fl-cDNA sequences were used to predict TFBSs. We found 4572 TFBSs for the MADS TF family, which was twice as many as for bHLH (1951), B3 (1951), HB superfamily (1914), ERF (1820), and AP2/ERF (1725) TFs, and was approximately four times higher than the remaining TFBS types. The percentage of TFBSs and TF members showed a distinct distribution in different tissues. Overall, the distribution of TFBSs in the upstream regions of wheat fl-cDNA sequences differed significantly. Meanwhile, high frequencies of some types of TFBSs were found in specific regions in the upstream sequences. Both TFs and fl-cDNA with TFBSs predicted in the same tissues exhibited specific distribution preferences for regulating gene expression. The tissue-specific analysis of TFs and fl-cDNA with TFBSs provides useful information for functional research, and can be used to identify relationships between tissue-specific TFs and fl-cDNA with TFBSs. Moreover, the positional distribution of TFBSs indicates that some types of wheat TFBS have different positional distribution preferences in the upstream regions of genes.
Owa, Chie; Poulin, Matthew; Yan, Liying; Shioda, Toshi
2018-01-01
The existence of cytosine methylation in mammalian mitochondrial DNA (mtDNA) is a controversial subject. Because detection of DNA methylation depends on resistance of 5'-modified cytosines to bisulfite-catalyzed conversion to uracil, we examined parameters that affect technical adequacy of mtDNA methylation analysis. Negative control amplicons (NCAs) devoid of cytosine methylation were amplified to cover the entire human or mouse mtDNA by long-range PCR. When the pyrosequencing template amplicons were gel-purified after bisulfite conversion, bisulfite pyrosequencing of NCAs did not detect significant levels of bisulfite-resistant cytosines (brCs) at ND1 (7 CpG sites) or CYTB (8 CpG sites) genes (CI95 = 0%-0.94%); without gel-purification, significant false-positive brCs were detected from NCAs (CI95 = 4.2%-6.8%). Bisulfite pyrosequencing of highly purified, linearized mtDNA isolated from human iPS cells or mouse liver detected significant brCs (~30%) in human ND1 gene when the sequencing primer was not selective in bisulfite-converted and unconverted templates. However, repeated experiments using a sequencing primer selective in bisulfite-converted templates almost completely (< 0.8%) suppressed brC detection, supporting the false-positive nature of brCs detected using the non-selective primer. Bisulfite-seq deep sequencing of linearized, gel-purified human mtDNA detected 9.4%-14.8% brCs for 9 CpG sites in ND1 gene. However, because all these brCs were associated with adjacent non-CpG brCs showing the same degrees of bisulfite resistance, DNA methylation in this mtDNA-encoded gene was not confirmed. Without linearization, data generated by bisulfite pyrosequencing or deep sequencing of purified mtDNA templates did not pass the quality control criteria. Shotgun bisulfite sequencing of human mtDNA detected extremely low levels of CpG methylation (<0.65%) over non-CpG methylation (<0.55%). 
Taken together, our study demonstrates that adequacy of mtDNA methylation analysis using methods dependent on bisulfite conversion needs to be established for each experiment, taking effects of incomplete bisulfite conversion and template impurity or topology into consideration.
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.
Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru
2015-01-01
The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Regional differences in mitochondrial DNA methylation in human post-mortem brain tissue.
Devall, Matthew; Smith, Rebecca G; Jeffries, Aaron; Hannon, Eilis; Davies, Matthew N; Schalkwyk, Leonard; Mill, Jonathan; Weedon, Michael; Lunnon, Katie
2017-01-01
DNA methylation is an important epigenetic mechanism involved in gene regulation, with alterations in DNA methylation in the nuclear genome being linked to numerous complex diseases. Mitochondrial DNA methylation is a phenomenon that is receiving ever-increasing interest, particularly in diseases characterized by mitochondrial dysfunction; however, most studies have been limited to the investigation of specific target regions. Analyses spanning the entire mitochondrial genome have been limited, potentially due to the amount of input DNA required. Further, mitochondrial genetic studies have been previously confounded by nuclear-mitochondrial pseudogenes. Methylated DNA Immunoprecipitation Sequencing is a technique widely used to profile DNA methylation across the nuclear genome; however, reads mapped to mitochondrial DNA are often discarded. Here, we have developed an approach to control for nuclear-mitochondrial pseudogenes within Methylated DNA Immunoprecipitation Sequencing data. We highlight the utility of this approach in identifying differences in mitochondrial DNA methylation across regions of the human brain and pre-mortem blood. We were able to correlate mitochondrial DNA methylation patterns between the cortex, cerebellum and blood. We identified 74 nominally significant differentially methylated regions (p < 0.05) in the mitochondrial genome, between anatomically separate cortical regions and the cerebellum in matched samples (N = 3 matched donors). Further analysis identified eight significant differentially methylated regions between the total cortex and cerebellum after correcting for multiple testing. Using unsupervised hierarchical clustering analysis of the mitochondrial DNA methylome, we were able to identify tissue-specific patterns of mitochondrial DNA methylation between blood, cerebellum and cortex. 
Our study represents a comprehensive analysis of the mitochondrial methylome using pre-existing Methylated DNA Immunoprecipitation Sequencing data to identify brain region-specific patterns of mitochondrial DNA methylation.
Statistical and linguistic features of DNA sequences
NASA Technical Reports Server (NTRS)
Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
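The Zipf and Shannon analyses mentioned above both start from counts of short "words" (n-tuples) in the sequence. The sketch below counts overlapping n-tuples, ranks them by frequency, and computes the entropy in bits. Redundancy is taken here as 1 - H/H_max with H_max = 2n bits for a four-letter alphabet, which is a standard textbook definition assumed for illustration and may differ from the exact estimator used by the authors. Detrended Fluctuation Analysis is not shown.

```python
import math
from collections import Counter

def word_counts(seq, n):
    """Count overlapping n-tuples ("words") in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def zipf_ranking(seq, n):
    """Words sorted by decreasing frequency, as used in a Zipf rank-frequency plot."""
    return word_counts(seq, n).most_common()

def entropy_bits(seq, n):
    """Shannon entropy (bits) of the n-tuple frequency distribution."""
    counts = word_counts(seq, n)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

def redundancy(seq, n):
    """Assumed definition: 1 - H / H_max, with H_max = 2n bits for four bases."""
    return 1 - entropy_bits(seq, n) / (2 * n)

if __name__ == "__main__":
    print(zipf_ranking("AAACAAAG", 2))
    print(entropy_bits("ACGT", 1), redundancy("AAAA", 1))
```

A uniform distribution of words gives zero redundancy, and a sequence made of a single repeated word gives redundancy one.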
Packialakshmi, R M; Srivastava, N; Girish, K R; Usha, R
2010-08-01
Vernonia cinerea plants with yellow vein symptoms were collected around crop fields in Madurai. A portion (550 bp) of the AV1 gene amplified using degenerate primers from the total DNA purified from diseased leaf sample was cloned and sequenced. Specific primers derived from the above sequence were used to amplify 2,745 nucleotides with the typical genome organization of begomoviral DNA A (EMBL Accession No. AM182232). Sequence comparison with other begomoviruses revealed the greatest identity (82.4%) with Emilia yellow vein virus (EmYVV-[Fz1]) from China and less than 80% with all other known begomoviruses. The International Committee on Taxonomy of Viruses (ICTV) has therefore recognized Vernonia yellow vein virus (VeYVV) as a distinct begomovirus species. Conventional PCR could not amplify the DNA B or DNA beta from the diseased tissue. However, the beta DNA (1364 bp) associated with the disease was obtained (Accession No. FN435836) by the rolling circle amplification-restriction fragment length polymorphism method (RCA-RFLP) using Phi 29 DNA polymerase. Sequence analysis shows that DNA beta of VeYVV has the highest identity (56.8%) with DNA beta of Sigesbeckia yellow vein Guangxi betasatellite (SibYVGxB-[CN: Gx111:05]) and 56-53% with DNA beta associated with other begomoviruses. This is the first report of the molecular characterization of VeYVV from V. cinerea in India. The complete molecular characterization, phylogenetic analysis, and putative recombination events in VeYVV are reported.
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library
Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul
2005-01-01
The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm compared to the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations; these were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. 
Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
Kachhap, Sangita; Singh, Balvinder
2015-01-01
In most homeodomain-DNA complexes, glutamine or lysine is present at the 50th position and interacts with the 5th and 6th nucleotides of the core recognition region. Molecular dynamics simulations were carried out for the Msx-1-DNA complex (Q50-TG) and its variant complexes: the specific complex (Q50K-CC) and the nonspecific complexes with a mutation in the DNA (Q50-CC) or in the protein (Q50K-TG). Analysis of protein-DNA interactions and of the structure of DNA in specific and nonspecific complexes shows that amino acid residues use the sequence-dependent shape of DNA to interact. The binding free energies of all four complexes were analysed to define the role of the amino acid residue at the 50th position in terms of binding strength, considering the effect of variation in DNA on the stability of protein-DNA complexes. The order of stability of the protein-DNA complexes shows that specific complexes are more stable than nonspecific ones. Decomposition analysis shows that N-terminal amino acid residues contribute maximally to the binding free energy of protein-DNA complexes. Among specific protein-DNA complexes, K50 contributes more than Q50 to the binding free energy in the respective complexes. The sequence dependence of the local conformation of DNA enables Q50/Q50K to make hydrogen bonds with nucleotide(s) of DNA. The changes in the amino acid sequence of the protein are accommodated and stabilized around the TAAT core region of DNA having variation in nucleotides.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.
Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A
1982-01-01
Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. PMID:6281459
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate < 0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. 
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
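The 5' tag analysis described above amounts to reading the first bases of each sequence and looking them up in a tag table. The sketch below is a minimal illustration under stated assumptions: the tags, the primer, and the specimen names are hypothetical, all tags are assumed to have the same length, and only exact matches are accepted, so reads with sequencing errors in the tag or primer are left unassigned.

```python
def assign_reads(reads, tag_to_source, primer):
    """Assign each read to a source by its 5' tag.

    A read is accepted only if it begins with a known tag followed directly
    by the primer; the tag is stripped from accepted reads.
    """
    if not tag_to_source:
        raise ValueError("at least one tag is required")
    tag_len = len(next(iter(tag_to_source)))
    assigned = {source: [] for source in tag_to_source.values()}
    unassigned = []
    for read in reads:
        tag, rest = read[:tag_len], read[tag_len:]
        if tag in tag_to_source and rest.startswith(primer):
            assigned[tag_to_source[tag]].append(rest)
        else:
            unassigned.append(read)
    return assigned, unassigned

if __name__ == "__main__":
    # Hypothetical dinucleotide tags and primer, for illustration only.
    tags = {"AC": "specimen_1", "GT": "specimen_2"}
    reads = ["ACTTGACCC", "GTTTGAGGG", "CCTTGAAAA"]
    print(assign_reads(reads, tags, primer="TTGA"))
```

Counting the accepted reads per tag would be a direct way to look for the kind of tag-dependent representation bias reported in the abstract.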
Chen, Y. C.; Eisner, J. D.; Kattar, M. M.; Rassoulian-Barrett, S. L.; LaFe, K.; Yarfitz, S. L.; Limaye, A. P.; Cookson, B. T.
2000-01-01
Identification of medically relevant yeasts can be time-consuming and inaccurate with current methods. We evaluated PCR-based detection of sequence polymorphisms in the internal transcribed spacer 2 (ITS2) region of the rRNA genes as a means of fungal identification. Clinical isolates (401), reference strains (6), and type strains (27), representing 34 species of yeasts were examined. The length of PCR-amplified ITS2 region DNA was determined with single-base precision in less than 30 min by using automated capillary electrophoresis. Unique, species-specific PCR products ranging from 237 to 429 bp were obtained from 92% of the clinical isolates. The remaining 8%, divided into groups with ITS2 regions which differed by ≤2 bp in mean length, all contained species-specific DNA sequences easily distinguishable by restriction enzyme analysis. These data, and the specificity of length polymorphisms for identifying yeasts, were confirmed by DNA sequence analysis of the ITS2 region from 93 isolates. Phenotypic and ITS2-based identification was concordant for 427 of 434 yeast isolates examined using sequence identity of ≥99%. Seven clinical isolates contained ITS2 sequences that did not agree with their phenotypic identification, and ITS2-based phylogenetic analyses indicate the possibility of new or clinically unusual species in the Rhodotorula and Candida genera. This work establishes an initial database, validated with over 400 clinical isolates, of ITS2 length and sequence polymorphisms for 34 species of yeasts. We conclude that size and restriction analysis of PCR-amplified ITS2 region DNA is a rapid and reliable method to identify clinically significant yeasts, including potentially new or emerging pathogenic species. PMID:10834993
GATA: A graphic alignment tool for comparative sequence analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nix, David A.; Eisen, Michael B.
2005-01-01
Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.
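The limitation of dot plots noted in this abstract can be seen in a small sketch. A dot plot marks every pair of positions where two sequences share an identical window; an insertion shows up as a jump between diagonals, but nothing in the plot itself aligns the bases across the gap. The exact-match window and the text rendering are illustrative choices, and this is not the GATA method.

```python
def dot_plot(a, b, window=1):
    """Matrix with 1 at (i, j) where a[i:i+window] equals b[j:j+window]."""
    rows = len(a) - window + 1
    cols = len(b) - window + 1
    return [[1 if a[i:i + window] == b[j:j + window] else 0
             for j in range(cols)]
            for i in range(rows)]

def render(matrix):
    """Plain-text view: '*' for a match, '.' otherwise."""
    return "\n".join("".join("*" if cell else "." for cell in row)
                     for row in matrix)

if __name__ == "__main__":
    # The second sequence carries one inserted base, so the match
    # moves from the main diagonal to the neighbouring one.
    print(render(dot_plot("ACGT", "ACTGT", window=2)))
```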
Methods of DNA methylation analysis.
USDA-ARS's Scientific Manuscript database
The purpose of this review was to provide guidance for investigators who are new to the field of DNA methylation analysis. Epigenetics is the study of mitotically heritable alterations in gene expression potential that are not mediated by changes in DNA sequence. Recently, it has become clear that n...
Evaluation of microbial community in hydrothermal field by direct DNA sequencing
NASA Astrophysics Data System (ADS)
Kawarabayasi, Y.; Maruyama, A.
2002-12-01
Many extremophiles have been discovered in terrestrial and marine hydrothermal fields. Some thermophiles can grow beyond 90°C in culture, while direct microscopic analysis occasionally indicates that microbes may survive in much hotter hydrothermal fluids. However, it is very difficult to isolate and cultivate such microbes from these environments; over 99% of all microbes remain undiscovered. Drawing on experience in whole microbial genome analysis (Y.K.) and microbial community analysis (A.M.), we began searching for unique microbes and genes in hydrothermal fields through direct sequencing of environmental DNA fragments. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected by an in situ filtration system from low-temperature fluids at RM24 on the Southern East Pacific Rise (S-EPR). Gene amplification (PCR) was not used, in order to avoid introducing mutations during the process. The nucleotide sequences of 285 clones showed that none had an identical match in public databases. Among the 27 clones whose entire sequences were determined, no ORF was identified in 14 clones, which resembles introns in eukaryotes. In four clones, multiple tandem repeats of tetranucleotide units were identified. This type of sequence has been identified in some familiar human diseases. The result indicates that living or dead material with eukaryotic features may exist in this low-temperature field. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. Among the 143 randomly selected clones used for sequencing, no known sequence was identified. Unlike the clones in the S-EPR library, clear ORFs were identified in all nine clones whose entire sequences were determined. One clone, H4052, was found to contain the complete aspartyl-tRNA synthetase gene. 
Phylogenetic analysis using amino acid sequences of this gene indicated that this gene was separated from other Euryarchaea before the differentiation of species. Thus, some novel archaeal species are expected to be in this field. The present direct cloning and sequencing technique is now opening a window to the new world in hydrothermal microbial community analysis.
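The abstract does not say how ORFs were identified in the clones. As a rough illustration of what an ORF scan involves, the sketch below searches the three forward reading frames for an ATG followed in frame by a stop codon. The minimum length threshold, the restriction to the forward strand, and the use of ATG as the only start codon are all simplifying assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=30):
    """Return (start, end) index pairs of ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    codons from ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
    print(find_orfs(toy, min_codons=3))
```

A fuller scan would also cover the reverse complement and alternative start codons.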
Mapping the Space of Genomic Signatures
Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.
2015-01-01
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
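The Chaos Game Representation mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the corner assignment in `CORNERS`, the starting point (0.5, 0.5), and the function names `cgr_points` and `cgr_grid` are assumptions made for the example, and the DSSIM image distance itself is not implemented here. The sketch shows why a CGR image at resolution 2^k x 2^k amounts to a table of k-mer counts: after k steps, the cell a point falls into depends only on the last k bases.

```python
# Assumed corner layout of the unit square; other conventions are in use.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_points(seq):
    """Return the CGR trajectory: each point is halfway between the
    previous point and the corner of the current base."""
    x, y = 0.5, 0.5  # assumed starting point (centre of the square)
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # ambiguous symbols such as N are skipped in this sketch
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Count CGR points in a 2^k x 2^k grid. Points produced after at
    least k bases land in a cell determined by the last k bases, so the
    grid is a k-mer occurrence table laid out as an image."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq)[k - 1:]:
        grid[int(y * size)][int(x * size)] += 1
    return grid
```

Two such grids could then be compared with any image distance; the abstract uses DSSIM for that step.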
Tsui, Nancy B. Y.; Jiang, Peiyong; Chow, Katherine C. K.; Su, Xiaoxi; Leung, Tak Y.; Sun, Hao; Chan, K. C. Allen; Chiu, Rossa W. K.; Lo, Y. M. Dennis
2012-01-01
Background: Fetal DNA in maternal urine, if present, would be a valuable source of fetal genetic material for noninvasive prenatal diagnosis. However, the existence of fetal DNA in maternal urine has remained controversial. The issue is due to the lack of appropriate technology to robustly detect the potentially highly degraded fetal DNA in maternal urine. Methodology: We have used massively parallel paired-end sequencing to investigate cell-free DNA molecules in maternal urine. Catheterized urine samples were collected from seven pregnant women during the third trimester of pregnancy. We detected fetal DNA by identifying sequenced reads that contained fetal-specific alleles of the single nucleotide polymorphisms. The sizes of individual urinary DNA fragments were deduced from the alignment positions of the paired reads. We measured the fractional fetal DNA concentration as well as the size distributions of fetal and maternal DNA in maternal urine. Principal Findings: Cell-free fetal DNA was detected in five of the seven maternal urine samples, with fractional fetal DNA concentrations ranging from 1.92% to 4.73%. Fetal DNA became undetectable in maternal urine after delivery. The total urinary cell-free DNA molecules were less intact when compared with plasma DNA. Urinary fetal DNA fragments were very short, and the most dominant fetal sequences were between 29 bp and 45 bp in length. Conclusions: With the use of massively parallel sequencing, we have confirmed the existence of transrenal fetal DNA in maternal urine, and have shown that urinary fetal DNA was heavily degraded. PMID:23118982
Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database
2017-01-01
Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and initial value sensitiveness, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on the chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling image pixel location and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack. PMID:28392799
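The DNA coding step referred to above rests on a simple idea: an 8-bit gray value splits into four 2-bit groups, and each group maps to one base. The sketch below illustrates only that idea. The particular rule in `RULE` (00→A, 01→C, 10→G, 11→T) and the function names are assumptions chosen for the example; the abstract does not say which coding rule the scheme uses, and the chaotic scrambling and base substitution stages are not shown.

```python
# Assumed 2-bit-to-base rule; several alternative rules exist and the
# abstract does not specify the one used.
RULE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
INVERSE = {base: bits for bits, base in RULE.items()}


def encode_pixel(value):
    """Encode one 8-bit gray value as four bases, most significant bits first."""
    if not 0 <= value <= 255:
        raise ValueError("pixel value must be in 0..255")
    return "".join(RULE[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def decode_pixel(bases):
    """Invert encode_pixel: rebuild the 8-bit value from four bases."""
    value = 0
    for base in bases:
        value = (value << 2) | INVERSE[base]
    return value
```

For instance, the gray value 201 is 11001001 in binary, which under the assumed rule becomes the bases T, A, G, C.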
Draft versus finished sequence data for DNA and protein diagnostic signature development
Gardner, Shea N.; Lam, Marisa W.; Smith, Jason R.; Torres, Clinton L.; Slezak, Tom R.
2005-01-01
Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10⁻³ to 10⁻⁵ (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures. PMID:16243783
Zhao, Ya-E; Wu, Li-Ping
2012-09-01
To confirm phylogenetic relationships in Demodex mites based on mitochondrial 16S rDNA partial sequences, mtDNA 16S partial sequences of ten isolates of three Demodex species from China were amplified, recombined, and sequenced and then analyzed with two Demodex folliculorum isolates from Spain. Lastly, genetic distance was computed, and phylogenetic tree was reconstructed. MEGA 4.0 analysis showed high sequence identity among 16S rDNA partial sequences of three Demodex species, which were 95.85 % in D. folliculorum, 98.53 % in Demodex canis, and 99.71 % in Demodex brevis. The divergence, genetic distance, and transition/transversions of the three Demodex species reached interspecies level, whereas there was no significant difference of the divergence (1.1 %), genetic distance (0.011), and transition/transversions (3/1) of the two geographic D. folliculorum isolates (Spain and China). Phylogenetic trees reveal that the three Demodex species formed three separate branches of one clade, where D. folliculorum and D. canis gathered first, and then gathered with D. brevis. The two Spain and five China D. folliculorum isolates did not form sister clades. In conclusion, 16S mtDNA are suitable for phylogenetic relationship analysis in low taxa (genus or species), but not for intraspecies determination of Demodex. The differentiation among the three Demodex species has reached interspecies level.
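The divergence, genetic distance, and transition/transversion figures reported above are all derived from pairwise comparison of aligned sequences. As a rough sketch of that kind of comparison, the function below computes an uncorrected p-distance together with transition and transversion counts. The choice of the uncorrected p-distance is an assumption: the abstract says only that MEGA 4.0 was used and does not name the substitution model. The function name `compare_aligned` is likewise hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq1, seq2):
    """Return (p_distance, transitions, transversions) for two aligned
    sequences of equal length. Columns containing a gap or an ambiguous
    symbol in either sequence are ignored."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any other mismatch is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    differences = transitions + transversions
    p_distance = differences / compared if compared else 0.0
    return p_distance, transitions, transversions
```

Under this definition a p-distance of 0.011 corresponds to roughly 1.1 % of compared sites differing, consistent with how the abstract pairs its divergence and genetic distance values.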
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nussbaum, R.L.; Lesko, J.G.; Lewis, R.A.
1987-09-01
Choroideremia, an X-chromosome linked retinal dystrophy of unknown pathogenesis, causes progressive nightblindness and eventual central blindness in affected males by the third to fourth decade of life. Choroideremia has been mapped to Xq13-21 by tight linkage to restriction fragment length polymorphism loci. The authors have recently identified two families in which choroideremia is inherited with mental retardation and deafness. In family XL-62, an interstitial deletion Xq21 is visible by cytogenetic analysis and two linked anonymous DNA markers, DXYS1 and DXS72, are deleted. In the second family, XL-45, an interstitial deletion was suspected on phenotypic grounds but could not be confirmed by high-resolution cytogenetic analysis. They used phenol-enhanced reassociation of 48,XXXX DNA in competition with excess XL-45 DNA to generate a library of cloned DNA enriched for sequences that might be deleted in XL-45. Two of the first 83 sequences characterized from the library were found to be deleted in probands from family XL-45 as well as from family XL-62. Isolation of these sequences proves that XL-45 does contain a submicroscopic deletion and provides a starting point for identifying overlapping genomic sequences that span the XL-45 deletion. Each overlapping sequence will be studied to identify exons from the choroideremia locus.
Development of a Green Fluorescent Protein-Based Laboratory Curriculum
ERIC Educational Resources Information Center
Larkin, Patrick D.; Hartberg, Yasha
2005-01-01
A laboratory curriculum has been designed for an undergraduate biochemistry course that focuses on the investigation of the green fluorescent protein (GFP). The sequence of procedures extends from analysis of the DNA sequence through PCR amplification, recombinant plasmid DNA synthesis, bacterial transformation, expression, isolation, and…
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and adds extra bases belonging to the plasmids to the sequence, which causes problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job, so CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, freeware available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. 
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
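Functions (i) and (iii) listed above can be illustrated with a short sketch. It is not CpG PatternFinder's code: the function names, the return format, and the requirement that the read is already aligned gap-free to the reference are all assumptions made for the example. It relies on the standard bisulfite logic that an unmethylated C reads as T after conversion and PCR while a methylated C stays C, so CpG cytosines report methylation status and non-CpG cytosines report conversion efficiency.

```python
def cpg_positions(ref):
    """Return 0-based positions of the C in every CpG of the reference."""
    ref = ref.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def call_methylation(ref, read):
    """Compare a bisulfite-converted read with its unconverted reference.

    Assumes (for this sketch) that ref and read are the same length and
    aligned without gaps. Returns (status, efficiency): status maps each
    CpG position to 'methylated', 'unmethylated' or 'unknown'; efficiency
    is the fraction of non-CpG cytosines read as T, or None if there are
    no informative non-CpG cytosines."""
    ref, read = ref.upper(), read.upper()
    if len(ref) != len(read):
        raise ValueError("read must be aligned to the reference")
    cpg = set(cpg_positions(ref))
    status = {}
    converted = informative = 0
    for i, base in enumerate(ref):
        if base != "C":
            continue
        if i in cpg:
            if read[i] == "C":
                status[i] = "methylated"
            elif read[i] == "T":
                status[i] = "unmethylated"
            else:
                status[i] = "unknown"
        elif read[i] in ("C", "T"):
            informative += 1
            converted += read[i] == "T"
    efficiency = converted / informative if informative else None
    return status, efficiency
```

In a real workflow the tailoring step (ii) would come first, since cloned reads carry plasmid bases and indels that break the gap-free assumption made here.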
Rinke, Jenny; Schäfer, Vivien; Schmidt, Mathias; Ziermann, Janine; Kohlmann, Alexander; Hochhaus, Andreas; Ernst, Thomas
2013-08-01
We sought to establish a convenient, sensitive next-generation sequencing (NGS) method for genotyping the 26 most commonly mutated leukemia-associated genes in a single work flow and to optimize this method for low amounts of input template DNA. We designed 184 PCR amplicons that cover all of the candidate genes. NGS was performed with genomic DNA (gDNA) from a cohort of 10 individuals with chronic myelomonocytic leukemia. The results were compared with NGS data obtained from sequencing of DNA generated by whole-genome amplification (WGA) of 20 ng template gDNA. Differences between gDNA and WGA samples in variant frequencies were determined for 2 different WGA kits. For gDNA samples, 25 of 26 genes were successfully sequenced with a sensitivity of 5%, which was achieved by a median coverage of 492 reads (range, 308-636 reads) per amplicon. We identified 24 distinct mutations in 11 genes. With WGA samples, we reliably detected all mutations above 5% sensitivity with a median coverage of 506 reads (range, 256-653 reads) per amplicon. With all variants included in the analysis, WGA amplification by the 2 kits tested yielded differences in variant frequencies that ranged from -28.19% to +9.94% [mean (SD) difference, -0.2% (4.08%)] and from -35.03% to +18.67% [mean difference, -0.75% (5.12%)]. Our method permits simultaneous analysis of a wide range of leukemia-associated target genes in a single sequencing run. NGS can be performed after WGA of template DNA for reliable detection of variants without introducing appreciable bias.
Satellite DNA Sequences in Canidae and Their Chromosome Distribution in Dog and Red Fox.
Vozdova, Miluse; Kubickova, Svatava; Cernohorska, Halina; Fröhlich, Jan; Rubes, Jiri
2016-01-01
Satellite DNA is a characteristic component of mammalian centromeric heterochromatin, and a comparative analysis of its evolutionary dynamics can be used for phylogenetic studies. We analysed satellite and satellite-like DNA sequences available in NCBI for 4 species of the family Canidae (red fox, Vulpes vulpes, VVU; domestic dog, Canis familiaris, CFA; arctic fox, Vulpes lagopus, VLA; raccoon dog, Nyctereutes procyonoides procyonoides, NPR) by comparative sequence analysis, which revealed 86-90% intraspecies and 76-79% interspecies similarity. Comparative fluorescence in situ hybridisation in the red fox and dog showed signals of the red fox satellite probe in canine and vulpine autosomal centromeres, on VVUY, B chromosomes, and in the distal parts of VVU9q and VVU10p which were shown to contain nucleolus organiser regions. The CFA satellite probe stained autosomal centromeres only in the dog. The CFA satellite-like DNA did not show any significant sequence similarity with the satellite DNA of any species analysed and was localised to the centromeres of 9 canine chromosome pairs. No significant heterochromatin block was detected on the B chromosomes of the red fox. Our results show extensive heterogeneity of satellite sequences among Canidae and prove close evolutionary relationships between the red and arctic fox. © 2017 S. Karger AG, Basel.
Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich
2014-01-01
Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.
Clarification of the Concept of Ganoderma orbiforme with High Morphological Plasticity
Wang, Dong-Mei; Wu, Sheng-Hua; Yao, Yi-Jian
2014-01-01
Ganoderma has been considered a very difficult genus among the polypores to classify and is currently in a state of taxonomic chaos. In a study of Ganoderma collections including numerous type specimens, we found that six species, namely G. cupreum, G. densizonatum, G. limushanense, G. mastoporum, G. orbiforme, G. subtornatum, and records of G. fornicatum from Mainland China and Taiwan are very similar to one another in basidiocarp texture, pilear cuticle structure, context color, pore color and basidiospore characteristics. Further, we sequenced the nrDNA ITS region (ITS1 and ITS2) and partial mtDNA SSU region of the studied materials, and performed phylogenetic analyses based on these sequence data. The nrDNA ITS sequence analysis results show that the eight nrDNA ITS sequences derived from this study have single-nucleotide polymorphisms in ITS1 and/or ITS2 at inter- and intra-individual levels. In the nrDNA ITS phylogenetic trees, all the sequences from this study are grouped together with those of G. cupreum and G. mastoporum retrieved from GenBank to form a distinct clade. The mtDNA SSU sequence analysis results reveal that the five mtDNA SSU sequences derived from this study are clustered together with those of G. cupreum retrieved from GenBank and also form a distinct clade in the mtDNA SSU phylogenetic trees. Based on morphological and molecular data, we conclude that the studied taxa are conspecific. Among the names assigned to this species, G. fornicatum given to Asian collections has nomenclatural priority over the others. However, the type of G. fornicatum from Brazil is probably lost and a modern description based on the type is lacking. The identification of the Asian collections as G. fornicatum therefore cannot be confirmed. To the best of our knowledge, G. orbiforme is the earliest valid name for use. PMID:24875218
De Bruyn, Alexandre; Harimalala, Mireille; Hoareau, Murielle; Ranomenjanahary, Sahondramalala; Reynaud, Bernard; Lefeuvre, Pierre; Lett, Jean-Michel
2015-06-01
Here, we describe for the first time the complete genome sequence of a new bipartite begomovirus in Madagascar isolated from the weed Asystasia gangetica (Acanthaceae), for which we propose the tentative name asystasia mosaic Madagascar virus (AMMGV). DNA-A and -B nucleotide sequences of AMMGV were only distantly related to known begomovirus sequences and shared highest nucleotide sequence identity of 72.9 % (DNA-A) and 66.9 % (DNA-B) with a recently described bipartite begomovirus infecting Asystasia sp. in West Africa. Phylogenetic analysis demonstrated that this novel virus from Madagascar belongs to a new lineage of Old World bipartite begomoviruses.
Bacillus pumilus SAFR-032 isolate
NASA Technical Reports Server (NTRS)
Venkateswaran, Kasthuri J. (Inventor)
2007-01-01
The present invention relates to discovery and isolation of a biologically pure culture of a Bacillus pumilus SAFR-032 isolate with UV sterilization resistant properties. This novel strain has been characterized on the basis of phenotypic traits, 16S rDNA sequence analysis and DNA-DNA hybridization. According to the results of these analyses, this strain belongs to the genus Bacillus. The GenBank accession number for the 16S rDNA sequence of the Bacillus pumilus SAFR-032 isolate is AY167879.
Liu, Huitao; Cui, Peng; Zhan, Kehui; Lin, Qiang; Zhuo, Guoyin; Guo, Xiaoli; Ding, Feng; Yang, Wenlong; Liu, Dongcheng; Hu, Songnian; Yu, Jun; Zhang, Aimin
2011-03-29
Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line. The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18 S, 26 S, and 5 S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences of higher plants had mostly divergent multiple reorganizations during the mtDNA evolution of higher plants. 
The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non-coding sequences. Sequence rearrangement has produced novel chimeric ORFs, which may be candidate genes for CMS. Comparative analysis of several angiosperm mtDNAs indicated that non-coding sequences are the most frequently reorganized during mtDNA evolution in higher plants.
Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E
2010-06-01
A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.
Molecular Targeting of Prostate Cancer During Androgen Ablation: Inhibition of CHES1/FOXN3
2013-05-01
the DNA sequences (~25^6 reads/sample) were mapped to the human genome reference sequence (hg19...tumor the AR has a genomic abnormality, placing the novel sequence 3’ of the transcriptional start site. However, it is unclear if a genomic alteration...exon/intron organization of the CHES1 gene was determined by BLAST analysis of the human genome using the 1,473-bp CHES1 cDNA sequence
Eichmann, Cordula; Parson, Walther
2008-09-01
The traditional protocol for forensic mitochondrial DNA (mtDNA) analyses involves the amplification and sequencing of the two hypervariable segments HVS-I and HVS-II of the mtDNA control region. The primers usually span fragment sizes of 300-400 bp each region, which may result in weak or failed amplification in highly degraded samples. Here we introduce an improved and more stable approach using shortened amplicons in the fragment range between 144 and 237 bp. Ten such amplicons were required to produce overlapping fragments that cover the entire human mtDNA control region. These were co-amplified in two multiplex polymerase chain reactions and sequenced with the individual amplification primers. The primers were carefully selected to minimize binding on homoplasic and haplogroup-specific sites that would otherwise result in loss of amplification due to mis-priming. The multiplexes have successfully been applied to ancient and forensic samples such as bones and teeth that showed a high degree of degradation.
To Clone or Not To Clone: Method Analysis for Retrieving Consensus Sequences In Ancient DNA Samples
Winters, Misa; Barta, Jodi Lynn; Monroe, Cara; Kemp, Brian M.
2011-01-01
The challenges associated with the retrieval and authentication of ancient DNA (aDNA) evidence are principally due to post-mortem damage which makes ancient samples particularly prone to contamination from “modern” DNA sources. The necessity for authentication of results has led many aDNA researchers to adopt methods considered to be “gold standards” in the field, including cloning aDNA amplicons as opposed to directly sequencing them. However, no standardized protocol has emerged regarding the necessary number of clones to sequence, how a consensus sequence is most appropriately derived, or how results should be reported in the literature. In addition, there has been no systematic demonstration of the degree to which direct sequences are affected by damage or whether direct sequencing would provide disparate results from a consensus of clones. To address this issue, a comparative study was designed to examine both cloned and direct sequences amplified from ∼3,500 year-old ancient northern fur seal DNA extracts. Majority rules and the Consensus Confidence Program were used to generate consensus sequences for each individual from the cloned sequences, which exhibited damage at 31 of 139 base pairs across all clones. In no instance did the consensus of clones differ from the direct sequence. This study demonstrates that, when appropriate, cloning need not be the default method, but instead, should be used as a measure of authentication on a case-by-case basis, especially when this practice adds time and cost to studies where it may be superfluous. PMID:21738625
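The majority-rule step described above can be sketched as follows. This is only an illustration of the general idea, not the study's procedure or the Consensus Confidence Program: the strict-majority threshold (more than half of the clones), the use of `N` for unresolved columns, and the function name `majority_consensus` are assumptions made for the example.

```python
from collections import Counter


def majority_consensus(clones, ambiguous="N"):
    """Build a consensus from aligned clone sequences of equal length.

    At each column the base carried by more than half of the clones is
    taken; if no base reaches a strict majority the column is reported
    as `ambiguous`."""
    if not clones:
        raise ValueError("at least one clone sequence is required")
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("clone sequences must be aligned to the same length")
    consensus = []
    for column in zip(*clones):
        top_base, top_count = Counter(b.upper() for b in column).most_common(1)[0]
        consensus.append(top_base if 2 * top_count > len(column) else ambiguous)
    return "".join(consensus)
```

Comparing the returned string with the directly sequenced amplicon is then a simple equality check, which mirrors the comparison the abstract reports between clone consensus and direct sequence.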
Screening and Characterization of RAPD Markers in Viscerotropic Leishmania Parasites
Mkada–Driss, Imen; Talbi, Chiraz; Guerbouj, Souheila; Driss, Mehdi; Elamine, Elwaleed M.; Cupolillo, Elisa; Mukhtar, Moawia M.; Guizani, Ikram
2014-01-01
Visceral leishmaniasis (VL) is mainly due to the Leishmania donovani complex. VL is endemic in many countries worldwide including East Africa and the Mediterranean region where the epidemiology is complex. Taxonomy of these pathogens is under controversy but there is a correlation between their genetic diversity and geographical origin. With steady increase in genome knowledge, RAPD is still a useful approach to identify and characterize novel DNA markers. Our aim was to identify and characterize polymorphic DNA markers in VL Leishmania parasites in diverse geographic regions using RAPD in order to constitute a pool of PCR targets having the potential to differentiate among the VL parasites. 100 different oligonucleotide decamers having arbitrary DNA sequences were screened for reproducible amplification and a selection of 28 was used to amplify DNA from 12 L. donovani, L. archibaldi and L. infantum strains having diverse origins. A total of 155 bands were amplified of which 60.65% appeared polymorphic. 7 out of 28 primers provided monomorphic patterns. Phenetic analysis allowed clustering the parasites according to their geographical origin. Differentially amplified bands were selected, among them 22 RAPD products were successfully cloned and sequenced. Bioinformatic analysis allowed mapping of the markers and sequences and priming sites analysis. This study was complemented with Southern-blot to confirm assignment of markers to the kDNA. The bioinformatic analysis identified 16 nuclear and 3 minicircle markers. Analysis of these markers highlighted polymorphisms at RAPD priming sites with mainly 5′ end transversions, and presence of inter– and intra– taxonomic complex sequence and microsatellites variations; a bias in transitions over transversions and indels between the different sequences compared is observed, which is however less marked between L. infantum and L. donovani. 
The study delivers a pool of well-documented polymorphic DNA markers, to develop molecular diagnostics assays to characterize and differentiate VL causing agents. PMID:25313833
Li, Zibo; Guo, Xinwu; Tang, Lili; Peng, Limin; Chen, Ming; Luo, Xipeng; Wang, Shouman; Xiao, Zhi; Deng, Zhongping; Dai, Lizhong; Xia, Kun; Wang, Jun
2016-10-01
Circulating cell-free DNA (cfDNA) has been considered as a potential biomarker for non-invasive cancer detection. To evaluate the methylation levels of six candidate genes (EGFR, GREM1, PDGFRB, PPM1E, SOX17, and WRN) in plasma cfDNA as biomarkers for breast cancer early detection, quantitative analysis of the promoter methylation of these genes from 86 breast cancer patients and 67 healthy controls was performed by using microfluidic-PCR-based target enrichment and next-generation bisulfite sequencing technology. The predictive performance of different logistic models based on methylation status of candidate genes was investigated by means of the area under the ROC curve (AUC) and odds ratio (OR) analysis. Results revealed that EGFR, PPM1E, and 8 gene-specific CpG sites showed significant hypermethylation in cancer patients' plasma and were significantly associated with breast cancer (OR ranging from 2.51 to 9.88). The AUC values for these biomarkers ranged from 0.66 to 0.75. Combinations of multiple hypermethylated genes or CpG sites substantially improved the predictive performance for breast cancer detection. Our study demonstrated the feasibility of quantitative measurement of candidate gene methylation in cfDNA by using microfluidic-PCR-based target enrichment and bisulfite next-generation sequencing, which is worthy of further validation and potentially benefits a broad range of applications in clinical oncology practice. Quantitative analysis of methylation pattern of plasma cfDNA by next-generation sequencing might be a valuable non-invasive tool for early detection of breast cancer.
Uncommonly isolated clinical Pseudomonas: identification and phylogenetic assignation.
Mulet, M; Gomila, M; Ramírez, A; Cardew, S; Moore, E R B; Lalucat, J; García-Valdés, E
2017-02-01
Fifty-two Pseudomonas strains that were difficult to identify at the species level in the phenotypic routine characterizations employed by clinical microbiology laboratories were selected for genotypic-based analysis. Species level identifications were done initially by partial sequencing of the DNA dependent RNA polymerase sub-unit D gene (rpoD). Two other gene sequences, for the small sub-unit ribosomal RNA (16S rRNA) and for DNA gyrase sub-unit B (gyrB), were added in a multilocus sequence analysis (MLSA) study to confirm the species identifications. These sequences were analyzed with a collection of reference sequences from the type strains of 161 Pseudomonas species within an in-house multi-locus sequence analysis database. Whole-cell matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analyses of these strains complemented the DNA sequence-based phylogenetic analyses and were observed to be in accordance with the results of the sequence data. Twenty-three out of 52 strains were assigned to 12 recognized species not commonly detected in clinical specimens and 29 (56 %) were considered representatives of at least ten putative new species. Most strains were distributed within the P. fluorescens and P. aeruginosa lineages. The value of rpoD sequences in species-level identifications for Pseudomonas is emphasized. The correct species identification of clinical strains is essential for establishing the intrinsic antibiotic resistance patterns and improved treatment plans.
Redberg, G.L.; Hibbett, D.S.; Ammirati, J.F.; Rodriguez, R.J.
2003-01-01
The genetic diversity and phylogeny of Bridgeoporus nobilissimus have been analyzed. DNA was extracted from spores collected from individual fruiting bodies representing six geographically distinct populations in Oregon and Washington. Spore samples collected contained low levels of bacteria, yeast and a filamentous fungal species. Using taxon-specific PCR primers, it was possible to discriminate among rDNA from bacteria, yeast, a filamentous associate and B. nobilissimus. Nuclear rDNA internal transcribed spacer (ITS) region sequences of B. nobilissimus were compared among individuals representing six populations and were found to have less than 2% variation. These sequences also were used to design dual and nested PCR primers for B. nobilissimus-specific amplification. Mitochondrial small-subunit rDNA sequences were used in a phylogenetic analysis that placed B. nobilissimus in the hymenochaetoid clade, where it was associated with Oxyporus and Schizopora.
Pfeiffer, H; Hühne, J; Ortmann, C; Waterkamp, K; Brinkmann, B
1999-01-01
The analysis of mitochondrial DNA (mtDNA) from shed hairs has gained high importance in forensic casework since telogen hairs are one of the most common types of evidence left at the crime scene. In this systematic study of hair shafts from 20 individuals, the correlation of mtDNA recovery with hair morphology (length, diameter, volume, colour), with sex, and with body localisation (head, armpit, pubis) was investigated. The highest average success rate of hypervariable region 1 (HV 1) sequencing was found in head hair shafts (75%) followed by pubic (66%) and axillary hair shafts (52%). No statistically significant correlation between morphological parameters or sex and the success rate of sequencing was found. MtDNA sequences of buccal cells, head, pubic and axillary hair shafts did not show intraindividual differences. Heteroplasmic base positions were observed neither in the hair shafts nor in control samples of buccal cells.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tuskan, Gerald A; Gunter, Lee E; DiFazio, Stephen P
The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.
Kovács, Endre R; Benko, Mária
2009-03-01
Partial genome characterisation of a novel adenovirus, found recently in organ samples of multiple species of dead birds of prey, was carried out by sequence analysis of PCR-amplified DNA fragments. The virus, named raptor adenovirus 1 (RAdV-1), was originally detected by a nested PCR method with consensus primers targeting the adenoviral DNA polymerase gene. Phylogenetic analysis with the deduced amino acid sequence of the small PCR product implied that a new siadenovirus type was present in the samples. Since virus isolation attempts remained unsuccessful, further characterisation of this putative novel siadenovirus was carried out with the use of PCR on the infected organ samples. The DNA sequence of the central genome part of RAdV-1, encompassing nine full (pTP, 52K, pIIIa, III, pVII, pX, pVI, hexon, protease) and two partial (DNA polymerase and DBP) genes and exceeding 12 kb pairs in size, was determined. Phylogenetic tree reconstructions, based on several genes, unambiguously confirmed the preliminary classification of RAdV-1 as a new species within the genus Siadenovirus. Further study of RAdV-1 is of interest since it represents a rare adenovirus genus of yet undetermined host origin.
Giehr, Pascal; Walter, Jörn
2018-01-01
The accurate and quantitative detection of 5-methylcytosine is of great importance in the field of epigenetics. The method of choice is usually bisulfite sequencing because of its high resolution and the possibility of combining it with next generation sequencing. Nevertheless, this method also has its limitations. Following the bisulfite treatment, DNA strands are no longer complementary, such that in a subsequent PCR amplification the DNA methylation pattern information of only one of the two DNA strands is preserved. Several years ago, Hairpin Bisulfite sequencing was developed as a method to obtain the pattern information on complementary DNA strands. The method requires fragmentation (usually by enzymatic cleavage) of genomic DNA followed by covalent linking of both DNA strands through ligation of a short DNA hairpin oligonucleotide to both strands. The ligated, covalently linked dsDNA products are then subjected to a conventional bisulfite treatment during which all unmodified cytosines are converted to uracils. During the treatment the DNA is denatured, forming noncomplementary ssDNA circles. These circles serve as a template for a locus-specific PCR to amplify chromosomal patterns of the region of interest. As a result one ends up with a linearized product, which contains the methylation information of both complementary DNA strands.
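The loss of strand complementarity described in this abstract can be reproduced with a small in-silico conversion. The Python sketch below is illustrative only: the sequence, the methylated positions and the function names are invented for the example, and the uracil produced by bisulfite treatment is written as T, as it would be read after PCR.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(strand, methylated=frozenset()):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C becomes U (read as T); C at the 0-based positions
    listed in `methylated` is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(strand)
    )


if __name__ == "__main__":
    top = "ACGTCCGA"                      # toy sequence with two CpG sites
    bottom = reverse_complement(top)      # TCGGACGT
    # Assume the first CpG is methylated on both strands (symmetric CpG).
    top_converted = bisulfite_convert(top, methylated={1})
    bottom_converted = bisulfite_convert(bottom, methylated={5})
    print(top_converted, bottom_converted)
    # The converted strands no longer pair with each other:
    print(reverse_complement(top_converted) == bottom_converted)
```

Because the two converted strands differ at every unmethylated cytosine, a conventional PCR follows only one of them; linking the strands with a hairpin before conversion keeps both patterns in a single amplicon.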
Onozawa, Masahiro; Zhang, Zhenhua; Kim, Yoo Jung; Goldberg, Liat; Varga, Tamas; Bergsagel, P Leif; Kuehl, W Michael; Aplan, Peter D
2014-05-27
We used the I-SceI endonuclease to produce DNA double-strand breaks (DSBs) and observed that a fraction of these DSBs were repaired by insertion of sequences, which we termed "templated sequence insertions" (TSIs), derived from distant regions of the genome. These TSIs were derived from genic, retrotransposon, or telomere sequences and were not deleted from the donor site in the genome, leading to the hypothesis that they were derived from reverse-transcribed RNA. Cotransfection of RNA and an I-SceI expression vector demonstrated insertion of RNA-derived sequences at the DNA-DSB site, and TSIs were suppressed by reverse-transcriptase inhibitors. Both observations support the hypothesis that TSIs were derived from RNA templates. In addition, similar insertions were detected at sites of DNA DSBs induced by transcription activator-like effector nuclease proteins. Whole-genome sequencing of myeloma cell lines revealed additional TSIs, demonstrating that repair of DNA DSBs via insertion was not restricted to experimentally produced DNA DSBs. Analysis of publicly available databases revealed that many of these TSIs are polymorphic in the human genome. Taken together, these results indicate that insertional events should be considered as alternatives to gross chromosomal rearrangements in the interpretation of whole-genome sequence data and that this mutagenic form of DNA repair may play a role in genetic disease, exon shuffling, and mammalian evolution.
Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan
2007-11-01
To support further application and study, one strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901), isolated from yoghourt and previously identified by phenotypic characterization, was identified by 16S rDNA sequencing and analyzed phylogenetically. The 16S rDNA of wch9901 was amplified using wch9901 genomic DNA as the template and conserved sequences of the 16S rDNA as primers. The amplified 16S rDNA was ligated into the cloning vector pGEM-T with T4 DNA ligase to construct the recombinant plasmid pGEM-wch9901 16S rDNA. The recombinant plasmid was verified by restriction enzyme digestion, and the verified plasmid was submitted to a sequencing company for DNA sequencing. The nucleotide sequence was searched against GenBank with BLAST, and a phylogenetic tree was constructed with the neighbor-joining distance method in MEGA 3.1. The blastn results showed that the homology between the 16S rDNA of wch9901 and the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains was higher than 96%. On the phylogenetic tree, wch9901 formed a separate branch located between the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch and another branch composed of the Lactobacillus delbrueckii subsp. bulgaricus DL2 cluster and the Lactobacillus delbrueckii subsp. bulgaricus JSQ cluster. The wch9901 branch was closest to the LGM2 branch. These results indicate that wch9901 belongs to Lactobacillus delbrueckii subsp. bulgaricus and is most closely related to strain LGM2.
Automated one-step DNA sequencing based on nanoliter reaction volumes and capillary electrophoresis.
Pang, H M; Yeung, E S
2000-08-01
An integrated system with a nano-reactor for cycle-sequencing reaction coupled to on-line purification and capillary gel electrophoresis has been demonstrated. Fifty nanoliters of reagent solution, which includes dye-labeled terminators, polymerase, BSA and template, was aspirated and mixed with the template inside the nano-reactor followed by cycle-sequencing reaction. The reaction products were then purified by a size-exclusion chromatographic column operated at 50 degrees C followed by room temperature on-line injection of the DNA fragments into a capillary for gel electrophoresis. Over 450 bases of DNA can be separated and identified. As little as 25 nl of reagent solution can be used for the cycle-sequencing reaction with a slightly shorter read length. Significant savings in reagent cost are achieved because the remaining stock solution can be reused without contamination. The steps of cycle sequencing, on-line purification, injection, DNA separation, capillary regeneration, gel-filling and fluidic manipulation were performed with complete automation. This system can be readily multiplexed for high-throughput DNA sequencing or PCR analysis directly from templates or even biological materials.
Evaluation of massively parallel sequencing for forensic DNA methylation profiling.
Richards, Rebecca; Patel, Jayshree; Stevenson, Kate; Harbison, SallyAnn
2018-05-11
Epigenetics is an emerging area of interest in forensic science. DNA methylation, a type of epigenetic modification, can be applied to chronological age estimation, identical twin differentiation and body fluid identification. However, there is not yet an agreed, established methodology for targeted detection and analysis of DNA methylation markers in forensic research. Recently a massively parallel sequencing-based approach has been suggested. The use of massively parallel sequencing is well established in clinical epigenetics and is emerging as a new technology in the forensic field. This review investigates the potential benefits, limitations and considerations of this technique for the analysis of DNA methylation in a forensic context. The importance of a robust protocol, regardless of the methodology used, that minimises potential sources of bias is highlighted. This article is protected by copyright. All rights reserved.
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server
Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J
2006-01-01
Background DNA sequencing is used ubiquitously: from deciphering genomes[1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences[6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project[6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest.
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
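The core excision step described for Ebbie, finding any substring flanked by two constant strings, can be sketched in a few lines of Python. This is not Ebbie's code: the function name, the flanking sequences and the example read are hypothetical, and the storage and BlastN comparison steps are left out.

```python
import re


def extract_inserts(read, left_flank, right_flank):
    """Return every insert found between the two constant flanking strings.

    The match is non-greedy, so a read containing several flanked inserts
    (concatemers) yields each insert separately.
    """
    pattern = re.compile(
        re.escape(left_flank) + r"([ACGTN]+?)" + re.escape(right_flank)
    )
    return pattern.findall(read.upper())


if __name__ == "__main__":
    # Hypothetical flanks and sequencing read, invented for this example.
    read = "nnnnGGATCCACGTACGTAAGAATTCnnnn"
    print(extract_inserts(read, "GGATCC", "GAATTC"))
```

Doing the excision programmatically rather than by hand is what removes the manual mistakes mentioned in the abstract; the extracted inserts could then be written to a database or a FASTA file for comparison.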
A DNA Sequence Element That Advances Replication Origin Activation Time in Saccharomyces cerevisiae
Pohl, Thomas J.; Kolor, Katherine; Fangman, Walton L.; Brewer, Bonita J.; Raghuraman, M. K.
2013-01-01
Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time. PMID:24022751
Bhatia, S; Singh Negi, M; Lakshmikumaran, M
1996-11-01
EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size, and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other, as the repeat families in both showed high sequence homology to each other. Second, the repeat elements in both species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea.
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
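The dot-matrix analysis mentioned in this abstract compares a sequence with itself and reveals tandem repeats as diagonals parallel to the main diagonal, offset by the repeat length. The Python sketch below is a generic illustration with hypothetical function names and a toy 7-bp repeat unit; the actual B. nigra IGS sequence is not reproduced here.

```python
from collections import Counter


def dot_matrix(seq_a, seq_b, word=4):
    """Return the set of (i, j) where seq_a[i:i+word] equals seq_b[j:j+word]."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            dots.add((i, j))
    return dots


def dominant_repeat_period(seq, word=4):
    """Most populated off-diagonal of the self-comparison, or None.

    In a tandem array the offset j - i of the densest diagonal above the
    main diagonal equals the repeat unit length.
    """
    offsets = Counter(j - i for i, j in dot_matrix(seq, seq, word) if j > i)
    return offsets.most_common(1)[0][0] if offsets else None


if __name__ == "__main__":
    toy_array = "ACGGTCA" * 4  # four tandem copies of a made-up 7-bp unit
    print(dominant_repeat_period(toy_array))
```

Applied to a real spacer, the same self-comparison would be expected to show one family of diagonals per repeat family, with unique flanking sequence contributing only the main diagonal.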
Madhaiyan, Munusamy; Poonguzhali, Selvaraj; Kwon, Soon-Wo; Sa, Tong-Min
2009-01-01
A pink-pigmented, aerobic, facultatively methylotrophic bacterial strain, CBMB27T, isolated from leaf tissues of rice (Oryza sativa L. 'Dong-Jin'), was analysed using a polyphasic taxonomic approach. Comparative 16S rRNA gene sequence-based phylogenetic analysis placed the strain in a clade with the species Methylobacterium oryzae, Methylobacterium fujisawaense and Methylobacterium mesophilicum; strain CBMB27T showed sequence similarities of 98.3, 98.5 and 97.3 %, respectively, to the type strains of these three species. DNA-DNA hybridization experiments revealed low levels (<38 %) of DNA-DNA relatedness between strain CBMB27T and its closest relatives. The sequence of the 1-aminocyclopropane-1-carboxylate deaminase gene (acdS) in strain CBMB27T differed from those of close relatives. The major fatty acid of the isolate was C18:1 ω7c and the G+C content of the genomic DNA was 66.8 mol%. Based on the results of 16S rRNA gene sequence analysis, DNA-DNA hybridization, and physiological and biochemical characterization, which enabled the isolate to be differentiated from all recognized species of the genus Methylobacterium, it was concluded that strain CBMB27T represents a novel species in the genus Methylobacterium for which the name Methylobacterium phyllosphaerae sp. nov. is proposed (type strain CBMB27T =LMG 24361T =KACC 11716T =DSM 19779T).
SeqCompress: an algorithm for biological sequence compression.
Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan
2014-10-01
The growth of Next Generation Sequencing technologies presents significant research challenges, specifically the design of bioinformatics tools that handle massive amounts of data efficiently. The cost of storing biological sequence data has become a noticeable proportion of the total cost of data generation and analysis. In particular, the increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, so that sequence data may eventually exceed the available storage. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm, SeqCompress, that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses a statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that the proposed algorithm achieves better compression gain than other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
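How a statistical model translates into compressed size can be shown with a short calculation. The Python sketch below is not the SeqCompress algorithm: it assumes the simplest possible model (static, order-0 base frequencies) and computes the ideal code length that an arithmetic coder driven by that model would approach, alongside the fixed 2-bits-per-base and 8-bits-per-character baselines. The function name and the example sequence are hypothetical.

```python
from collections import Counter
from math import log2


def order0_code_length_bits(seq):
    """Ideal total code length, in bits, under a static order-0 model.

    Each symbol with empirical probability p costs -log2(p) bits; an
    arithmetic coder using these probabilities gets within a few bits of
    this total. The cost of transmitting the model itself is ignored.
    """
    counts = Counter(seq)
    total = len(seq)
    return -sum(count * log2(count / total) for count in counts.values())


if __name__ == "__main__":
    seq = "AAAACCGT"  # toy sequence with a skewed base composition
    print("order-0 model :", order0_code_length_bits(seq), "bits")
    print("2 bits/base   :", 2 * len(seq), "bits")
    print("ASCII text    :", 8 * len(seq), "bits")
```

The gain over plain 2-bit packing comes entirely from skew in the symbol probabilities, which is why practical DNA compressors rely on richer, context-dependent models than the order-0 one assumed here.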
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shevtsov, M. B.; Streeter, S. D.; Thresh, S.-J.
2015-02-01
The structure of the new class of controller proteins (exemplified by C.Csp231I) in complex with its 21 bp DNA-recognition sequence is presented, and the molecular basis of sequence recognition in this class of proteins is discussed. An unusual extended spacer between the dimer binding sites suggests a novel interaction between the two C-protein dimers. In a wide variety of bacterial restriction–modification systems, a regulatory ‘controller’ protein (or C-protein) is required for effective transcription of its own gene and for transcription of the endonuclease gene found on the same operon. We have recently turned our attention to a new class of controller proteins (exemplified by C.Csp231I) that have quite novel features, including a much larger DNA-binding site with an 18 bp (∼60 Å) spacer between the two palindromic DNA-binding sequences and a very different recognition sequence from the canonical GACT/AGTC. Using X-ray crystallography, the structure of the protein in complex with its 21 bp DNA-recognition sequence was solved to 1.8 Å resolution, and the molecular basis of sequence recognition in this class of proteins was elucidated. An unusual aspect of the promoter sequence is the extended spacer between the dimer binding sites, suggesting a novel interaction between the two C-protein dimers when bound to both recognition sites correctly spaced on the DNA. A U-bend model is proposed for this tetrameric complex, based on the results of gel-mobility assays, hydrodynamic analysis and the observation of key contacts at the interface between dimers in the crystal.
Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A
2016-03-01
The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Analysis of mutational spectra by denaturant capillary electrophoresis
Ekstrøm, Per O.; Khrapko, Konstantin; Li-Sucholeiki, Xiao-Cheng; Hunter, Ian W.; Thilly, William G.
2009-01-01
The numbers and kinds of point mutants within DNA from cells, tissues and human populations may be discovered for nearly any 75–250 bp DNA sequence. High fidelity DNA amplification incorporating a thermally stable DNA “clamp” is followed by separation by denaturing capillary electrophoresis (DCE). DCE allows for peak collection and verification sequencing. DCE in a cycling-temperature mode (CyDCE), e.g., ±5°C, permits high resolution of mutant sequences using computer defined analytes without preliminary optimization experiments. DNA sequencers have been modified to permit higher-throughput CyDCE, and a massively parallel, ~25,000-capillary system has been designed for pangenomic scans in large human populations. DCE has been used to define quantitative point mutational spectra to study a wide variety of genetic phenomena: errors of DNA polymerases, mutations induced in human cells by chemicals and irradiation, testing of human gene-common disease associations and the discovery of origins of point mutations in human development and carcinogenesis. PMID:18600220
Liew, Pauline Woanying; Jong, Bor Chyan
2008-05-01
Two culture-independent methods, namely ribosomal DNA libraries and denaturing gradient gel electrophoresis (DGGE), were adopted to examine the microbial community of a Malaysian light crude oil. In this study, both 16S and 18S rDNAs were PCR-amplified from bulk DNA of crude oil samples, cloned, and sequenced. Restriction fragment length polymorphism (RFLP) and phylogenetic analyses clustered the 16S and 18S rDNA sequences into seven and six groups, respectively. The ribosomal DNA sequences obtained showed sequence similarity of between 90 and 100% to those available in the GenBank database. The closest relatives documented for the 16S rDNAs include member species of Thermoincola and Rhodopseudomonas, whereas the closest fungal relatives include Acremonium, Ceriporiopsis, Xeromyces, Lecythophora, and Candida. Others were affiliated with uncultured bacteria and an uncultured ascomycete. The 16S rDNA library was dominated by a single uncultured bacterial type at >80% relative abundance. This dominance was confirmed by DGGE analysis.
Morise, Hisashi; Miyazaki, Erika; Yoshimitsu, Shoko; Eki, Toshihiko
2012-01-01
Soil nematodes play crucial roles in the soil food web and are a suitable indicator for assessing soil environments and ecosystems. Previous nematode community analyses based on nematode morphology classification have been shown to be useful for assessing various soil environments. Here we have conducted DNA barcode analysis for soil nematode community analyses in Japanese soils. We isolated nematodes from two different environmental soils of an unmanaged flowerbed and an agricultural field using the improved flotation-sieving method. Small subunit (SSU) rDNA fragments were directly amplified from each of 68 (flowerbed samples) and 48 (field samples) isolated nematodes to determine the nucleotide sequence. Sixteen and thirteen operational taxonomic units (OTUs) were obtained by multiple sequence alignment from the flowerbed and agricultural field nematodes, respectively. All 29 SSU rDNA-derived OTUs (rOTUs) were further mapped onto a phylogenetic tree with 107 known nematode species. Interestingly, the two nematode communities examined were clearly distinct from each other in terms of trophic groups: animal predators and plant feeders were markedly abundant in the flowerbed soils; in contrast, bacterial feeders were predominant in the agricultural field soils. The data from the flowerbed nematodes suggest a possible food web among two different trophic nematode groups and plants (weeds) in the closed soil environment. Finally, DNA sequences derived from the mitochondrial cytochrome c oxidase subunit 1 (COI) gene were determined as a DNA barcode from 43 agricultural field soil nematodes. These nematodes were assigned to 13 rDNA-derived OTUs, but in the COI gene analysis were assigned to 23 COI gene-derived OTUs (cOTUs), indicating that COI gene-based barcoding may provide higher taxonomic resolution than conventional SSU rDNA-barcoding in soil nematode community analysis. PMID:23284767
DnaSAM: Software to perform neutrality testing for large datasets with complex null models.
Eckert, Andrew J; Liechty, John D; Tearse, Brandon R; Pande, Barnaly; Neale, David B
2010-05-01
Patterns of DNA sequence polymorphisms can be used to understand the processes of demography and adaptation within natural populations. High-throughput generation of DNA sequence data has historically been the bottleneck with respect to data processing and experimental inference. Advances in marker technologies have largely solved this problem. Currently, the limiting step is computational, with most molecular population genetic software allowing a gene-by-gene analysis through a graphical user interface. An easy-to-use analysis program that allows both high-throughput processing of multiple sequence alignments and the flexibility to simulate data under complex demographic scenarios is currently lacking. We introduce a new program, named DnaSAM, which allows high-throughput estimation of DNA sequence diversity and neutrality statistics from experimental data along with the ability to test those statistics via Monte Carlo coalescent simulations. These simulations are conducted using the ms program, which is able to incorporate several genetic parameters (e.g. recombination) and demographic scenarios (e.g. population bottlenecks). The output is a set of diversity and neutrality statistics with associated probability values under a user-specified null model that are stored in an easy-to-manipulate text file. © 2009 Blackwell Publishing Ltd.
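The kind of diversity and neutrality statistics this abstract refers to can be illustrated on a small alignment. The Python sketch below is not DnaSAM's code: it computes three standard quantities (the number of segregating sites S, nucleotide diversity as the mean number of pairwise differences, and Watterson's estimator) plus Tajima's D, assuming a gap-free alignment of equal-length sequences. The function name and the toy alignment are hypothetical, and the coalescent simulation step used to obtain probability values is not shown.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(alignment):
    """Return S, pi, Watterson's theta and Tajima's D for aligned sequences."""
    n = len(alignment)
    # Segregating sites: alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(alignment, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1
    tajima_d = None  # undefined when there is no variation
    if seg_sites > 0:
        # Variance constants from Tajima's 1989 formulation.
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
        tajima_d = (pi - theta_w) / sqrt(variance)
    return {"S": seg_sites, "pi": pi, "theta_w": theta_w, "tajima_d": tajima_d}


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCA", "ACCA"]  # made-up haplotypes
    print(diversity_stats(toy_alignment))
```

In a workflow of the type described, an observed value such as Tajima's D would then be compared with its distribution across many simulated data sets generated under the chosen null model.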
Phylogeographic Analysis of Mitochondrial DNA in Northern Asian Populations
Derenko, Miroslava ; Malyarchuk, Boris ; Grzybowski, Tomasz ; Denisova, Galina ; Dambueva, Irina ; Perkova, Maria ; Dorzhu, Choduraa ; Luzina, Faina ; Lee, Hong Kyu ; Vanecek, Tomas ; Villems, Richard ; Zakharov, Ilia
2007-01-01
To elucidate the human colonization process of northern Asia and human dispersals to the Americas, a diverse subset of 71 mitochondrial DNA (mtDNA) lineages was chosen for complete genome sequencing from the collection of 1,432 control-region sequences sampled from 18 autochthonous populations of northern, central, eastern, and southwestern Asia. On the basis of complete mtDNA sequencing, we have revised the classification of haplogroups A, D2, G1, M7, and I; identified six new subhaplogroups (I4, N1e, G1c, M7d, M7e, and J1b2a); and fully characterized haplogroups N1a and G1b, which were previously described only by the first hypervariable segment (HVS1) sequencing and coding-region restriction-fragment–length polymorphism analysis. Our findings indicate that the southern Siberian mtDNA pool harbors several lineages associated with the Late Upper Paleolithic and/or early Neolithic dispersals from both eastern Asia and southwestern Asia/southern Caucasus. Moreover, the phylogeography of the D2 lineages suggests that southern Siberia is likely to be a geographical source for the last postglacial maximum spread of this subhaplogroup to northern Siberia and that the expansion of the D2b branch occurred in Beringia ∼7,000 years ago. In general, a detailed analysis of mtDNA gene pools of northern Asians provides the additional evidence to rule out the existence of a northern Asian route for the initial human colonization of Asia. PMID:17924343
Phylogeographic analysis of mitochondrial DNA in northern Asian populations.
Derenko, Miroslava; Malyarchuk, Boris; Grzybowski, Tomasz; Denisova, Galina; Dambueva, Irina; Perkova, Maria; Dorzhu, Choduraa; Luzina, Faina; Lee, Hong Kyu; Vanecek, Tomas; Villems, Richard; Zakharov, Ilia
2007-11-01
To elucidate the human colonization process of northern Asia and human dispersals to the Americas, a diverse subset of 71 mitochondrial DNA (mtDNA) lineages was chosen for complete genome sequencing from the collection of 1,432 control-region sequences sampled from 18 autochthonous populations of northern, central, eastern, and southwestern Asia. On the basis of complete mtDNA sequencing, we have revised the classification of haplogroups A, D2, G1, M7, and I; identified six new subhaplogroups (I4, N1e, G1c, M7d, M7e, and J1b2a); and fully characterized haplogroups N1a and G1b, which were previously described only by the first hypervariable segment (HVS1) sequencing and coding-region restriction-fragment-length polymorphism analysis. Our findings indicate that the southern Siberian mtDNA pool harbors several lineages associated with the Late Upper Paleolithic and/or early Neolithic dispersals from both eastern Asia and southwestern Asia/southern Caucasus. Moreover, the phylogeography of the D2 lineages suggests that southern Siberia is likely to be a geographical source for the last postglacial maximum spread of this subhaplogroup to northern Siberia and that the expansion of the D2b branch occurred in Beringia ~7,000 years ago. In general, a detailed analysis of mtDNA gene pools of northern Asians provides the additional evidence to rule out the existence of a northern Asian route for the initial human colonization of Asia.
Supervised DNA Barcodes species classification: analysis, comparisons and results
2014-01-01
Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results A software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. 
On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for handling with success the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
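The FASTA-to-Weka conversion mentioned in the Results can be illustrated with a minimal sketch. This is not the released converter: it assumes pre-aligned sequences of equal length, a hypothetical header layout of `specimen_id|species` with no spaces in species names, and it writes any non-ACGT symbol as `?`, the ARFF missing-value marker. The function names `read_fasta` and `fasta_to_arff` are illustrative only.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fasta_to_arff(text, relation="barcodes"):
    """Build ARFF text: one nominal attribute per site plus a class attribute.

    Assumption: headers look like 'specimen_id|species' (hypothetical layout).
    """
    records = read_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")

    species = sorted({header.split("|")[1] for header, _ in records})
    lines = [f"@RELATION {relation}", ""]
    for i in range(1, length + 1):
        lines.append(f"@ATTRIBUTE pos{i} {{A,C,G,T}}")
    lines.append("@ATTRIBUTE species {" + ",".join(species) + "}")
    lines += ["", "@DATA"]
    for header, seq in records:
        # Gaps and ambiguity codes become '?', i.e. missing values in ARFF.
        bases = [b if b in "ACGT" else "?" for b in seq]
        lines.append(",".join(bases + [header.split("|")[1]]))
    return "\n".join(lines)


example = ">s1|Homo_sapiens\nACGT\n>s2|Pan_troglodytes\nACNT\n"
print(fasta_to_arff(example))
```

Encoding each alignment column as a nominal attribute is one way to let rule-based classifiers report species-specific positions and nucleotide assignments, as the abstract describes for RIPPER and C4.5.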
A novel chaotic image encryption scheme using DNA sequence operations
NASA Astrophysics Data System (ADS)
Wang, Xing-Yuan; Zhang, Ying-Qian; Bao, Xue-Mei
2015-10-01
In this paper, we propose a novel image encryption scheme based on DNA (Deoxyribonucleic acid) sequence operations and chaotic system. Firstly, we perform bitwise exclusive OR operation on the pixels of the plain image using the pseudorandom sequences produced by the spatiotemporal chaos system, i.e., CML (coupled map lattice). Secondly, a DNA matrix is obtained by encoding the confused image using a kind of DNA encoding rule. Then we generate the new initial conditions of the CML according to this DNA matrix and the previous initial conditions, which can make the encryption result closely depend on every pixel of the plain image. Thirdly, the rows and columns of the DNA matrix are permuted. Then, the permuted DNA matrix is confused once again. At last, after decoding the confused DNA matrix using a kind of DNA decoding rule, we obtain the ciphered image. Experimental results and theoretical analysis show that the scheme is able to resist various attacks, so it has extraordinarily high security.
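The first two steps of the scheme (XOR with a chaotic pseudorandom sequence, then DNA encoding) can be sketched as follows. This is a simplified illustration, not the authors' algorithm: a single logistic map stands in for the coupled map lattice, the rule 00→A, 01→C, 10→G, 11→T is one arbitrary choice of DNA encoding rule, and the permutation and plaintext-dependent re-keying steps are omitted. All names and parameter values (`r`, `burn_in`, `x0`) are assumptions, and the sketch offers no real security.

```python
BASES = "ACGT"  # assumed rule: 00->A, 01->C, 10->G, 11->T


def logistic_keystream(x0, n, r=3.99, burn_in=100):
    """Generate n pseudorandom bytes from the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = r * x * (1 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)
    return stream


def byte_to_dna(value):
    """Encode one byte as four bases, most significant bit pair first."""
    return "".join(BASES[(value >> shift) & 3] for shift in (6, 4, 2, 0))


def dna_to_byte(quad):
    """Decode four bases back into one byte."""
    value = 0
    for base in quad:
        value = (value << 2) | BASES.index(base)
    return value


def encrypt(pixels, x0):
    """XOR each pixel with the keystream, then DNA-encode the result."""
    stream = logistic_keystream(x0, len(pixels))
    return [byte_to_dna(p ^ k) for p, k in zip(pixels, stream)]


def decrypt(dna_quads, x0):
    """Decode the DNA quads and undo the XOR with the same keystream."""
    stream = logistic_keystream(x0, len(dna_quads))
    return [dna_to_byte(q) ^ k for q, k in zip(dna_quads, stream)]


pixels = [0, 17, 128, 255]
cipher = encrypt(pixels, x0=0.3141)
assert decrypt(cipher, x0=0.3141) == pixels
```

Because XOR is its own inverse and the keystream depends only on the initial condition, decryption regenerates the same sequence from `x0`.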
[cDNA library construction from panicle meristem of finger millet].
Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B
2014-01-01
The protocol for producing full-length cDNA with the SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested, and a high-quality cDNA library was created from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn). The titer of the cDNA library averaged 3.01 × 10^5 CFU/ml. The average cDNA insert length was about 1070 base pairs, and the insertion efficiency of cDNA fragments was 99.5%. Selected cDNA clones from the library were sequenced, and the sequences were identified by BLAST search. The library analysis and selective sequencing indicate good functionality and the full-length character of the inserted cDNA clones. The cDNA library from meristematic tissue of finger millet panicle is a valuable source for isolating and identifying key genes that regulate metabolism and meristematic development, and for mining new molecular markers for high-quality genetic investigations and molecular breeding.
Mutation detection using automated fluorescence-based sequencing.
Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju
2008-04-01
The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Beyond sequencing: optical mapping of DNA in the age of nanotechnology and nanoscopy.
Levy-Sakin, Michal; Ebenstein, Yuval
2013-08-01
Next generation sequencing (NGS) is revolutionizing all fields of biological research but it fails to extract the full range of information associated with genetic material. Optical mapping of DNA grants access to genetic and epigenetic information on individual DNA molecules up to ∼1 Mbp in length. Fluorescent labeling of specific sequence motifs, epigenetic marks and other genomic information on individual DNA molecules generates a high content optical barcode along the DNA. By stretching the DNA to a linear configuration this barcode may be directly visualized by fluorescence microscopy. We discuss the advances of these methods in light of recent developments in nano-fabrication and super-resolution optical imaging (nanoscopy) and review the latest achievements of optical mapping in the context of genomic analysis. Copyright © 2013 Elsevier Ltd. All rights reserved.
Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.
2013-01-01
SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
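A search for simple microsatellites with user-specified parameters can be sketched with a backreference regular expression. This is a minimal illustration and not the SSR_pipeline search algorithm: it finds only perfect, uninterrupted repeats, does not handle compound microsatellites, and the function name `find_ssrs` and its default parameters are assumptions chosen to mirror the 2 to 25 bp motif range mentioned above.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=25, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. CACA = CA x 2),
            # since the shorter motif length already reports that locus.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits


print(find_ssrs("GGCACACACACATT"))  # → [(2, 'CA', 5)]
```

Start positions are 0-based. Filtering non-primitive motifs keeps one locus from being reported several times under different motif lengths.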
NASA Astrophysics Data System (ADS)
Hamid, Nur Athirah Abd; Ismail, Ismanizan
2013-11-01
Polygonum minus, known locally as Kesum, is an aromatic herb with a high secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible interconversion of alcohols and aldehydes with NAD(P)(H) as a cofactor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was performed on genomic DNA to determine the exon and intron organization. Two ADH sequences, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp that encodes 266 amino acid residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the two sequences are highly similar in the ORF region but divergent in the 3' untranslated region (UTR). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue, whereas PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short-chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short-chain alcohol dehydrogenase family.
Pasi, Marco; Maddocks, John H.; Lavery, Richard
2015-01-01
Microsecond molecular dynamics simulations of B-DNA oligomers carried out in an aqueous environment with a physiological salt concentration enable us to perform a detailed analysis of how potassium ions interact with the double helix. The oligomers studied contain all 136 distinct tetranucleotides and we are thus able to make a comprehensive analysis of base sequence effects. Using a recently developed curvilinear helicoidal coordinate method we are able to analyze the details of ion populations and densities within the major and minor grooves and in the space surrounding DNA. The results show higher ion populations than have typically been observed in earlier studies and sequence effects that go beyond the nature of individual base pairs or base pair steps. We also show that, in some special cases, ion distributions converge very slowly and, on a microsecond timescale, do not reflect the symmetry of the corresponding base sequence. PMID:25662221
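The figure of 136 distinct tetranucleotides follows from treating a sequence and its reverse complement as the same double-stranded step: of the 4^4 = 256 tetranucleotides, 16 are self-complementary and the remaining 240 pair up, giving 16 + 120 = 136. The short enumeration below checks this count; the function names are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def distinct_kmers(k):
    """List k-mers that are unique up to reverse complementation."""
    seen = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        # Keep the lexicographically smaller of a k-mer and its reverse complement.
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)


tetranucleotides = distinct_kmers(4)
palindromes = [t for t in tetranucleotides if t == reverse_complement(t)]
print(len(tetranucleotides), len(palindromes))  # → 136 16
```

The same function gives 10 distinct base pair steps for k = 2, the count used when sequence effects are analyzed at the dinucleotide level.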
Li, Shuang; Shang, Xinxin; Liu, Jia; Wang, Yujie; Guo, Yingshu; You, Jinmao
2017-07-01
We present a universal amplified colorimetric method for detecting nucleic acid targets or aptamer-specific ligand targets based on a gold nanoparticle-DNA (GNP-DNA) hybridization chain reaction (HCR). The universal arrays consisted of a capture probe and hairpin DNA-GNP. First, the capture probe specifically recognized the target and released the initiator sequence. Then, dispersed hairpin DNA-modified GNPs were cross-linked to form aggregates through HCR events triggered by the initiator sequence. As the aggregates accumulate, a significant red-to-purple color change can be easily seen with the naked eye. We used an miRNA target sequence (miRNA-203) and an aptamer-specific ligand (ATP) as target molecules for this proof-of-concept experiment. The initiator sequence (DNA2) was released from the capture probe (MNP/DNA1/2 conjugates) under strong competition from miRNA-203. The hairpin DNAs (H1 and H2) can hybridize with the help of initiator DNA2 to form GNP-H1/GNP-H2 aggregates. The absorption ratio (A620/A520) of the solutions was a sensitive function of miRNA-203 concentration over the range 1.0 × 10^-11 M to 9.0 × 10^-10 M, and concentrations as low as 1.0 × 10^-11 M could be detected. At the same time, the color of the solution changed from light wine red to purple and then to light blue. For ATP, the initiator sequence (5'-end of DNA3) was released from the capture probe (DNA3) under the strong binding of the aptamer to ATP. The colorimetric method for specific detection of ATP exhibited good sensitivity, and 1.0 × 10^-8 M ATP could be detected. The proposed strategy also performed well in qualitative and quantitative analysis of intracellular nucleic acids and aptamer-specific ligands. Copyright © 2017 Elsevier Inc. All rights reserved.
Wang, Jing; McCord, Bruce
2011-06-01
A common problem in the analysis of forensic DNA evidence is the presence of environmentally degraded and inhibited DNA. Such samples produce a variety of interpretational problems such as allele imbalance, allele dropout and sequence-specific inhibition. In an attempt to develop methods to enhance the recovery of this type of evidence, magnetic bead hybridization has been applied to extract and preconcentrate DNA sequences containing short tandem repeat (STR) alleles of interest. In this work, genomic DNA was fragmented by heating, and sequences associated with STR alleles were selectively hybridized to allele-specific biotinylated probes. Each biotinylated probe-DNA complex was bound to streptavidin-coated magnetic beads, enabling enrichment of target DNA sequences. Experiments conducted using degraded DNA samples, as well as samples containing a large concentration of inhibitory substances, showed good specificity and recovery of missing alleles. Based on the favorable results obtained with these specific probes, this method should prove useful as a tool to improve the recovery of alleles from degraded and inhibited DNA samples. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Gu, Xuan; Zhang, Xiao-qin; Song, Xiao-na; Zang, Yi-mei; Li Yan-peng; Ma, Chang-hua; Zhao, Bai-xiao; Liu, Chun-sheng
2014-12-01
The fruit of Lycium ruthenicum is a common folk medicine in China and is now popular for its antioxidative effect and other medicinal functions. Adulterants of the herb confuse consumers. To identify a new adulterant of L. ruthenicum, a study was performed based on ITS sequences in the NCBI Nucleotide database, combined with analysis of the origin and morphology of the adulterant, to trace its variety. Total genomic DNA was isolated from the materials, and nuclear DNA ITS sequences were amplified and sequenced; DNA fragments were collated and matched using ContigExpress, and similarity was assessed by BLAST analysis. In addition, the distribution of the plant origin and its morphology were considered for further identification and verification. Family and genus were identified by the molecular identification method: the adulterant was identified as a plant belonging to Berberis. Origin analysis narrowed the range of sample identification to seven Berberis species as potential sources of the sample. The adulterant's variety was then traced by morphological analysis. The combined molecular identification-origin-morphology approach proved to be a time-saving and economical way to trace medicinal herbs, and the results showed that the new adulterant of L. ruthenicum was B. kaschgarica. The main differences between B. kaschgarica and L. ruthenicum are as follows. In terms of traits, the surface of B. kaschgarica is smooth and crispy, whereas that of L. ruthenicum is shrunken, solid, and hard. In microscopic characteristics, the epicarp cells of B. kaschgarica are thickened like a string of beads and the stone cells are rectangular, whereas the stone cell walls of L. ruthenicum are wavy with an obvious grain layer. In molecular sequences, the ITS sequence of B. kaschgarica is 606 bp long and that of L. ruthenicum is 654 bp long; the similarity of the two sequences is 53.32%.
Recognition of platinum-DNA adducts by HMGB1a.
Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V
2012-09-25
Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. 
This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
NASA Technical Reports Server (NTRS)
Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.
2016-01-01
On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. 
We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher-fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.
Ceccarelli, Marcello; Galluzzi, Luca; Diotallevi, Aurora; Andreoni, Francesca; Fowler, Hailie; Petersen, Christine; Vitale, Fabrizio; Magnani, Mauro
2017-05-16
Leishmaniasis is a neglected disease caused by many Leishmania species, belonging to subgenera Leishmania (Leishmania) and Leishmania (Viannia). Several qPCR-based molecular diagnostic approaches have been reported for detection and quantification of Leishmania species. Many of these approaches use the kinetoplast DNA (kDNA) minicircles as the target sequence. These assays had potential cross-species amplification, due to sequence similarity between Leishmania species. Previous works demonstrated discrimination between L. (Leishmania) and L. (Viannia) by SYBR green-based qPCR assays designed on kDNA, followed by melting or high-resolution melt (HRM) analysis. Importantly, these approaches cannot fully distinguish L. (L.) infantum from L. (L.) amazonensis, which can coexist in the same geographical area. DNA from 18 strains/isolates of L. (L.) infantum, L. (L.) amazonensis, L. (V.) braziliensis, L. (V.) panamensis, L. (V.) guyanensis, and 62 clinical samples from L. (L.) infantum-infected dogs were amplified by a previously developed qPCR (qPCR-ML) and subjected to HRM analysis; selected PCR products were sequenced using an ABI PRISM 310 Genetic Analyzer. Based on the obtained sequences, a new SYBR-green qPCR assay (qPCR-ama) intended to amplify a minicircle subclass more abundant in L. (L.) amazonensis was designed. The qPCR-ML followed by HRM analysis did not allow discrimination between L. (L.) amazonensis and L. (L.) infantum in 53.4% of cases. Hence, the novel SYBR green-based qPCR (qPCR-ama) was tested. This assay achieved a detection limit of 0.1 pg of parasite DNA in samples spiked with host DNA and did not show cross-amplification with Trypanosoma cruzi or host DNA. Although the qPCR-ama also amplified L. (L.) infantum strains, the Cq values were dramatically increased compared to qPCR-ML. Therefore, the combined analysis of Cq values from qPCR-ML and qPCR-ama made it possible to distinguish L. (L.) infantum and L. (L.) amazonensis in 100% of tested samples. 
A new and affordable SYBR-green qPCR-based approach to distinguish between L. (L.) infantum and L. (L.) amazonensis was developed by exploiting the greater abundance of a minicircle sequence rather than targeting a hypothetical species-specific sequence. The fast and accurate discrimination between these species can be useful to provide adequate prognosis and treatment.
Mitochondrial DNA diagnosis for taeniasis and cysticercosis.
Yamasaki, Hiroshi; Nakao, Minoru; Sako, Yasuhito; Nakaya, Kazuhiro; Sato, Marcello Otake; Ito, Akira
2006-01-01
Molecular diagnosis for taeniasis and cysticercosis in humans on the basis of mitochondrial DNA analysis was reviewed. Development and application of three different methods, including restriction fragment length polymorphism analysis, base excision sequence scanning thymine-base analysis and multiplex PCR, were described. Moreover, molecular diagnosis of cysticerci found in specimens submitted for histopathology and the molecular detection of taeniasis using copro-DNA were discussed.
Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H
2011-04-01
κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of a lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The 660-bp cDNA obtained contained an ORF sufficient to encode the entire amino acid sequence of the κ-casein precursor protein, consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, the yak κ-casein sequence had an identity of 64.76-98.78% in cDNA, and an identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in genetic engineering research on yak κ-casein.
Distinct Circular Single-Stranded DNA Viruses Exist in Different Soil Types
Swanson, Maud M.; Dawson, Lorna; Freitag, Thomas E.; Singh, Brajesh K.; Torrance, Lesley; Mushegian, Arcady R.
2015-01-01
The potential dependence of virus populations on soil types was examined by electron microscopy, and the total abundance of virus particles in four soil types was similar to that previously observed in soil samples. The four soil types examined differed in the relative abundances of four morphological groups of viruses. Machair, a unique type of coastal soil in western Scotland and Ireland, differed from the others tested in having a higher proportion of tailed bacteriophages. The other soils examined contained predominantly spherical and thin filamentous virus particles, but the Machair soil had a more even distribution of the virus types. As the first step in looking at differences in populations in detail, virus sequences from Machair and brown earth (agricultural pasture) soils were examined by metagenomic sequencing after enriching for circular Rep-encoding single-stranded DNA (ssDNA) (CRESS-DNA) virus genomes. Sequences from the family Microviridae (icosahedral viruses mainly infecting bacteria) of CRESS-DNA viruses were predominant in both soils. Phylogenetic analysis of Microviridae major coat protein sequences from the Machair viruses showed that they spanned most of the diversity of the subfamily Gokushovirinae, whose members mainly infect obligate intracellular parasites. The brown earth soil had a higher proportion of sequences that matched the morphologically similar family Circoviridae in BLAST searches. However, analysis of putative replicase proteins that were similar to those of viruses in the Circoviridae showed that they are a novel clade of Circoviridae-related CRESS-DNA viruses distinct from known Circoviridae genera. Different soils have substantially different taxonomic biodiversities even within ssDNA viruses, which may be driven by physicochemical factors. PMID:25841004
2010-01-01
Background Cryptic species complexes are common among anophelines. Previous phylogenetic analysis based on the complete mtDNA COI gene sequences detected paraphyly in the Neotropical malaria vector Anopheles marajoara. The "Folmer region" detects a single taxon using a 3% divergence threshold. Methods To test the paraphyletic hypothesis and examine the utility of the Folmer region, genealogical trees based on a concatenated (white + 3' COI sequences) dataset and pairwise differentiation of COI fragments were examined. The population structure and demographic history were based on partial COI sequences for 294 individuals from 14 localities in Amazonian Brazil. 109 individuals from 12 localities were sequenced for the nDNA white gene, and 57 individuals from 11 localities were sequenced for the ribosomal DNA (rDNA) internal transcribed spacer 2 (ITS2). Results Distinct A. marajoara lineages were detected by combined genealogical analysis and were also supported among COI haplotypes using a median joining network and AMOVA, with time since divergence during the Pleistocene (<100,000 ya). COI sequences at the 3' end were more variable, demonstrating significant pairwise differentiation (3.82%) compared to the more moderate 2.92% detected by the Folmer region. Lineage 1 was present in all localities, whereas lineage 2 was restricted mainly to the west. Mismatch distributions for both lineages were bimodal, likely due to multiple colonization events and spatial expansion (~798 - 81,045 ya). There appears to be gene flow within, not between lineages, and a partial barrier was detected near Rio Jari in Amapá state, separating western and eastern populations. In contrast, both nDNA data sets (white gene sequences with or without the retention of the 4th intron, and ITS2 sequences and length) detected a single A. marajoara lineage. 
Conclusions Strong support for combined data with significant differentiation detected in the COI and absent in the nDNA suggest that the divergence is recent, and detectable only by the faster evolving mtDNA. A within subgenus threshold of >2% may be more appropriate among sister taxa in cryptic anopheline complexes than the standard 3%. Differences in demographic history and climatic changes may have contributed to mtDNA lineage divergence in A. marajoara. PMID:20929572
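The threshold comparison in this abstract can be illustrated with a minimal sketch. It assumes that "pairwise differentiation" is an uncorrected proportion of differing sites (p-distance) between aligned sequences; the abstract does not name the distance measure, and the function names below are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    This distance measure is an assumption, not taken from the abstract.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def exceeds_threshold(divergence: float, threshold: float) -> bool:
    """True when a divergence value is above a species-level threshold."""
    return divergence > threshold


# Values reported in the abstract: 3.82% (3' COI) and 2.92% (Folmer region).
for label, value in [("3' COI", 0.0382), ("Folmer region", 0.0292)]:
    print(label, exceeds_threshold(value, 0.03), exceeds_threshold(value, 0.02))
```

With the reported values, only the 3' COI fragment exceeds the standard 3% threshold, while both fragments exceed the 2% threshold the authors propose.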
Chemale, Gustavo; Paneto, Greiciane Gaburro; Menezes, Meiga Aurea Mendes; de Freitas, Jorge Marcelo; Jacques, Guilherme Silveira; Cicarelli, Regina Maria Barretto; Fagundes, Paulo Roberto
2013-05-01
Mitochondrial DNA (mtDNA) analysis is usually a last resort in routine forensic DNA casework. However, it has become a powerful tool for the analysis of highly degraded samples or samples containing too little or no nuclear DNA, such as old bones and hair shafts. The gold standard methodology still constitutes the direct sequencing of polymerase chain reaction (PCR) products or cloned amplicons from the HVS-1 and HVS-2 (hypervariable segment) control region segments. Identifications using mtDNA are time consuming, expensive and can be very complex, depending on the amount and nature of the material being tested. The main goal of this work is to develop a less labour-intensive and less expensive screening method for mtDNA analysis, in order to aid in the exclusion of non-matching samples and as a presumptive test prior to final confirmatory DNA sequencing. We have selected 14 highly discriminatory single nucleotide polymorphisms (SNPs) based on simulations performed by Salas and Amigo (2010) to be typed using SNaPShot(TM) (Applied Biosystems, Foster City, CA, USA). The assay was validated by typing more than 100 HVS-1/HVS-2 sequenced samples. No differences were observed between the SNP typing and DNA sequencing when results were compared, with the exception of allelic dropouts observed in a few haplotypes. Haplotype diversity simulations were performed using 172 mtDNA sequences representative of the Brazilian population and a score of 0.9794 was obtained when the 14 SNPs were used, showing that the theoretical prediction approach for the selection of highly discriminatory SNPs suggested by Salas and Amigo (2010) was confirmed in the population studied. As the main goal of the work is to develop a screening assay to skip the sequencing of all samples in a particular case, a pair-wise comparison of the sequences was done using the selected SNPs. 
When both HVS-1/HVS-2 SNPs were used for simulations, at least two differences were observed in 93.2% of the comparisons performed. The assay was validated with casework samples. Results show that the method is straightforward and can be used for exclusionary purposes, saving time and laboratory resources. The assay confirms the theoretic prediction suggested by Salas and Amigo (2010). All forensic advantages, such as high sensitivity and power of discrimination, as also the disadvantages, such as the occurrence of allele dropouts, are discussed throughout the article. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
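The two summary statistics quoted in this abstract (a haplotype diversity score and the share of pairwise comparisons showing at least two differences) can be sketched as follows. The sketch assumes Nei's unbiased estimator of haplotype diversity, which the abstract does not confirm, and uses made-up three-SNP haplotypes rather than the study's 14-SNP data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's estimator: n/(n-1) * (1 - sum of squared haplotype frequencies).

    Assumed estimator; the abstract reports only the resulting score.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def fraction_pairs_with_min_differences(haplotypes, min_diff=2):
    """Share of sample pairs whose SNP haplotypes differ at >= min_diff sites."""
    pairs = list(combinations(haplotypes, 2))
    hits = 0
    for a, b in pairs:
        differences = sum(x != y for x, y in zip(a, b))
        if differences >= min_diff:
            hits += 1
    return hits / len(pairs)


# Hypothetical toy haplotypes (one character per typed SNP).
toy = ["AAG", "AAG", "CTG", "CTA"]
print(round(haplotype_diversity(toy), 4))
print(round(fraction_pairs_with_min_differences(toy), 4))
```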
2012-01-01
Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's genbank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri
2016-01-01
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774
Ventura, Marco; Zink, Ralf; Fitzgerald, Gerald F; van Sinderen, Douwe
2005-01-01
The incorporation and delivery of bifidobacterial strains as probiotic components in many food preparations expose these microorganisms to a multitude of environmental insults, including heat and osmotic stresses. We characterized the dnaK gene region of Bifidobacterium breve UCC 2003. Sequence analysis of the dnaK locus revealed four genes with the organization dnaK-grpE-dnaJ-ORF1, whose deduced protein products display significant similarity to corresponding chaperones found in other bacteria. Northern hybridization and real-time LightCycler PCR analysis revealed that the transcription of the dnaK operon was strongly induced by osmotic shock but was not induced significantly by heat stress. A 4.4-kb polycistronic mRNA, which represented the transcript of the complete dnaK gene region, was detected. Many other small transcripts, which were assumed to have resulted from intensive processing or degradation of this polycistronic mRNA, were identified. The transcription start site of the dnaK operon was determined by primer extension. Phylogenetic analysis of the available bifidobacterial grpE and dnaK genes suggested that the evolutionary development of these genes has been similar. The phylogeny derived from the various bifidobacterial grpE and dnaK sequences is consistent with that derived from 16S rRNA. The use of these genes in bifidobacterial species as an alternative or complement to the 16S rRNA gene marker provides sequence signatures that allow a high level of discrimination between closely related species of this genus.
Watanabe, Yoshiyuki; Yamamoto, Hiroyuki; Oikawa, Ritsuko; Toyota, Minoru; Yamamoto, Masakazu; Kokudo, Norihiro; Tanaka, Shinji; Arii, Shigeki; Yotsuyanagi, Hiroshi; Koike, Kazuhiko; Itoh, Fumio
2015-01-01
Integration of DNA viruses into the human genome plays an important role in various types of tumors, including hepatitis B virus (HBV)–related hepatocellular carcinoma. However, the molecular details and clinical impact of HBV integration on either human or HBV epigenomes are unknown. Here, we show that methylation of the integrated HBV DNA is related to the methylation status of the flanking human genome. We developed a next-generation sequencing-based method for structural methylation analysis of integrated viral genomes (denoted G-NaVI). This method is a novel approach that enables enrichment of viral fragments for sequencing using unique baits based on the sequence of the HBV genome. We detected integrated HBV sequences in the genome of the PLC/PRF/5 cell line and found variable levels of methylation within the integrated HBV genomes. Allele-specific methylation analysis revealed that the HBV genome often became significantly methylated when integrated into highly methylated host sites. After integration into unmethylated human genome regions such as promoters, however, the HBV DNA remains unmethylated and may eventually play an important role in tumorigenesis. The observed dynamic changes in DNA methylation of the host and viral genomes may functionally affect the biological behavior of HBV. These findings may impact public health given that millions of people worldwide are carriers of HBV. We also believe our assay will be a powerful tool to increase our understanding of the various types of DNA virus-associated tumorigenesis. PMID:25653310
mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.
Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian
2016-07-08
Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
HUNT: launch of a full-length cDNA database from the Helix Research Institute.
Yudate, H T; Suwa, M; Irie, R; Matsui, H; Nishikawa, T; Nakamura, Y; Yamaguchi, D; Peng, Z Z; Yamamoto, T; Nagai, K; Hayashi, K; Otsuki, T; Sugiyama, T; Ota, T; Suzuki, Y; Sugano, S; Isogai, T; Masuho, Y
2001-01-01
The Helix Research Institute (HRI) in Japan is releasing 4356 HUman Novel Transcripts and related information in the newly established HUNT database. The institute is a joint research project principally funded by the Japanese Ministry of International Trade and Industry, and the clones were sequenced in the governmental New Energy and Industrial Technology Development Organization (NEDO) Human cDNA Sequencing Project. The HUNT database contains an extensive amount of annotation from advanced analysis and represents an essential bioinformatics contribution towards understanding of the gene function. The HRI human cDNA clones were obtained from full-length enriched cDNA libraries constructed with the oligo-capping method and have resulted in novel full-length cDNA sequences. A large fraction has little similarity to any proteins of known function and to obtain clues about possible function we have developed original analysis procedures. Any putative function deduced here can be validated or refuted by complementary analysis results. The user can also extract information from specific categories like PROSITE patterns, PFAM domains, PSORT localization, transmembrane helices and clones with GENIUS structure assignments. The HUNT database can be accessed at http://www.hri.co.jp/HUNT.
Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L
1980-01-01
The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes is identical to that in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs; a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clone h19 and h22 are compared areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone genes. Images PMID:7443547
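A motif scan of the kind described in this abstract can be sketched with regular expressions. The three patterns come from the abstract, with Py read as a pyrimidine (C or T) and Pu as a purine (A or G); the test sequence is invented for illustration and is not from clone h19 or h22.

```python
import re

# Patterns taken from the abstract; Py = [CT], Pu = [AG].
MOTIFS = {
    "GATCC": "GATCC",
    "Hogness box": "GTATAAATAG",
    "PyCATTCPu": "[CT]CATTC[AG]",
}


def find_motifs(seq):
    """Return sorted (start, motif name, matched text) tuples, 0-based starts.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


# Hypothetical 5' flanking fragment, for illustration only.
example = "TTGATCCAAGTATAAATAGCCTCATTCGAA"
for start, name, text in find_motifs(example):
    print(start, name, text)
```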
Pan, Pinliang; Tao, Xiaoxia; Zhang, Qi; Xing, Wenge; Sun, Xianguang; Pei, Lijian; Jiang, Yan
2007-12-01
To investigate the correlation between three viral load assays for circulating recombinant form (CRF)_BC. Recent studies in HIV-1 molecular epidemiology reveal that CRF_BC is the dominant subtype of HIV-1 virus in mainland China, representing over 45% of the HIV-1 infected population. The performances of nucleic acid sequence-based amplification (NASBA), branched DNA (bDNA) and reverse transcriptase polymerase chain reaction (RT-PCR) were compared for the HIV-1 viral load detection and quantitation of CRF_BC in China. Sixteen HIV-1 positive and three HIV-1 negative samples were collected. Sequencing of the positive samples in the gp41 region was conducted. The HIV-1 viral load values were determined using bDNA, RT-PCR and NASBA assays. Deming regression analysis with SPSS 12.0 (SPSS Inc., Chicago, Illinois, USA) was performed for data analysis. Sequencing and phylogenetic analysis of the env gene (gp41) region of the 16 HIV-1 positive clinical specimens from Guizhou Province in southwest China revealed the dominance of the subtype CRF_BC in that region. A good correlation of viral load values was observed among the three assays. Pearson's correlation between RT-PCR and bDNA is 0.969, Lg(VL)RT-PCR = 0.969 * Lg(VL)bDNA + 0.55; Pearson's correlation between RT-PCR and NASBA is 0.968, Lg(VL)RT-PCR = 0.968 * Lg(VL)NASBA + 0.937; Pearson's correlation between NASBA and bDNA is 0.980, Lg(VL)NASBA = 0.980 * Lg(VL)bDNA - 0.318. When tested with the 3 different assays, RT-PCR, bDNA and NASBA, the group of 16 HIV-1 positive samples showed the highest viral load values for RT-PCR, followed by bDNA and then NASBA, which is consistent with earlier results in subtype B. The three viral load assays are highly correlated for CRF_BC in China.
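The regression equations reported in this abstract can be applied directly to convert a log-transformed viral load from one assay scale to another. The sketch below uses only the slopes and intercepts quoted above; it assumes Lg denotes log10 of the viral load, and the dictionary layout and function name are hypothetical.

```python
# (source assay, target assay) -> (slope, intercept), as quoted in the abstract:
#   Lg(VL)RT-PCR = 0.969 * Lg(VL)bDNA  + 0.55
#   Lg(VL)RT-PCR = 0.968 * Lg(VL)NASBA + 0.937
#   Lg(VL)NASBA  = 0.980 * Lg(VL)bDNA  - 0.318
REGRESSIONS = {
    ("bDNA", "RT-PCR"): (0.969, 0.55),
    ("NASBA", "RT-PCR"): (0.968, 0.937),
    ("bDNA", "NASBA"): (0.980, -0.318),
}


def convert_log_viral_load(log_vl, source, target):
    """Predict the target-assay log viral load from a source-assay value."""
    slope, intercept = REGRESSIONS[(source, target)]
    return slope * log_vl + intercept


# Example: a bDNA result of 4.0 on the log scale (assumed log10).
bdna = 4.0
print(round(convert_log_viral_load(bdna, "bDNA", "RT-PCR"), 3))
print(round(convert_log_viral_load(bdna, "bDNA", "NASBA"), 3))
```

For this example input the predicted ordering is RT-PCR above bDNA above NASBA, which matches the ordering the abstract describes for the 16 positive samples.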
Prychitko, T M; Moore, W S
1997-10-01
Estimating phylogenies from DNA sequence data has become the major methodology of molecular phylogenetics. To date, molecular phylogenetics of the vertebrates has been very dependent on mtDNA, but studies involving mtDNA are limited because the several genes comprising the mt-genome are inherited as a single linkage group. The only apparent solution to this problem is to sequence additional genes, each representing a distinct linkage group, so that the resultant gene trees provide independent estimates of the species tree. There exists the need to find novel gene sequences which contain enough phylogenetic information to resolve relationships between closely related species. A possible source is the nuclear-encoded introns, because they evolve more rapidly than exons. We designed primers to amplify and sequence the 7 intron from the beta-fibrinogen gene for a recently evolved group, the woodpeckers. We sequenced the entire intron for 10 specimens representing five species. Nucleotide substitutions are randomly distributed along the length of the intron, suggesting selective neutrality. A preliminary analysis indicates that the phylogenetic signal in the intron is as strong as that in the mitochondrial encoded cytochrome b (cyt b) gene. The topology of the beta-fibrinogen tree is identical to that of the cyt b tree. This analysis demonstrates the ability of the 7 intron of beta-fibrinogen to provide well resolved, independent gene trees for recently evolved groups and establishes it as a source of sequences to be used in other phylogenetic studies. Copyright 1997 Academic Press
Gregory M. Bonito; Andrii P. Gryganskyi; James M. Trappe; Rytas Vilgalys
2010-01-01
Truffles (Tuber) are ectomycorrhizal fungi characterized by hypogeous fruitbodies. Their biodiversity, host associations and geographical distributions are not well documented. ITS rDNA sequences of Tuber are commonly recovered from molecular surveys of fungal communities, but most remain insufficiently identified making it...
USDA-ARS's Scientific Manuscript database
Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...
The partial 16S rDNA gene sequences of two thermophilic archaeal strains, TY and TYS, previously isolated from the Guaymas Basin hydrothermal vent site were determined. Lipid analyses and a comparative analysis performed with 16S rDNA sequences of similar thermophilic species sho...
Molecular Characterization of Watermelon Chlorotic Stunt Virus (WmCSV) from Palestine
Ali-Shtayeh, Mohammed S.; Jamous, Rana M.; Mallah, Omar B.; Abu-Zeitoun, Salam Y.
2014-01-01
The incidence of watermelon chlorotic stunt disease and molecular characterization of the Palestinian isolate of Watermelon chlorotic stunt virus (WmCSV-[PAL]) are described in this study. Symptomatic leaf samples obtained from watermelon (Citrullus lanatus (Thunb.)) and cucumber (Cucumis sativus L.) plants were tested for WmCSV-[PAL] infection by polymerase chain reaction (PCR) and Rolling Circle Amplification (RCA). Disease incidence ranged from 25% to 98% in watermelon fields in the studied area; 77% of leaf samples collected from Jenin were found to have mixed infections with WmCSV-[PAL] and SLCV. The full-length DNA-A and DNA-B genomes of WmCSV-[PAL] were amplified and sequenced, and the sequences were deposited in GenBank. Sequence analysis of virus genomes showed that DNA-A and DNA-B had 97.6%–99.42% and 93.16%–98.26% nucleotide identity with other virus isolates in the region, respectively. Sequence analysis also revealed that the Palestinian isolate of WmCSV shared the highest nucleotide identity with an isolate from Israel, suggesting that the virus was introduced to Palestine from Israel. PMID:24956181
Lam, Kelly Y C; Chan, Gallant K L; Xin, Gui-Zhong; Xu, Hong; Ku, Chuen-Fai; Chen, Jian-Ping; Yao, Ping; Lin, Huang-Quan; Dong, Tina T X; Tsim, Karl W K
2015-12-15
Cordyceps sinensis is an endoparasitic fungus widely used as a tonic and medicinal food in the practice of traditional Chinese medicine (TCM). In historical usage, Cordyceps refers specifically to the species C. sinensis. However, a number of closely related species are also named Cordyceps and are commonly sold as C. sinensis. Substitutes and adulterants of C. sinensis are often introduced into the herbal market, either intentionally or accidentally, which seriously compromises therapeutic effects and can even lead to life-threatening poisoning. Here, we aim to identify Cordyceps by DNA sequencing technology. Two different DNA-based approaches were compared. Internal transcribed spacer (ITS) sequences and random amplified polymorphic DNA (RAPD)-sequence characterized amplified region (SCAR) markers were developed here to authenticate different species of Cordyceps. Both approaches generally enabled discrimination of C. sinensis from others. The application of the two methods, supporting each other, increases the security of identification. For better reproducibility and faster analysis, the SCAR markers derived from the RAPD results provide a new method for quick authentication of Cordyceps.
Takeo, Toshinori; Tanaka, Tetsuya; Matsubayashi, Makoto; Maeda, Hiroki; Kusakisako, Kodai; Matsui, Toshihiro; Mochizuki, Masami; Matsuo, Tomohide
2014-08-01
Previously, we characterized an undocumented strain of Eimeria krijgsmanni by morphological and biological features. Here, we present a detailed molecular phylogenetic analysis of this organism. Namely, 18S ribosomal RNA gene (rDNA) sequences of E. krijgsmanni were analyzed to incorporate this species into a comprehensive Eimeria phylogeny. As a result, the partial 18S rDNA sequence from E. krijgsmanni was successfully determined, and two different types, Type A and Type B, that differed by 1 base pair were identified. E. krijgsmanni was originally isolated from a single oocyst, and thus the results show that the two types might reflect allelic sequence heterogeneity in the 18S rDNA. Based on phylogenetic analyses, the two types of E. krijgsmanni 18S rDNA formed one of two clades among murine Eimeria spp.; these Eimeria clades reflected morphological similarity among the Eimeria spp. This is the third molecular phylogenetic characterization of a murine Eimeria sp., in addition to E. falciformis and E. papillata. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
DNA cross-linking by dehydromonocrotaline lacks apparent base sequence preference.
Rieben, W Kurt; Coulombe, Roger A
2004-12-01
Pyrrolizidine alkaloids (PAs) are ubiquitous plant toxins, many of which, upon oxidation by hepatic mixed-function oxidases, become reactive bifunctional pyrrolic electrophiles that form DNA-DNA and DNA-protein cross-links. The anti-mitotic, toxic, and carcinogenic action of PAs is thought to be caused, at least in part, by these cross-links. We wished to determine whether the activated PA pyrrole dehydromonocrotaline (DHMO) exhibits base sequence preferences when cross-linked to a set of model duplex poly A-T 14-mer oligonucleotides with varying internal and/or end 5'-d(CG), 5'-d(GC), 5'-d(TA), 5'-d(CGCG), or 5'-d(GCGC) sequences. DHMO-DNA cross-links were assessed by electrophoretic mobility shift assay (EMSA) of 32P endlabeled oligonucleotides and by HPLC analysis of cross-linked DNAs enzymatically digested to their constituent deoxynucleosides. The degree of DNA cross-links depended upon the concentration of the pyrrole, but not on the base sequence of the oligonucleotide target. Likewise, HPLC chromatograms of cross-linked and digested DNAs showed no discernible sequence preference for any nucleotide. Added glutathione, tyrosine, cysteine, and aspartic acid, but not phenylalanine, threonine, serine, lysine, or methionine competed with DNA as alternate nucleophiles for cross-linking by DHMO. From these data it appears that DHMO exhibits no strong base preference when forming cross-links with DNA, and that some cellular nucleophiles can inhibit DNA cross-link formation.
Dasytricha dominance in Surti buffalo rumen revealed by 18S rRNA sequences and real-time PCR assay.
Singh, K M; Tripathi, A K; Pandya, P R; Rank, D N; Kothari, R K; Joshi, C G
2011-09-01
The genetic diversity of protozoa in Surti buffalo rumen was studied by amplified ribosomal DNA restriction analysis, 18S rDNA sequence homology, and phylogenetic and real-time PCR analysis methods. Three animals were fed a diet comprising green fodder Napier bajra 21 (Pennisetum purpureum), mature pasture grass (Dicanthium annulatum) and concentrate mixture (20% crude protein, 65% total digestible nutrients). A protozoa-specific primer (P-SSU-342f) and a eukarya-specific primer (Medlin B) were used to amplify a 1,360 bp fragment of DNA encoding protozoal small subunit (SSU) ribosomal RNA from rumen fluid. A total of 91 clones were examined, and 14 different 18S rRNA sequences were identified based on PCR-RFLP pattern. These 14 phylotypes were distributed into four genera based on 18S rDNA database sequences and identified as Dasytricha (57 clones), Isotricha (14 clones), Ostracodinium (11 clones) and Polyplastron (9 clones). Phylogenetic analyses were also used to infer the makeup of protozoa communities in the rumen of Surti buffalo. Out of 14 sequences, 8 sequences (69 clones) clustered with the Dasytricha ruminantium-like clone and 4 sequences (13 clones) were phylogenetically placed with the Isotricha prostoma-like clone. Moreover, 2 phylotypes (9 clones) were related to the Polyplastron multivesiculatum-like clone. In addition, the number of 18S rDNA gene copies of Dasytricha ruminantium (0.05% to ciliate protozoa) was higher than that of Entodinium sp. (2.0 × 10(5) vs. 1.3 × 10(4)) per ml of ruminal fluid.
Mak, Sarah Siu Tze; Gopalakrishnan, Shyam; Carøe, Christian; Geng, Chunyu; Liu, Shanlin; Sinding, Mikkel-Holger S; Kuderna, Lukas F K; Zhang, Wenwei; Fu, Shujin; Vieira, Filipe G; Germonpré, Mietje; Bocherens, Hervé; Fedorov, Sergey; Petersen, Bent; Sicheritz-Pontén, Thomas; Marques-Bonet, Tomas; Zhang, Guojie; Jiang, Hui; Gilbert, M Thomas P
2017-01-01
Abstract Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with that of the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0.718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (v. 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for BGISEQ-500, P = 0.012). This may result from the differences in amplification cycles used to polymerase chain reaction–amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA. 
Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA. PMID:28854615
Yu, Haining; Gao, Jiuxiang; Lu, Yiling; Guang, Huijuan; Cai, Shasha; Zhang, Songyan; Wang, Yipeng
2013-11-01
Lysozymes are key proteins that play important roles in innate immune defense in many animal phyla by breaking down bacterial cell walls. In this study, we report the molecular cloning, sequence analysis and phylogeny of the first caudate amphibian g-lysozyme, from a full-length spleen cDNA library of axolotl (Ambystoma mexicanum). A goose-type lysozyme (g-lysozyme) EST was identified and the full-length cDNA was obtained using RACE-PCR. The axolotl g-lysozyme sequence represents an open reading frame for a putative signal peptide and the mature protein composed of 184 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein are 21523.0 Da and 4.37, respectively. Expression of g-lysozyme mRNA is predominantly found in skin, with lower levels in spleen, liver, muscle, and lung. Phylogenetic analysis revealed that the caudate amphibian g-lysozyme had a distinct evolutionary pattern, being juxtaposed not only with anuran amphibians but also with fish, birds and mammals. Although the first complete cDNA sequence for caudate amphibian g-lysozyme is reported in the present study, clones encoding the axolotl's other functional immune molecules in the full-length cDNA library will have to be further sequenced to gain insight into the fundamental aspects of antibacterial mechanisms in caudates.
VaDiR: an integrated approach to Variant Detection in RNA.
Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy
2018-02-01
Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered, but its cost currently limits its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of mutations when they are observed in non-coding regions or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
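The idea of combining several variant callers and scoring the result against DNA-validated mutations can be sketched with plain set operations. This is only an illustration of the consensus principle and is not the VaDiR implementation; the variant tuples and function names are hypothetical. Tier 1 is modelled as variants reported by all three callers, as the abstract states; the second, higher-recall combination is not modelled because the abstract does not define it precisely.

```python
from collections import Counter


def consensus_variants(callsets, min_callers):
    """Variants reported by at least min_callers of the supplied callers."""
    counts = Counter(v for calls in callsets.values() for v in set(calls))
    return {variant for variant, n in counts.items() if n >= min_callers}


def precision_recall(called, truth):
    """Precision and recall of a call set against a validated truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical variants as (chromosome, position, ref, alt) tuples.
v1, v2 = ("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")
v3, v4 = ("chr3", 300, "G", "A"), ("chr4", 400, "T", "C")
callsets = {
    "SNPiR": {v1, v2, v3},
    "RVBoost": {v1, v2},
    "MuTect2": {v1, v4},
}
dna_validated = {v1, v2}  # toy truth set

tier1 = consensus_variants(callsets, min_callers=3)
print(precision_recall(tier1, dna_validated))
print(precision_recall(consensus_variants(callsets, 2), dna_validated))
```

In this toy data, requiring agreement from all three callers keeps precision high but misses one validated variant, which is the precision and recall trade-off the abstract describes.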
Guo, Bingfu; Guo, Yong; Hong, Huilong; Qiu, Li-Juan
2016-01-01
Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organism (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. More than 22.4 Gb sequence data (∼21 × coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundaries of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of genomic insertion sites of G2-EPSPS and GAT transgenes will facilitate the utilization of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS was a cost-effective and rapid method for identifying sites of T-DNA insertions and flanking sequences in soybean.
Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L
1992-01-01
cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Churchill, Mair E.A.; Klass, Janet; Zoetewey, David L.
2010-01-01
The ubiquitous eukaryotic High-Mobility-Group-Box (HMGB) chromosomal proteins promote many chromatin-mediated cellular activities through their non-sequence-specific binding and bending of DNA. Minor groove DNA binding by the HMG box results in substantial DNA bending toward the major groove owing to electrostatic interactions, shape complementarity and DNA intercalation that occurs at two sites. Here, the structures of the complexes formed with DNA by a partially DNA intercalation-deficient mutant of Drosophila melanogaster HMGD have been determined by X-ray crystallography at a resolution of 2.85 Å. The six proteins and fifty base pairs of DNA in the crystal structure revealed a variety of bound conformations. All of the proteins bound in the minor groove, bridging DNA molecules, presumably because these DNA regions are easily deformed. The loss of the primary site of DNA intercalation decreased overall DNA bending and shape complementarity. However, DNA bending at the secondary site of intercalation was retained and most protein-DNA contacts were preserved. The mode of binding resembles the HMGB1-boxA-cisplatin-DNA complex, which also lacks a primary intercalating residue. This study provides new insights into the binding mechanisms used by HMG boxes to recognize varied DNA structures and sequences as well as modulate DNA structure and DNA bending. PMID:20800069
Analysis of DNA methylation in FFPE tissues using the MethyLight technology.
Dallol, Ashraf; Al-Ali, Waleed; Al-Shaibani, Amina; Al-Mulla, Fahd
2011-01-01
Novel biomarkers are sought after by mining DNA extracted from formalin-fixed, paraffin-embedded (FFPE) tissues. Such tissues offer the great advantage of often having complete clinical data (including survival) and are amenable to laser microdissection targeting specific tissue areas. Downstream analysis of such DNA includes mutational screens and methylation profiling. Screening for mutations by sequencing requires a significant amount of DNA for PCR and cycle sequencing. This is self-inhibitory if the gene screened has a large number of exons. Profiling DNA methylation using the MethyLight technology circumvents this problem and allows for the mining of several biomarkers from DNA extracted from a single microscope slide of the tissue of interest. We describe in this chapter a detailed protocol for MethyLight and its use in the determination of CpG Island Methylator Phenotype status in FFPE colorectal cancer samples.
A novel model for DNA sequence similarity analysis based on graph theory.
Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan
2011-01-01
Determination of sequence similarity is one of the major steps in computational phylogenetic studies. During evolutionary history, not only mutations of individual nucleotides but also subsequent rearrangements occurred. It has been one of the major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis that capture the various mutation phenomena simultaneously. In this paper, in contrast to traditional methods (e.g., nucleotide frequency, geometric representations) as bases for construction of mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we set up a weighted directed graph. The adjacency matrix of the directed graph is used to induce a representative vector for the DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology, and are generally consistent with the reported results from early studies, which proves the new method's efficiency; we also test the new method on a simulated data set, which shows that our new method performs better than the traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
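The pipeline described above (sequence, weighted directed graph, adjacency matrix, representative vector, distance) can be sketched as follows. The abstract does not state how edge weights are defined, so this sketch uses a simple stand-in that is an assumption and not the authors' scheme: the four nucleotides are the nodes, and the weight of edge x→y is the normalized count of positions where x is immediately followed by y. All function names are hypothetical.

```python
from math import sqrt

NUCLEOTIDES = "ACGT"
INDEX = {n: i for i, n in enumerate(NUCLEOTIDES)}


def adjacency_matrix(seq):
    """4x4 weighted adjacency matrix; entry [x][y] is the fraction of
    adjacent pairs in which nucleotide x is followed by nucleotide y.
    (Assumed weighting, chosen only for illustration.)"""
    matrix = [[0.0] * 4 for _ in range(4)]
    seq = seq.upper()
    for a, b in zip(seq, seq[1:]):
        if a in INDEX and b in INDEX:  # skip ambiguous symbols such as N
            matrix[INDEX[a]][INDEX[b]] += 1.0
    total = sum(sum(row) for row in matrix)
    if total:
        matrix = [[w / total for w in row] for row in matrix]
    return matrix


def representative_vector(seq):
    """Flatten the adjacency matrix row by row into a 16-dimensional vector."""
    return [w for row in adjacency_matrix(seq) for w in row]


def euclidean_distance(u, v):
    """Euclidean distance between two representative vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


d = euclidean_distance(representative_vector("ACGTACGT"),
                       representative_vector("AACCGGTT"))
print(d)
```

Because the vector depends on which nucleotide follows which, two sequences with identical base composition but different ordering (as in the usage line) receive a non-zero distance, which is the property the abstract emphasizes.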
Trading genes along the silk road: mtDNA sequences and the origin of central Asian populations.
Comas, D; Calafell, F; Mateu, E; Pérez-Lezaun, A; Bosch, E; Martínez-Arias, R; Clarimon, J; Facchini, F; Fiori, G; Luiselli, D; Pettener, D; Bertranpetit, J
1998-01-01
Central Asia is a vast region at the crossroads of different habitats, cultures, and trade routes. Little is known about the genetics and the history of the population of this region. We present the analysis of mtDNA control-region sequences in samples of the Kazakh, the Uighurs, the lowland Kirghiz, and the highland Kirghiz, which we have used to address both the population history of the region and the possible selective pressures that high altitude has on mtDNA genes. Central Asian mtDNA sequences present features intermediate between European and eastern Asian sequences in several parameters, such as the frequencies of certain nucleotides, the levels of nucleotide diversity, mean pairwise differences, and genetic distances. Several hypotheses could explain the intermediate position of central Asia between Europe and eastern Asia, but the most plausible would involve extensive levels of admixture between Europeans and eastern Asians in central Asia, possibly enhanced during the Silk Road trade and clearly after the eastern and western Eurasian human groups had diverged. Lowland and highland Kirghiz mtDNA sequences are very similar, and the analysis of molecular variance has revealed that the fraction of mitochondrial genetic variance due to altitude is not significantly different from zero. Thus, it seems unlikely that altitude has exerted a major selective pressure on mitochondrial genes in central Asian populations. PMID:9837835
Herrnstadt, Corinna; Elson, Joanna L; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M; Anderson, Christen; Ghosh, Soumitra S; Olefsky, Jerrold M; Beal, M Flint; Davis, Robert E; Howell, Neil
2002-05-01
The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here.
Urasaki, Naoya; Goeku, Satoko; Kaneshima, Risa; Takamine, Tomonori; Tarora, Kazuhiko; Takeuchi, Makoto; Moromizato, Chie; Yonamine, Kaname; Hosaka, Fumiko; Terakami, Shingo; Matsumura, Hideo; Yamamoto, Toshiya; Shoda, Moriyuki
2015-01-01
To explore genome-wide DNA polymorphisms and identify DNA markers for leaf margin phenotypes, a restriction-site-associated DNA sequencing analysis was employed to analyze three bulked DNAs of F1 progeny from a cross between a ‘piping-leaf-type’ cultivar, ‘Yugafu’, and a ‘spiny-tip-leaf-type’ variety, ‘Yonekura’. The parents were both Ananas comosus var. comosus. From the analysis, piping-leaf and spiny-tip-leaf gene-specific restriction-site-associated DNA sequencing tags were obtained and designated as PLSTs and STLSTs, respectively. The five PLSTs and two STLSTs were successfully converted to cleaved amplified polymorphic sequence (CAPS) or simple sequence repeat (SSR) markers using the sequence differences between alleles. Based on the genotyping of the F1 with two SSR and three CAPS markers, the five PLST markers were mapped in the vicinity of the P locus, with the closest marker, PLST1_SSR, being located 1.5 cM from the P locus. The two CAPS markers from STLST1 and STLST3 perfectly assessed the ‘spiny-leaf type’ as homozygotes of the recessive s allele of the S gene. The recombination value between the S locus and STLST loci was 2.4, and STLSTs were located 2.2 cM from the S locus. SSR and CAPS markers are applicable to marker-assisted selection of leaf margin phenotypes in pineapple breeding. PMID:26175625
Urasaki, Naoya; Goeku, Satoko; Kaneshima, Risa; Takamine, Tomonori; Tarora, Kazuhiko; Takeuchi, Makoto; Moromizato, Chie; Yonamine, Kaname; Hosaka, Fumiko; Terakami, Shingo; Matsumura, Hideo; Yamamoto, Toshiya; Shoda, Moriyuki
2015-06-01
To explore genome-wide DNA polymorphisms and identify DNA markers for leaf margin phenotypes, a restriction-site-associated DNA sequencing analysis was employed to analyze three bulked DNAs of F1 progeny from a cross between a 'piping-leaf-type' cultivar, 'Yugafu', and a 'spiny-tip-leaf-type' variety, 'Yonekura'. The parents were both Ananas comosus var. comosus. From the analysis, piping-leaf and spiny-tip-leaf gene-specific restriction-site-associated DNA sequencing tags were obtained and designated as PLSTs and STLSTs, respectively. The five PLSTs and two STLSTs were successfully converted to cleaved amplified polymorphic sequence (CAPS) or simple sequence repeat (SSR) markers using the sequence differences between alleles. Based on the genotyping of the F1 with two SSR and three CAPS markers, the five PLST markers were mapped in the vicinity of the P locus, with the closest marker, PLST1_SSR, being located 1.5 cM from the P locus. The two CAPS markers from STLST1 and STLST3 perfectly assessed the 'spiny-leaf type' as homozygotes of the recessive s allele of the S gene. The recombination value between the S locus and STLST loci was 2.4, and STLSTs were located 2.2 cM from the S locus. SSR and CAPS markers are applicable to marker-assisted selection of leaf margin phenotypes in pineapple breeding.
Pallavi, Tokala; Chandra, Rampalli Viswa; Reddy, Aileni Amarender; Reddy, Bavigadda Harish; Naveen, Anumala
2016-01-01
Context: The inflammatory processes involved in chronic periodontitis and coronary artery diseases (CADs) are similar and produce reactive oxygen species that may result in similar somatic mutations in mitochondrial deoxyribonucleic acid (mtDNA). Aims: The aims of the present study were to identify somatic mtDNA mutations in periodontal and cardiac tissues from subjects undergoing coronary artery bypass surgery and determine what fraction was identical and unique to these tissues. Settings and Design: The study population consisted of 30 chronic periodontitis subjects who underwent coronary artery surgery after an angiogram had indicated CAD. Materials and Methods: Gingival tissue samples were taken from the site with deepest probing depth; coronary artery tissue samples were taken during the coronary artery bypass grafting procedures, and blood samples were drawn during this surgical procedure. These samples were stored under aseptic conditions and later transported for mtDNA analysis. Statistical Analysis Used: Complete mtDNA sequences were obtained and aligned with the revised Cambridge reference sequence (NC_012920) using sequence analysis and auto assembler tools. Results: Among the complete mtDNA sequences, a total of 162 variations were spread across the whole mitochondrial genome and present only in the coronary artery and the gingival tissue samples but not in the blood samples. Among the 162 variations, 12 were novel and four of the 12 novel variations were found in mitochondrial NADH dehydrogenase subunit 5 complex I gene (33.3%). Conclusions: Analysis of mtDNA mutations indicated 162 variants unique to periodontitis and CAD. Of these, 12 were novel and may have resulted from destructive oxidative forces common to these two diseases. PMID:27041832
Species-specific identification of commercial probiotic strains.
Yeung, P S M; Sanders, M E; Kitts, C L; Cano, R; Tong, P S
2002-05-01
Products containing probiotic bacteria are gaining popularity, increasing the importance of their accurate speciation. Unfortunately, studies have suggested that improper labeling of probiotic species is common in commercial products. Species identification of a bank of commercial probiotic strains was attempted using partial 16S rDNA sequencing, carbohydrate fermentation analysis, and cellular fatty acid methyl ester analysis. Results from partial 16S rDNA sequencing indicated discrepancies between species designations for 26 out of 58 strains tested, including two ATCC Lactobacillus strains. When considering only the commercial strains obtained directly from the manufacturers, 14 of 29 strains carried species designations different from those obtained by partial 16S rDNA sequencing. Strains from six commercial products were species not listed on the label. The discrepancies mainly occurred in Lactobacillus acidophilus and Lactobacillus casei groups. Carbohydrate fermentation analysis was not sensitive enough to identify species within the L. acidophilus group. Fatty acid methyl ester analysis was found to be variable and inaccurate and is not recommended to identify probiotic lactobacilli.
Genetic characterization of the Bifidobacterium breve UCC 2003 hrcA locus.
Ventura, Marco; Canchaya, Carlos; Bernini, Valentina; Del Casale, Antonio; Dellaglio, Franco; Neviani, Erasmo; Fitzgerald, Gerald F; van Sinderen, Douwe
2005-12-01
The bacterial heat shock response is characterized by the elevated expression of a number of chaperone complexes and transcriptional regulators, including the DnaJ and the HrcA proteins. Genome analysis of Bifidobacterium breve UCC 2003 revealed a second copy of a dnaJ gene, named dnaJ2, which is flanked by the hrcA gene in a genetic constellation that appears to be unique to the actinobacteria. Phylogenetic analysis using 53 bacterial dnaJ sequences, including both dnaJ1 and dnaJ2 sequences, suggests that these genes have followed a different evolutionary development. Furthermore, the B. breve UCC 2003 dnaJ2 gene seems to be regulated in a manner that is different from that of the previously characterized dnaJ1 gene. The dnaJ2 gene, which was shown to be part of a 2.3-kb bicistronic operon with hrcA, was induced by osmotic shock but not significantly by heat stress. This induction pattern is unlike those of other characterized dnaJ genes and may be indicative of a unique stress adaptation strategy by this commensal microorganism.
Genetic Characterization of the Bifidobacterium breve UCC 2003 hrcA Locus
Ventura, Marco; Canchaya, Carlos; Bernini, Valentina; Del Casale, Antonio; Dellaglio, Franco; Neviani, Erasmo; Fitzgerald, Gerald F.; van Sinderen, Douwe
2005-01-01
The bacterial heat shock response is characterized by the elevated expression of a number of chaperone complexes and transcriptional regulators, including the DnaJ and the HrcA proteins. Genome analysis of Bifidobacterium breve UCC 2003 revealed a second copy of a dnaJ gene, named dnaJ2, which is flanked by the hrcA gene in a genetic constellation that appears to be unique to the actinobacteria. Phylogenetic analysis using 53 bacterial dnaJ sequences, including both dnaJ1 and dnaJ2 sequences, suggests that these genes have followed a different evolutionary development. Furthermore, the B. breve UCC 2003 dnaJ2 gene seems to be regulated in a manner that is different from that of the previously characterized dnaJ1 gene. The dnaJ2 gene, which was shown to be part of a 2.3-kb bicistronic operon with hrcA, was induced by osmotic shock but not significantly by heat stress. This induction pattern is unlike those of other characterized dnaJ genes and may be indicative of a unique stress adaptation strategy by this commensal microorganism. PMID:16332909
Functional specificity of a Hox protein mediated by the recognition of minor groove structure.
Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S
2007-11-02
The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.
Repair of DNA damage caused by cytosine deamination in mitochondrial DNA of forensic case samples.
Gorden, Erin M; Sturk-Andreaggi, Kimberly; Marshall, Charla
2018-05-01
DNA sequence damage from cytosine deamination is well documented in degraded samples, such as those from ancient and forensic contexts. This study examined the effect of a DNA repair treatment on mitochondrial DNA (mtDNA) from aged and degraded skeletal samples. DNA extracts from 21 non-probative, degraded skeletal samples (aged 50-70 years) were utilized for the analysis. A portion of each sample extract was subjected to DNA repair using a commercial repair kit, the New England BioLabs' NEBNext FFPE DNA Repair Kit (Ipswich, MA). MtDNA was enriched using PCR and targeted capture in a side-by-side experiment of untreated and repaired DNA. Sequencing was performed using both traditional (Sanger-type; STS) and next-generation sequencing (NGS) methods. Although cytosine deamination was evident in the mtDNA sequence data, the observed level of damaged bases varied by sequencing method as well as by enrichment type. The STS PCR amplicon data did not show evidence of cytosine deamination that could be distinguished from background signal in either the untreated or repaired sample set. However, the same PCR amplicons showed 850 C → T/G → A substitutions consistent with cytosine deamination with variant frequencies (VFs) of up to 25% when sequenced using NGS methods. The occurrence of base misincorporation due to cytosine deamination was reduced by 98% (to 10) in the NGS amplicon data after repair. The NGS capture data indicated low levels (1-2%) of cytosine deamination in mtDNA fragments that was effectively mitigated by DNA repair. The observed difference in the level of cytosine deamination between the PCR and capture enrichment methods can be attributed to the greater propensity for stochastic effects from the PCR enrichment technique employed (e.g., low template input, increased PCR cycles). Altogether these results indicate that DNA repair may be required when sequencing PCR-amplified DNA from degraded forensic case samples with NGS methods.
Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Mechanism of chimera formation during the Multiple Displacement Amplification reaction.
Lasken, Roger S; Stockwell, Timothy B
2007-04-12
Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them. Here, we characterize the major types of chimeras formed by carrying out an MDA whole genome amplification from a single E. coli cell and sequencing by the 454 Life Sciences method. Analysis of 475 chimeras revealed the predominant reaction mechanisms that create the DNA rearrangements. The highly branched DNA synthesized in MDA can assume many alternative secondary structures. DNA strands extended on an initial template can be displaced, becoming available to prime on a second template and creating the chimeras. Evidence supports a model in which branch migration can displace 3'-ends, freeing them to prime on the new templates. More than 85% of the resulting DNA rearrangements were inverted sequences with intervening deletions that the model predicts. Intramolecular rearrangements were favored, with displaced 3'-ends reannealing to single-stranded 5'-strands contained within the same branched DNA molecule. In over 70% of the chimeric junctions, the 3' termini had initiated priming at complementary sequences of 2-21 nucleotides (nts) in the new templates. Formation of chimeras is an important limitation to the MDA method, particularly for whole genome sequencing. Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras. The 454 sequencing approach used here will provide a rapid method to assess the utility of reaction modifications.
Mechanism of chimera formation during the Multiple Displacement Amplification reaction
Lasken, Roger S; Stockwell, Timothy B
2007-01-01
Background: Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them. Results: Here, we characterize the major types of chimeras formed by carrying out an MDA whole genome amplification from a single E. coli cell and sequencing by the 454 Life Sciences method. Analysis of 475 chimeras revealed the predominant reaction mechanisms that create the DNA rearrangements. The highly branched DNA synthesized in MDA can assume many alternative secondary structures. DNA strands extended on an initial template can be displaced, becoming available to prime on a second template and creating the chimeras. Evidence supports a model in which branch migration can displace 3'-ends, freeing them to prime on the new templates. More than 85% of the resulting DNA rearrangements were inverted sequences with intervening deletions that the model predicts. Intramolecular rearrangements were favored, with displaced 3'-ends reannealing to single-stranded 5'-strands contained within the same branched DNA molecule. In over 70% of the chimeric junctions, the 3' termini had initiated priming at complementary sequences of 2–21 nucleotides (nts) in the new templates. Conclusion: Formation of chimeras is an important limitation to the MDA method, particularly for whole genome sequencing. Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras. The 454 sequencing approach used here will provide a rapid method to assess the utility of reaction modifications. PMID:17430586
Reddy, M Sreekanth; Kanakala, S; Srinivas, K P; Hema, M; Malathi, V G; Sreenivasulu, P
2014-05-01
The complete DNA A genome of a virus isolate associated with yellow mosaic disease of a medicinal plant, Hemidesmus indicus, from India was cloned and sequenced. The length of DNA A was 2825 nucleotides, 35 nucleotides longer than the unit genome of monopartite begomoviruses. Comparison of the nucleotide sequence of DNA A of the virus isolate with those of other begomoviruses showed maximum sequence identity of 69 % to DNA A of ageratum yellow vein China virus (AYVCNV; AJ558120) and 68 % with tomato yellow leaf curl virus- LBa4 (TYLCV; EF185318), and it formed a distinct clade in phylogenetic analysis. The genome organization of the present virus isolate was found to be similar to that of Old World monopartite begomoviruses. The genome was considered to be monopartite, because association of DNA B and β satellite DNA components was not detected. Based on its sequence identity (<70 %) to all other begomoviruses known to date and ICTV (International Committee on Taxonomy of Viruses) species demarcating criteria (<89 % identity), it is considered a member of a novel begomovirus species, and the tentative name "Hemidesmus yellow mosaic virus" (HeYMV) is proposed.
Yoshimitsu, Makoto; Higuchi, Koji; Miyata, Masaaki; Devine, Sean; Mattman, Andre; Sirrs, Sandra; Medin, Jeffrey A; Tei, Chuwa; Takenaka, Toshihiro
2011-05-01
Fabry disease is an X-linked lysosomal storage disorder caused by mutations of the α-galactosidase A (GLA) gene, and the disease is a relatively prevalent cause of left ventricular hypertrophy followed by conduction abnormalities and arrhythmias. Mutation analysis of the GLA gene is a valuable tool for accurate diagnosis of affected families. In this study, we carried out molecular studies of 10 unrelated families diagnosed with Fabry disease. Genetic analysis of the GLA gene using conventional genomic sequencing was performed in 9 hemizygous males and 6 heterozygous females. In patients with no mutations in coding DNA sequence, multiplex ligation-dependent probe amplification (MLPA) and/or cDNA sequencing were performed. We identified a novel exon 2 deletion (IVS1_IVS2) in a heterozygous female by MLPA, which was undetectable by conventional sequencing methods. In addition, the g.9331G>A mutation that has previously been found only in patients with cardiac Fabry disease was found in 3 unrelated, newly-diagnosed, cardiac Fabry patients by sequencing GLA genomic DNA and cDNA. Two other novel mutations, g.8319A>G and 832delA were also found in addition to 4 previously reported mutations (R112C, C142Y, M296I, and G373D) in 6 other families. We could identify GLA gene mutations in all hemizygotes and heterozygotes from 10 families with Fabry disease. Mutations in 4 out of 10 families could not be identified by classical genomic analysis, which focuses on exons and the flanking region. Instead, these data suggest that MLPA analysis and cDNA sequence should be considered in genetic testing surveys of patients with Fabry disease. Copyright © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas
2013-01-01
Background and Aims: The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. Methods: A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100–500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Key Results: Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S–5·8S–25S rRNA genes, which enable unequivocal chromosome discrimination in both jute species. Conclusions: The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species. PMID:23666888
The role of DNA repair in herpesvirus pathogenesis.
Brown, Jay C
2014-10-01
In cells latently infected with a herpesvirus, the viral DNA is present in the cell nucleus, but it is not extensively replicated or transcribed. In this suppressed state the virus DNA is vulnerable to mutagenic events that affect the host cell and have the potential to destroy the virus' genetic integrity. Despite the potential for genetic damage, however, herpesvirus sequences are well conserved after reactivation from latency. To account for this apparent paradox, I have tested the idea that host cell-encoded mechanisms of DNA repair are able to control genetic damage to latent herpesviruses. Studies were focused on homologous recombination-dependent DNA repair (HR). Methods of DNA sequence analysis were employed to scan herpesvirus genomes for DNA features able to activate HR. Analyses were carried out with a total of 39 herpesvirus DNA sequences, a group that included viruses from the alpha-, beta- and gamma-subfamilies. The results showed that all 39 genome sequences were enriched in two or more of the eight recombination-initiating features examined. The results were interpreted to indicate that HR can stabilize latent herpesvirus genomes. The results also showed, unexpectedly, that repair-initiating DNA features differed in alpha- compared to gamma-herpesviruses. Whereas inverted and tandem repeats predominated in alpha-herpesviruses, gamma-herpesviruses were enriched in short, GC-rich initiation sequences such as CCCAG and depleted in repeats. In alpha-herpesviruses, repair-initiating repeat sequences were found to be concentrated in a specific region (the S segment) of the genome while repair-initiating short sequences were distributed more uniformly in gamma-herpesviruses. The results suggest that repair pathways are activated differently in alpha- compared to gamma-herpesviruses. Copyright © 2014. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hadano, S.; Ishida, Y.; Tomiyasu, H.
1994-09-01
To complete a transcription map of the 1 Mb region in human chromosome 4p16.3 containing the Huntington disease (HD) gene, the isolation of cDNA clones is being performed throughout the region. Our method relies on direct screening of cDNA libraries probed with single-copy microclones from 3 YAC clones spanning 1 Mbp of the HD gene region. YAC DNAs were isolated by preparative pulsed-field gel electrophoresis, amplified by both single unique primer (SUP)-PCR and linker ligation PCR, and 6 microclone-DNA libraries were generated. Then, 8,640 microclones from these libraries were independently amplified by PCR and arrayed onto membranes. The 800-900 microclones that did not cross-hybridize with total human and yeast genomic DNA, YAC vector DNA, and ribosomal cDNA in a dot hybridization (putatively carrying single-copy sequences) were pooled to make 9 probe pools. A total of approximately 1.8 × 10^7 plaques from human brain cDNA libraries was screened with the 9 probe pools, and 672 positive cDNA clones were obtained. So far, 597 cDNA clones have been defined and arrayed onto a map of the 1 Mbp HD gene region by hybridization with HD region-specific cosmid contigs and YAC clones. Further characterization, including DNA sequencing and Northern blot analysis, is currently underway.
Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan
2016-05-01
In order to assess the DNA sequence variation and phylogenetic relationship among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) out of all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. The estimate of intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two-parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent species studied. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species.
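The K2P distances quoted in the abstract above follow Kimura's two-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below is illustrative only: the function name and the toy sequences are assumptions, not the authors' pipeline, and it presumes gap-free comparison sites in a pre-aligned pair.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance for two pre-aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    math.log raises ValueError if the sequences are too divergent for the
    formula to be defined (log argument <= 0).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not a comparable site
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
toy_distance = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

With one transition in ten sites, P = 0.1 and Q = 0, so the distance is -0.5 ln(0.8), slightly larger than the raw proportion of differences because the model corrects for multiple hits.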
Yan, H. H.; Liu, G. Q.; Cheng, Z. K.; Li, X. B.; Liu, G. Z.; Min, S. K.; Zhu, L.H.
2002-02-01
In the course of transferring the brown planthopper resistance from a diploid, CC-genome wild rice species, Oryza eichingeri (IRGC acc. 105159 and 105163), to the cultivated rice variety 02428, we have isolated many alien addition and introgression lines. The O. eichingeri chromatin in some of these lines has previously been identified using genomic in situ hybridization and molecular-marker analysis. Here we cloned a tandemly repetitive DNA sequence from O. eichingeri IRGC acc. 105163, and detected it in 25 introgression lines. This repetitive DNA sequence showed high specificity to the rice CC genome, but was absent from all the four tetraploid species with BBCC or CCDD genomes. The monomer in this repetitive DNA sequence is 325-366 bp long, with a copy number of about 5,000 per 1 C of the O. eichingeri genome, showing 88% homology to a repetitive DNA sequence isolated from Oryza officinalis (2n = 2x = 24, CC). Fluorescent in situ hybridization revealed 11 signals distributed over eight O. eichingeri chromosomes, mostly in terminal or subterminal regions.
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets
2013-01-01
Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
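Two of the constraints listed above, degenerate sites and melting-temperature matching, can be illustrated with a short sketch. This is not the PrimerDesign program: the function names are hypothetical, the degenerate-base table is the standard IUPAC nucleotide code, and the melting temperature uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C) °C, which is only a rough estimate for short oligonucleotides.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def wallace_tm(primer: str) -> int:
    """Wallace-rule melting temperature (deg C) of a non-degenerate primer."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def tm_range(primer: str) -> tuple[int, int]:
    """Lowest and highest Wallace Tm across all variants of a degenerate primer."""
    temps = [wallace_tm(variant) for variant in expand_degenerate(primer)]
    return min(temps), max(temps)


# Hypothetical degenerate primer: R = A/G, Y = C/T, so four variants.
variants = expand_degenerate("ACGTRY")
```

A wide `tm_range` for one primer, or a large gap between the ranges of a forward and reverse primer, is the kind of mismatch a design tool would try to minimize.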
Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.
Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron
2012-02-01
Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.
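The depth thresholds reported above (40× for rDNA, 30× for plastomes) relate to read counts through the usual mean-coverage relation, depth = number of reads × read length / target length. The sketch below assumes that every counted read maps to the target; in a real genome-skimming run only a fraction of the total reads derive from the plastome or rDNA, so the total sequencing effort is larger. The target and read lengths in the example are hypothetical, not values from the study.

```python
import math


def mean_depth(n_reads: int, read_length: int, target_length: int) -> float:
    """Mean sequencing depth over a target, given reads that map to it."""
    return n_reads * read_length / target_length


def reads_needed(depth: float, read_length: int, target_length: int) -> int:
    """Smallest number of on-target reads giving at least the requested depth."""
    return math.ceil(depth * target_length / read_length)


# Hypothetical example: 30x over a 150 kb target with 100 bp reads.
plastome_reads = reads_needed(30, 100, 150_000)
```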
Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun
1997-01-01
ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063
Wysoczynski, Christina L.; Roemer, Sarah C.; Dostal, Vishantie; Barkley, Robert M.; Churchill, Mair E. A.; Malarkey, Christopher S.
2013-01-01
Obtaining quantities of highly pure duplex DNA is a bottleneck in the biophysical analysis of protein–DNA complexes. In traditional DNA purification methods, the individual cognate DNA strands are purified separately before annealing to form DNA duplexes. This approach works well for palindromic sequences, in which top and bottom strands are identical and duplex formation is typically complete. However, in cases where the DNA is non-palindromic, excess of single-stranded DNA must be removed through additional purification steps to prevent it from interfering in further experiments. Here we describe and apply a novel reversed-phase ion-pair liquid chromatography purification method for double-stranded DNA ranging in lengths from 17 to 51 bp. Both palindromic and non-palindromic DNA can be readily purified. This method has the unique ability to separate blunt double-stranded DNA from pre-attenuated (n-1, n-2, etc) synthesis products, and from DNA duplexes with single base pair overhangs. Additionally, palindromic DNA sequences with only minor differences in the central spacer sequence of the DNA can be separated, and the purified DNA is suitable for co-crystallization of protein–DNA complexes. Thus, double-stranded ion-pair liquid chromatography is a useful approach for duplex DNA purification for many applications. PMID:24013567
Zhang, Wanying; Wang, Tao; Huang, Shuaiwu; Zhao, Xiuli
2018-04-10
To detect mutations of the HPGD gene in three pedigrees affected with primary hypertrophic osteoarthropathy (PHO) by DNA sequencing and high-resolution melting (HRM) analysis. Genomic DNA was extracted from peripheral blood samples collected from the pedigrees. PCR and direct sequencing were carried out to identify potential mutations of the HPGD gene. Amplicons containing the mutation site were generated by nested PCR. The products were then subjected to HRM analysis using the HR-1 instrument. Direct sequencing was carried out in family members and healthy individuals to confirm the results of HRM analysis. A homozygous c.310_311delCT mutation was detected in 2 affected probands, while a heterozygous c.310_311delCT mutation was detected in the third proband. HRM analysis of the fragments encompassing HPGD exon 3 showed 3 curve patterns representing three different genotypes, i.e., the wild type, the c.310_311delCT homozygote, and the c.310_311delCT heterozygote. The results of DNA sequencing were consistent with those of the HRM analysis and the phenotypes of the subjects. The c.310_311delCT mutation may be the most prevalent mutation in the Chinese population. HRM analysis provides an optimized method for genetic testing of HPGD mutations owing to its simplicity, rapid turnaround and high sensitivity.
Yang, Xian-Xian; Zhang, Mei; Yan, Zhao-Wen; Zhang, Ru-Hong; Mu, Xiong-Zheng
2008-01-01
To construct a highly effective eukaryotic expression plasmid, PcDNA3.1-MSX-2, encoding the Sprague-Dawley (SD) rat MSX-2 gene for further study of MSX-2 gene function. The full-length SD rat MSX-2 gene was amplified by PCR, and the full-length DNA was inserted into the pMD18-T vector. It was isolated by restriction enzyme digestion with BamHI and XhoI, then ligated into the cloning site of the PcDNA3.1 expression plasmid. The positive recombinant was identified by PCR analysis, restriction endonuclease analysis and sequence analysis. Expression of RNA and protein was detected by RT-PCR and Western blot analysis in PcDNA3.1-MSX-2-transfected HEK293 cells. Sequence analysis and restriction endonuclease analysis of PcDNA3.1-MSX-2 demonstrated that the position and size of the MSX-2 cDNA insertion were consistent with the design. RT-PCR and Western blot analysis showed specific expression of MSX-2 mRNA and protein in the transfected HEK293 cells. The highly effective eukaryotic expression plasmid PcDNA3.1-MSX-2, encoding the Sprague-Dawley rat MSX-2 gene, which is related to craniofacial development, can be successfully constructed. It may serve as the basis for further study of MSX-2 gene function.
Peters, R; King, C Y; Ukiyama, E; Falsafi, S; Donahoe, P K; Weiss, M A
1995-04-11
SRY, a genetic "master switch" for male development in mammals, exhibits two biochemical activities: sequence-specific recognition of duplex DNA and sequence-independent binding to the sharp angles of four-way DNA junctions. Here, we distinguish between these activities by analysis of a mutant SRY associated with human sex reversal (46, XY female with pure gonadal dysgenesis). The substitution (I68T in human SRY) alters a nonpolar side chain in the minor-groove DNA recognition alpha-helix of the HMG box [Haqq, C.M., King, C.-Y., Ukiyama, E., Haqq, T.N., Falsafi, S., Donahoe, P.K., & Weiss, M.A. (1994) Science 266, 1494-1500]. The native (but not mutant) side chain inserts between specific base pairs in duplex DNA, interrupting base stacking at a site of induced DNA bending. Isotope-aided 1H-NMR spectroscopy demonstrates that analogous side-chain insertion occurs on binding of SRY to a four-way junction, establishing a shared mechanism of sequence- and structure-specific DNA binding. Although the mutant DNA-binding domain exhibits > 50-fold reduction in sequence-specific DNA recognition, near wild-type affinity for four-way junctions is retained. Our results (i) identify a shared SRY-DNA contact at a site of either induced or intrinsic DNA bending, (ii) demonstrate that this contact is not required to bind an intrinsically bent DNA target, and (iii) rationalize patterns of sequence conservation or diversity among HMG boxes. Clinical association of the I68T mutation with human sex reversal supports the hypothesis that specific DNA recognition by SRY is required for male sex determination.
Molecular cloning of a gene encoding translation initiation factor (TIF) from Candida albicans.
Mirbod, F; Nakashima, S; Kitajima, Y; Ghannoum, M A; Cannon, R D; Nozawa, Y
1996-01-01
The differential display technique was applied to compare mRNAs from two clinical isolates of Candida albicans with different virulence; high (potent strain, 16240) and low (weak strain, 18084) extracellular phospholipase activities. Complementary DNA fragments corresponding to several apparently differentially expressed mRNAs were recovered and sequenced. A complementary DNA fragment seen distinctly in the potent phospholipase producing strain was highly homologous to the yeast translation initiation factor (TIF). The selected DNA fragment was then used as a probe to isolate its corresponding complementary DNA clone from a library of C. albicans genomic DNA. The sequence of isolated gene revealed an open reading frame of 1194 nucleotides with the potential to encode a protein of 397 amino acids with a predicted molecular weight of 43 kDa. Over its entire length, the amino acid sequence showed strong homology (78-89%) to Saccharomyces cerevisiae TIF and (63-80%) to mouse eIF-4A proteins. Therefore, our C. albicans gene was identified to be TIF (Ca TIF). Northern blot analysis in the two strains of C. albicans revealed that Ca TIF expression is 1.5-fold higher in the potent phospholipase producing strain. The restriction endonuclease digestion of genomic DNA from this potent strain revealed at least two hybridized bands in Southern blot analysis, suggesting two or more closely related sequences in the C. albicans genome.
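The figures in this abstract are mutually consistent, as a quick calculation shows: a 1194-nucleotide open reading frame is 398 codons, and if the count includes the stop codon (an assumption, since the abstract does not say), 397 codons remain for amino acids. Multiplying by an average residue mass of roughly 110 Da, a common rule of thumb rather than a value from the paper, gives about 43.7 kDa, close to the reported 43 kDa. The function names below are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average amino acid residue mass


def orf_protein_length(orf_nucleotides: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nucleotides // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate in kDa from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


residues = orf_protein_length(1194)   # ORF length from the abstract
mass_kda = approx_mass_kda(residues)  # rough estimate only
```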
Sie, Daoud; Snijders, Peter J F; Meijer, Gerrit A; Doeleman, Marije W; van Moorsel, Marinda I H; van Essen, Hendrik F; Eijk, Paul P; Grünberg, Katrien; van Grieken, Nicole C T; Thunnissen, Erik; Verheul, Henk M; Smit, Egbert F; Ylstra, Bauke; Heideman, Daniëlle A M
2014-10-01
Next generation DNA sequencing (NGS) holds promise for diagnostic applications, yet implementation in routine molecular pathology practice requires performance evaluation on DNA derived from routine formalin-fixed paraffin-embedded (FFPE) tissue specimens. The current study presents a comprehensive analysis of TruSeq Amplicon Cancer Panel-based NGS using a MiSeq Personal sequencer (TSACP-MiSeq-NGS) for somatic mutation profiling. TSACP-MiSeq-NGS (testing 212 hotspot mutation amplicons of 48 genes) and a data analysis pipeline were evaluated in a retrospective learning/test set approach (n = 58/n = 45 FFPE-tumor DNA samples) against 'gold standard' high-resolution-melting (HRM)-sequencing for the genes KRAS, EGFR, BRAF and PIK3CA. Next, the performance of the validated test algorithm was assessed in an independent, prospective cohort of FFPE-tumor DNA samples (n = 75). In the learning set, a number of minimum parameter settings was defined to decide whether a FFPE-DNA sample is qualified for TSACP-MiSeq-NGS and for calling mutations. The resulting test algorithm revealed 82% (37/45) compliance to the quality criteria and 95% (35/37) concordant assay findings for KRAS, EGFR, BRAF and PIK3CA with HRM-sequencing (kappa = 0.92; 95% CI = 0.81-1.03) in the test set. Subsequent application of the validated test algorithm to the prospective cohort yielded a success rate of 84% (63/75), and a high concordance with HRM-sequencing (95% (60/63); kappa = 0.92; 95% CI = 0.84-1.01). TSACP-MiSeq-NGS detected 77 mutations in 29 additional genes. TSACP-MiSeq-NGS is suitable for diagnostic gene mutation profiling in oncopathology.
Triplex-mediated analysis of cytosine methylation at CpA sites in DNA.
Johannsen, Marie W; Gerrard, Simon R; Melvin, Tracy; Brown, Tom
2014-01-18
Modified triplex-forming oligonucleotides distinguish 5-methyl cytosine from unmethylated cytosine in DNA duplexes by differences in triplex melting temperatures. The discrimination is sequence-specific; dramatic differences in stabilisation are seen for CpA methylation, whereas CpG methylation is not detected. This direct detection of DNA methylation constitutes a new approach for epigenetic analysis.
The bacterial composition of chlorinated drinking water was analyzed using 16S rRNA gene clone libraries derived from DNA extracts of 12 samples and compared to clone libraries previously generated using RNA extracts from the same samples. Phylogenetic analysis of 761 DNA-based ...
Ouwerkerk, D; Klieve, A V; Forster, R J; Templeton, J M; Maguire, A J
2005-01-01
To determine the culturable biodiversity of anaerobic bacteria isolated from the forestomach contents of an eastern grey kangaroo, Macropus giganteus, using phenotypic characterization and 16S rDNA sequence analysis. Bacteria from forestomach contents of an eastern grey kangaroo were isolated using anaerobic media containing milled curly Mitchell grass (Astrebla lappacea). DNA was extracted and the 16S rDNA sequenced for phylogenetic analysis. Forty bacterial isolates were obtained and placed in 17 groups based on phenotypic characteristics and restriction enzyme digestion of 16S rDNA PCR products. DNA sequencing revealed that the 17 groups comprised five known species (Clostridium butyricum, Streptococcus bovis, Clostridium sporogenes, Clostridium paraputrificum and Enterococcus avium) and 12 groups apparently representing new species, all within the phylum Firmicutes. Foregut contents from Australian macropod marsupials contain a microbial ecosystem with a novel bacterial biodiversity comprising a high percentage of previously unrecognized species. This study adds to knowledge of Australia's unique biodiversity, which may provide a future bioresource of genetic information and bacterial species of benefit to agriculture.
In Vivo Control of CpG and Non-CpG DNA Methylation by DNA Methyltransferases
Arand, Julia; Spieler, David; Karius, Tommy; Branco, Miguel R.; Meilinger, Daniela; Meissner, Alexander; Jenuwein, Thomas; Xu, Guoliang; Leonhardt, Heinrich; Wolf, Verena; Walter, Jörn
2012-01-01
The enzymatic control of the setting and maintenance of symmetric and non-symmetric DNA methylation patterns in a particular genome context is not well understood. Here, we describe a comprehensive analysis of DNA methylation patterns generated by high resolution sequencing of hairpin-bisulfite amplicons of selected single copy genes and repetitive elements (LINE1, B1, IAP-LTR-retrotransposons, and major satellites). The analysis unambiguously identifies a substantial amount of regional incomplete methylation maintenance, i.e. hemimethylated CpG positions, with variant degrees among cell types. Moreover, non-CpG cytosine methylation is confined to ESCs and exclusively catalysed by Dnmt3a and Dnmt3b. This sequence position–, cell type–, and region-dependent non-CpG methylation is strongly linked to neighboring CpG methylation and requires the presence of Dnmt3L. The generation of a comprehensive data set of 146,000 CpG dyads was used to apply and develop parameter estimated hidden Markov models (HMM) to calculate the relative contribution of DNA methyltransferases (Dnmts) for de novo and maintenance DNA methylation. The comparative modelling included wild-type ESCs and mutant ESCs deficient for Dnmt1, Dnmt3a, Dnmt3b, or Dnmt3a/3b, respectively. The HMM analysis identifies a considerable de novo methylation activity for Dnmt1 at certain repetitive elements and single copy sequences. Dnmt3a and Dnmt3b contribute de novo function. However, both enzymes are also essential to maintain symmetrical CpG methylation at distinct repetitive and single copy sequences in ESCs. PMID:22761581
DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.
Goris, Johan; Konstantinidis, Konstantinos T; Klappenbach, Joel A; Coenye, Tom; Vandamme, Peter; Tiedje, James M
2007-01-01
DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
Enterococcus xinjiangensis sp. nov., Isolated from Yogurt of Xinjiang, China.
Ren, Xiaopu; Li, Mingyang; Guo, Dongqi
2016-09-01
A Gram-stain-positive bacterial strain, 48(T), was isolated from traditional yogurt in Xinjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, RNA polymerase α subunit (rpoA) gene sequence analysis, determination of DNA G+C content, DNA-DNA hybridization with the type strain of Enterococcus ratti and analysis of phenotypic features. Strain 48(T) showed 16S rRNA gene sequence similarities of 96.1, 95.8, 95.8, and 95.7 % with Enterococcus faecium CGMCC 1.2136(T), Enterococcus hirae ATCC 9790(T), Enterococcus durans CECT 411(T), and E. ratti ATCC 700914(T), respectively. The sequence of the rpoA gene showed similarities of 99.0, 96.0, 96.0, and 96 % with those of E. faecium ATCC 19434(T), Enterococcus villorum LMG12287, E. hirae ATCC 9790(T), and E. durans ATCC 19432(T), respectively. Based upon the polyphasic characterization data obtained in the study, a novel species, Enterococcus xinjiangensis sp. nov., was proposed and the type strain was 48(T) (=CCTCC AB 2014041(T) = JCM 30200(T)).
Kim, Tae Hoon; Dekker, Job
2018-05-01
Owing to its digital nature, ChIP-seq has become the standard method for genome-wide ChIP analysis. Using next-generation sequencing platforms (notably the Illumina Genome Analyzer), millions of short sequence reads can be obtained. The densities of recovered ChIP sequence reads along the genome are used to determine the binding sites of the protein. Although a relatively small amount of ChIP DNA is required for ChIP-seq, the current sequencing platforms still require amplification of the ChIP DNA by ligation-mediated PCR (LM-PCR). This protocol, which involves linker ligation followed by size selection, is the standard ChIP-seq protocol using an Illumina Genome Analyzer. The size-selected ChIP DNA is amplified by LM-PCR and size-selected for the second time. The purified ChIP DNA is then loaded into the Genome Analyzer. The ChIP DNA can also be processed in parallel for ChIP-chip results. © 2018 Cold Spring Harbor Laboratory Press.
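The idea that read density along the genome marks binding sites can be sketched as fixed-width binning of read start positions followed by a simple count threshold. This is a deliberately minimal illustration: the function names, bin size, threshold and coordinates are all hypothetical, and real peak callers additionally model background, strand shift and control samples.

```python
from collections import Counter


def bin_read_counts(read_starts, bin_size):
    """Count reads per fixed-width bin, keyed by bin index."""
    return Counter(start // bin_size for start in read_starts)


def enriched_bins(read_starts, bin_size, min_reads):
    """Return start coordinates of bins holding at least min_reads reads."""
    counts = bin_read_counts(read_starts, bin_size)
    return sorted(index * bin_size
                  for index, n in counts.items() if n >= min_reads)


# Hypothetical read start positions on one chromosome.
starts = [5, 12, 18, 19, 105, 230, 231, 232, 238]
candidate_sites = enriched_bins(starts, bin_size=100, min_reads=3)
```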
Recognition of the DNA sequence by an inorganic crystal surface
Sampaolese, Beatrice; Bergia, Anna; Scipioni, Anita; Zuccheri, Giampaolo; Savino, Maria; Samorì, Bruno; De Santis, Pasquale
2002-01-01
The sequence-dependent curvature is generally recognized as an important and biologically relevant property of DNA because it is involved in the formation and stability of association complexes with proteins. When a DNA tract, intrinsically curved for the periodical recurrence on the same strand of A-tracts phased with the B-DNA periodicity, is deposited on a flat surface, it exposes to that surface either a T- or an A-rich face. The surface of a freshly cleaved mica crystal recognizes those two faces and preferentially interacts with the former one. Statistical analysis of scanning force microscopy (SFM) images provides evidence of this recognition between an inorganic crystal surface and nanoscale structures of double-stranded DNA. This finding could open the way toward the use of the sequence-dependent adhesion to specific crystal faces for nanotechnological purposes. PMID:12361979
Molecular barcodes detect redundancy and contamination in hairpin-bisulfite PCR
Miner, Brooks E.; Stöger, Reinhard J.; Burden, Alice F.; Laird, Charles D.; Hansen, R. Scott
2004-01-01
PCR amplification of limited amounts of DNA template carries an increased risk of product redundancy and contamination. We use molecular barcoding to label each genomic DNA template with an individual sequence tag prior to PCR amplification. In addition, we include molecular ‘batch-stamps’ that effectively label each genomic template with a sample ID and analysis date. This highly sensitive method identifies redundant and contaminant sequences and serves as a reliable method for positive identification of desired sequences; we can therefore capture accurately the genomic template diversity in the sample analyzed. Although our application described here involves the use of hairpin-bisulfite PCR for amplification of double-stranded DNA, the method can readily be adapted to single-strand PCR. Useful applications will include analyses of limited template DNA for biomedical, ancient DNA and forensic purposes. PMID:15459281
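The classification logic described above can be sketched as follows, assuming each sequenced clone has already been parsed into a read identifier, a batch-stamp and a barcode (the tuple layout and all names here are hypothetical). A read whose batch-stamp does not match the current experiment is treated as a contaminant, and a repeated barcode within the batch is treated as a redundant copy of a template already seen.

```python
def classify_reads(reads, expected_batch):
    """Split reads into unique, redundant and contaminant read IDs.

    reads: iterable of (read_id, batch_stamp, barcode) tuples.
    expected_batch: batch-stamp assigned to the current sample and date.
    """
    seen_barcodes = set()
    unique, redundant, contaminant = [], [], []
    for read_id, batch_stamp, barcode in reads:
        if batch_stamp != expected_batch:
            contaminant.append(read_id)   # wrong sample/date label
        elif barcode in seen_barcodes:
            redundant.append(read_id)     # same template amplified again
        else:
            seen_barcodes.add(barcode)
            unique.append(read_id)        # first sighting of this template
    return unique, redundant, contaminant


# Hypothetical parsed reads.
example_reads = [
    ("r1", "B7", "ACGTTGCA"),
    ("r2", "B7", "ACGTTGCA"),  # duplicate barcode: redundant
    ("r3", "B2", "GGATCCAA"),  # foreign batch-stamp: contaminant
    ("r4", "B7", "TTGACCGT"),
]
unique_ids, redundant_ids, contaminant_ids = classify_reads(example_reads, "B7")
```

Only the reads in `unique_ids` would then be counted toward the template diversity of the sample.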
Azospirillum zeae sp. nov., a diazotrophic bacterium isolated from rhizosphere soil of Zea mays.
Mehnaz, Samina; Weselowski, Brian; Lazarovits, George
2007-12-01
Two free-living nitrogen-fixing bacterial strains, N6 and N7(T), were isolated from corn rhizosphere. A polyphasic taxonomic approach, including morphological characterization, Biolog analysis, DNA-DNA hybridization, and 16S rRNA, cpn60 and nifH gene sequence analysis, was taken to analyse the two strains. 16S rRNA gene sequence analysis indicated that strains N6 and N7(T) both belonged to the genus Azospirillum and were closely related to Azospirillum oryzae (98.7 and 98.8 % similarity, respectively) and Azospirillum lipoferum (97.5 and 97.6 % similarity, respectively). DNA-DNA hybridization of strains N6 and N7(T) showed reassociation values of 48 and 37 %, respectively, with A. oryzae and 43 % with A. lipoferum. Sequences of the nifH and cpn60 genes of both strains showed 99 and approximately 95 % similarity, respectively, with those of A. oryzae. Chemotaxonomic characteristics (Q-10 as quinone system, 18 : 1omega7c as major fatty acid) and G+C content of the DNA (67.6 mol%) were also similar to those of members of the genus Azospirillum. Gene sequences and Biolog and fatty acid analysis showed that strains N6 and N7(T) differed from the closely related species A. lipoferum and A. oryzae. On the basis of these results, it is proposed that these nitrogen-fixing strains represent a novel species. The name Azospirillum zeae sp. nov. is suggested, with N7(T) (=NCCB 100147(T)=LMG 23989(T)) as the type strain.
A parallel and sensitive software tool for methylation analysis on multicore platforms.
Tárraga, Joaquín; Pérez, Mariano; Orduña, Juan M; Duato, José; Medina, Ignacio; Dopazo, Joaquín
2015-10-01
DNA methylation analysis suffers from very long processing time, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. We present a new software tool, called HPG-Methyl, which efficiently maps bisulphite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms state-of-the-art software such as Bismark, BS-Seeker or BSMAP in both execution time and sensitivity, particularly for long bisulphite reads. Availability: software in the form of C libraries and functions, together with instructions to compile and execute this software, is available by sftp to anonymous@clariano.uv.es (password 'anonymous'). Contact: juan.orduna@uv.es or jdopazo@cipf.es. © The Author 2015. Published by Oxford University Press. All rights reserved.
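Once a bisulphite read has been mapped, the methylation call itself rests on a simple rule: bisulphite treatment converts unmethylated cytosine so that it is read as T, while methylated cytosine is still read as C. The sketch below applies that rule to a read already aligned, without gaps, to the forward strand of a reference. It is not HPG-Methyl code, and the function name and sequences are hypothetical.

```python
def call_methylation(reference: str, read: str) -> dict[int, str]:
    """Call methylation at each reference cytosine covered by an aligned read.

    Assumes the read is aligned gap-free to the forward strand, starting at
    reference position 0. Returns {position: call}.
    """
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(),
                                                    read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[pos] = "methylated"    # protected from conversion
        elif read_base == "T":
            calls[pos] = "unmethylated"  # converted by bisulphite
        else:
            calls[pos] = "unknown"       # mismatch or sequencing error
    return calls


# Hypothetical reference and bisulphite read.
calls = call_methylation("ACGTCCGA", "ACGTTCGA")
```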
Phylogenetic analysis of Sicilian goats reveals a new mtDNA lineage.
Sardina, M T; Ballester, M; Marmi, J; Finocchiaro, R; van Kaam, J B C H M; Portolano, B; Folch, J M
2006-08-01
The mitochondrial hypervariable region 1 (HVR1) sequence of 67 goats belonging to the Girgentana, Maltese and Derivata di Siria breeds was partially sequenced in order to present the first phylogenetic characterization of Sicilian goat breeds. These sequences were compared with published sequences of Indian and Pakistani domestic goats and wild goats. Mitochondrial lineage A was observed in most of the Sicilian goats. However, three Girgentana haplotypes were highly divergent from the Capra hircus clade, indicating that a new mtDNA lineage in domestic goats was found.
Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.
Robinson, N J; Robinson, P J; Gupta, A; Bleasby, A J; Whitton, B A; Morby, A P
1995-03-11
An octameric palindrome (5'-GCGATCGC-3') is abundant in cyanobacterial sequences within databases (GenBank/EMBL) and was designated HIP1 (highly iterated palindrome). The frequency of occurrence of all 256 octameric palindromes has now been determined in sub-databases revealing large and unique over-representation of HIP1 in cyanobacterial entries. DNA sequences from other bacteria were searched for any over-represented octameric palindromes analogous to HIP1. Only two sequences were identified, in the genomes of a thermophile and halophilic archaebacteria, although these were less abundant than HIP1 in cyanobacteria and relate to codon usage. To test the proposed widespread distribution of HIP1 in DNA from the cyanobacterium Synechococcus PCC 6301, randomly selected genomic clones were partly sequenced. HIP1 constituted 2.5% of the novel sequences, equivalent to a site on average once every 320 nucleotides. An oligonucleotide including HIP1 was also tested in PCR. Multiple products were obtained using template DNA from cyanobacterial strains in which HIP1 is abundant in known sequences, and some strains generated characteristic HIP-PCR banding patterns. However, analysis of DNA from one strain (not previously represented in databases) by random sequencing, HIP-PCR and PvuI digestion, confirms that not all cyanobacterial genomes are rich in HIP1.
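The count of 256 octameric palindromes follows from the fact that a reverse-complement palindrome of length 8 is fully determined by its first four bases (4^4 = 256). The sketch below enumerates them and counts overlapping occurrences in a sequence; the function names are hypothetical and the toy sequence is invented for illustration, not taken from the study.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def octameric_palindromes():
    """All 8-mers equal to their own reverse complement (first half fixes the rest)."""
    halves = ("".join(h) for h in product("ACGT", repeat=4))
    return [h + reverse_complement(h) for h in halves]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


if __name__ == "__main__":
    palindromes = octameric_palindromes()
    print(len(palindromes), "GCGATCGC" in palindromes)
    toy = "AAGCGATCGCTT"  # invented example sequence
    print({p: count_overlapping(toy, p) for p in palindromes if p in toy})
```

The density quoted in the abstract is consistent with this picture: one 8-nucleotide site every 320 nucleotides covers 8/320 = 2.5% of the sequence.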
Botero, Adriana; Kapeller, Irit; Cooper, Crystal; Clode, Peta L; Shlomai, Joseph; Thompson, R C Andrew
2018-05-17
Kinetoplast DNA (kDNA) is the mitochondrial genome of trypanosomatids. It consists of a few dozen maxicircles and several thousand minicircles, all catenated topologically to form a two-dimensional DNA network. Minicircles are heterogeneous in size and sequence among species. They present one or several conserved regions that contain three highly conserved sequence blocks. CSB-1 (10 bp sequence) and CSB-2 (8 bp sequence) present lower interspecies homology, while CSB-3 (12 bp sequence) or the Universal Minicircle Sequence is conserved within most trypanosomatids. The Universal Minicircle Sequence is located at the replication origin of the minicircles, and is the binding site for the UMS binding protein, a protein involved in trypanosomatid survival and virulence. Here, we describe the structure and organisation of the kDNA of Trypanosoma copemani, a parasite that has been shown to infect mammalian cells and has been associated with the drastic decline of the endangered Australian marsupial, the woylie (Bettongia penicillata). Deep genomic sequencing showed that T. copemani presents two classes of minicircles that share sequence identity and organisation in the conserved sequence blocks with those of Trypanosoma cruzi and Trypanosoma lewisi. A 19,257 bp partial region of the maxicircle of T. copemani that contained the entire coding region was obtained. Comparative analysis of the T. copemani entire maxicircle coding region with the coding regions of T. cruzi and T. lewisi showed they share 71.05% and 71.28% identity, respectively. The shared features in the maxicircle/minicircle organisation and sequence between T. copemani and T. cruzi/T. lewisi suggest similarities in their process of kDNA replication, and are of significance in understanding the evolution of Australian trypanosomes. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Eastman, Alexander W.; Yuan, Ze-Chun
2015-01-01
Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects.
PMID:25653642
Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A
2015-01-01
It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-LOGOS in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in the coding region, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
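For orientation, the per-position Shannon information content plotted in a traditional sequence logo can be computed as log2(4) minus the column entropy. The sketch below shows only this nucleobase-level calculation, without a small-sample correction; it does not reproduce the TRX-LOGOS treatment of BI-BII conformation shifts, and the function names and example sites are assumptions made for illustration.

```python
import math
from collections import Counter


def column_information(column, alphabet="ACGT"):
    """Information content in bits of one alignment column: log2(|alphabet|) - H."""
    counts = Counter(column)
    n = len(column)
    # Shannon entropy of the observed symbol frequencies
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(len(alphabet)) - entropy


def logo_information(sites):
    """Information content for every column of equal-length aligned binding sites."""
    return [column_information(col) for col in zip(*sites)]


if __name__ == "__main__":
    sites = ["ACGT", "ACGA", "ACCT", "ACCA"]  # invented aligned sites
    print(logo_information(sites))
```

A fully conserved column carries 2 bits and a column with all four bases equally frequent carries 0 bits, which is why conserved positions appear tall in a logo.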
The effects of metal ions on the DNA damage induced by hydrogen peroxide.
Kobayashi, S; Ueda, K; Komano, T
1990-01-01
The effects of metal ions on DNA damage induced by hydrogen peroxide were investigated using two methods, agarose-gel electrophoretic analysis of supercoiled DNA and sequencing-gel analysis of single end-labeled DNA fragments of defined sequences. Hydrogen peroxide induced DNA damage when iron or copper ion was present. At least two classes of DNA damage were induced, one being direct DNA-strand cleavage, and the other being base modification labile to hot piperidine. The investigation of the damaged sites and the inhibitory effects of radical scavengers revealed that hydroxyl radical was the species which attacked DNA in the reaction of H2O2/Fe(II). On the other hand, two types of DNA damage were induced by H2O2/Cu(II). Type I damage was predominant and inhibited by potassium iodide, but type II was not. The sites of the base-modification induced by type I damage were similar to those by lipid peroxidation products and by ascorbate in the presence of Cu(II), suggesting the involvement of radical species other than free hydroxyl radical in the damaging reactions.
Feuillie, Cécile; Merheb, Maxime M.; Gillet, Benjamin; Montagnac, Gilles; Daniel, Isabelle; Hänni, Catherine
2014-01-01
The analysis of ancient or processed DNA samples is often a great challenge, because traditional Polymerase Chain Reaction – based amplification is impeded by DNA damage. Blocking lesions such as abasic sites are known to block the bypass of DNA polymerases, thus stopping primer elongation. In the present work, we applied the SERRS-hybridization assay, a fully non-enzymatic method, to the detection of DNA refractory to PCR amplification. This method combines specific hybridization with detection by Surface Enhanced Resonant Raman Scattering (SERRS). It allows the detection of a series of double-stranded DNA molecules containing a varying number of abasic sites on both strands, when PCR failed to detect the most degraded sequences. Our SERRS approach can quickly detect DNA molecules without any need for DNA repair. This assay could be applied as a pre-requisite analysis prior to enzymatic reparation or amplification. A whole new set of samples, both forensic and archaeological, could then deliver information that was not yet available due to a high degree of DNA damage. PMID:25502338
Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F
2007-03-01
In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
Rohs, Remo; Sklenar, Heinz
2004-04-01
The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. 
The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.
msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.
Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James
2018-02-01
Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package ( https://bioconductor.org/packages/release/bioc/html/msgbsR.html ).
Myopathic mtDNA Depletion Syndrome Due to Mutation in TK2 Gene.
Martín-Hernández, Elena; García-Silva, María Teresa; Quijada-Fraile, Pilar; Rodríguez-García, María Elena; Rivera, Henry; Hernández-Laín, Aurelio; Coca-Robinot, David; Fernández-Toral, Joaquín; Arenas, Joaquín; Martín, Miguel A; Martínez-Azorín, Francisco
2017-01-01
Whole-exome sequencing was used to identify the disease gene(s) in a Spanish girl with failure to thrive, muscle weakness, mild facial weakness, elevated creatine kinase, deficiency of mitochondrial complex III and depletion of mtDNA. With whole-exome sequencing data, it was possible to obtain the whole mtDNA sequence and rule out any pathogenic variant in this genome. The analysis of the whole exome uncovered a homozygous pathogenic mutation in the thymidine kinase 2 gene (TK2; NM_004614.4:c.323C>T, p.T108M). TK2 mutations have been identified mainly in patients with the myopathic form of mtDNA depletion syndromes. This patient presents an atypical TK2-related myopathic form of mtDNA depletion syndromes, because despite having a very low content of mtDNA (<20%), she presents a slower and less severe evolution of the disease. In conclusion, our data confirm the role of the TK2 gene in mtDNA depletion syndromes and expand the phenotypic spectrum.
Nanopore-based fourth-generation DNA sequencing technology.
Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei
2015-02-01
Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie
2003-04-02
Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
A private DNA motif finding algorithm.
Chen, Rui; Peng, Yun; Choi, Byron; Xu, Jianliang; Hu, Haibo
2014-08-01
With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves as a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we propose a private DNA motif finding algorithm in which a DNA owner's privacy is protected by a rigorous privacy model, known as ε-differential privacy. It provides provable privacy guarantees that are independent of adversaries' background knowledge. Our algorithm makes use of the n-gram model and is optimized for processing large-scale DNA sequences. We evaluate the performance of our algorithm over real-life genomic data and demonstrate the promise of integrating privacy into DNA motif finding. Copyright © 2014 Elsevier Inc. All rights reserved.
Lin, Chentao; Thomashow, Michael F.
1992-01-01
Previous studies have indicated that changes in gene expression occur in Arabidopsis thaliana L. (Heyn) during cold acclimation and that certain of the cor (cold-regulated) genes encode polypeptides that share the unusual property of remaining soluble upon boiling in aqueous solution. Here, we identify a cDNA clone for a cold-regulated gene encoding one of the “boiling-stable” polypeptides, COR15. DNA sequence analysis indicated that the gene, designated cor15, encodes a 14.7-kilodalton hydrophilic polypeptide having an N-terminal amino acid sequence that closely resembles transit peptides that target proteins to the stromal compartment of chloroplasts. Immunological studies indicated that COR15 is processed in vivo and that the mature polypeptide, COR15m, is present in the soluble fraction of chloroplasts. Possible functions of COR15m are discussed. PMID:16668917
Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C
2018-06-01
High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may affect the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%), and there was no positional bias in mutation across the plasmid sequence. There was no discernible difference between the mutation frequencies of coding and non-coding DNA.
The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.
Zill, Oliver A; Banks, Kimberly C; Fairclough, Stephen R; Mortimer, Stefanie; Vowles, James V; Mokhtari, Reza; Gandara, David R; Mack, Philip C; Odegaard, Justin I; Nagy, Rebecca J; Baca, Arthur M; Eltoukhy, Helmy; Chudova, Darya I; Lanman, Richard B; Talasaz, AmirAli
2018-05-18
Cell-free DNA (cfDNA) sequencing provides a non-invasive method for obtaining actionable genomic information to guide personalized cancer treatment, but the presence of multiple alterations in circulation related to treatment and tumor heterogeneity complicates the interpretation of the observed variants. Experimental Design: We describe the somatic mutation landscape of 70 cancer genes from cfDNA deep-sequencing analysis of 21,807 patients with treated, late-stage cancers across >50 cancer types. To facilitate interpretation of the genomic complexity of circulating tumor DNA in advanced, treated cancer patients, we developed methods to identify cfDNA copy-number driver alterations and cfDNA clonality. Patterns and prevalence of cfDNA alterations in major driver genes for non-small cell lung, breast, and colorectal cancer largely recapitulated those from tumor tissue sequencing compendia (TCGA and COSMIC; r=0.90-0.99), with the principal differences in alteration prevalence being due to patient treatment. This highly sensitive cfDNA sequencing assay revealed numerous subclonal tumor-derived alterations, expected as a result of clonal evolution, but leading to an apparent departure from mutual exclusivity in treatment-naïve tumors. Upon applying novel cfDNA clonality and copy-number driver identification methods, robust mutual exclusivity was observed among predicted truncal driver cfDNA alterations (FDR = 5 × 10⁻⁷ for EGFR and ERBB2), in effect distinguishing tumor-initiating alterations from secondary alterations. Treatment-associated resistance, including both novel alterations and parallel evolution, was common in the cfDNA cohort and was enriched in patients with targetable driver alterations (>18.6% patients). Together these retrospective analyses of a large cfDNA sequencing data set reveal subclonal structures and emerging resistance in advanced solid tumors. Copyright ©2018, American Association for Cancer Research.
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon
2011-01-01
Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends.
Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion: The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934
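Simple sequence repeats of the kind counted in this abstract are typically found by scanning for short units repeated in tandem. The abstract does not state which tool or thresholds were used, so the sketch below is a generic illustration: the function names and the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides, 4 for tetranucleotides) are assumptions, and the test sequence is invented.

```python
import re


def is_primitive(unit):
    """True if unit is not itself a repetition of a shorter unit (e.g. 'ATAT' is not)."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, unit, repeat_count) tuples for tandem repeats in seq.

    min_repeats maps unit length to the minimum number of tandem copies;
    the defaults are illustrative assumptions, not the study's criteria.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    hits = []
    for unit_len, n in min_repeats.items():
        # A unit followed by at least n - 1 further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.match(seq, pos)
            if m and is_primitive(m.group(1)):
                hits.append((m.start(), m.group(1), len(m.group(0)) // unit_len))
                pos = m.end()  # continue after the recorded repeat
            else:
                pos += 1
    return sorted(hits)


if __name__ == "__main__":
    toy = "GG" + "AT" * 6 + "CC" + "CAG" * 5 + "TT"  # invented sequence
    print(find_ssrs(toy))
```

Rejecting non-primitive units keeps a run such as (AT)8 from being reported a second time as (ATAT)4, and skips homopolymer runs.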
Herrmann, Alexander; Haake, Andrea; Ammerpohl, Ole; Martin-Guerrero, Idoia; Szafranski, Karol; Stemshorn, Kathryn; Nothnagel, Michael; Kotsopoulos, Steve K; Richter, Julia; Warner, Jason; Olson, Jeff; Link, Darren R; Schreiber, Stefan; Krawczak, Michael; Platzer, Matthias; Nürnberg, Peter; Siebert, Reiner; Hampe, Jochen
2011-01-01
Cytosine methylation provides an epigenetic level of cellular plasticity that is important for development, differentiation and cancerogenesis. We adopted microdroplet PCR to bisulfite treated target DNA in combination with second generation sequencing to simultaneously assess DNA sequence and methylation. We show measurement of methylation status in a wide range of target sequences (total 34 kb) with an average coverage of 95% (median 100%) and good correlation to the opposite strand (rho = 0.96) and to pyrosequencing (rho = 0.87). Data from lymphoma and colorectal cancer samples for SNRPN (imprinted gene), FGF6 (demethylated in the cancer samples) and HS3ST2 (methylated in the cancer samples) serve as a proof of principle showing the integration of SNP data and phased DNA-methylation information into "hepitypes" and thus the analysis of DNA methylation phylogeny in the somatic evolution of cancer.
Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.
Hawkins, Steve F C; Guest, Paul C
2018-01-01
The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of a NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are available currently and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.
Xiao, Yongli; Sheng, Zong-Mei; Taubenberger, Jeffery K.
2015-01-01
The vast majority of surgical biopsy and post-mortem tissue samples are formalin-fixed and paraffin-embedded (FFPE), but this process leads to RNA degradation that limits gene expression analysis. As an example, the viral RNA genome of the 1918 pandemic influenza A virus was previously determined in a 9-year effort by overlapping RT-PCR from post-mortem samples. Using the protocols described here, the full genome of the 1918 virus at high coverage was determined in one high-throughput sequencing run of a cDNA library derived from total RNA of a 1918 FFPE sample after duplex-specific nuclease treatments. This basic methodological approach should assist in the analysis of FFPE tissue samples isolated over the past century from a variety of infectious diseases. PMID:26344216
Identification of Y-Chromosome Sequences in Turner Syndrome.
Silva-Grecco, Roseane Lopes da; Trovó-Marqui, Alessandra Bernadete; Sousa, Tiago Alves de; Croce, Lilian Da; Balarin, Marly Aparecida Spadotto
2016-05-01
To investigate the presence of Y-chromosome sequences and determine their frequency in patients with Turner syndrome. The study included 23 patients with Turner syndrome from Brazil, who gave written informed consent for participating in the study. Cytogenetic analyses were performed in peripheral blood lymphocytes, with 100 metaphases per patient. Genomic DNA was also extracted from peripheral blood lymphocytes, and gene sequences DYZ1, DYZ3, ZFY and SRY were amplified by Polymerase Chain Reaction. The cytogenetic analysis showed a 45,X karyotype in 9 patients (39.2 %) and a mosaic pattern in 14 (60.8 %). In 8.7 % (2 out of 23) of the patients, Y-chromosome sequences were found. This prevalence is very similar to those reported previously. The initial karyotype analysis of these patients did not reveal Y-chromosome material, but they were found positive for Y-specific sequences in the lymphocyte DNA analysis. The PCR technique showed that 2 (8.7 %) of the patients with Turner syndrome had Y-chromosome sequences, both presenting marker chromosomes on cytogenetic analysis.
[Identification of antler powder components based on DNA barcoding technology].
Jia, Jing; Shi, Lin-chun; Xu, Zhi-chao; Xin, Tian-yi; Song, Jing-yuan; Chen Shi, Lin
2015-10-01
In order to authenticate the components of antler powder in the market, DNA barcoding technology coupled with a cloning method was used. Cytochrome c oxidase subunit I (COI) sequences were obtained according to the DNA barcoding standard operation procedure (SOP). For antler powder with possible mixed components, the cloning method was used to get each COI sequence. 65 COI sequences were successfully obtained from commercial antler powders via sequencing PCR products. The results indicate that only 38% of these samples were derived from Cervus nippon Temminck or Cervus elaphus Linnaeus, which are recorded in the 2010 edition of "Chinese Pharmacopoeia", while 62% of them were derived from other species. Rangifer tarandus Linnaeus was the most frequent species among the adulterants. Further analysis showed that some samples collected from different regions, companies and price ranges contained adulterants. Analysis of 36 COI sequences obtained by the cloning method showed that C. elaphus and C. nippon were the main components. In addition, some samples were marked clearly as antler powder on the label; however, C. elaphus or R. tarandus were their main components. In summary, DNA barcoding can accurately and efficiently distinguish the exact content in the commercial antler powder, which provides a new technique to ensure clinical safety and improve quality control of Chinese traditional medicine.
Active bacterial community structure along vertical redox gradients in Baltic Sea sediment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jansson, Janet; Edlund, Anna; Hardeman, Fredrik
Community structures of active bacterial populations were investigated along a vertical redox profile in coastal Baltic Sea sediments by terminal-restriction fragment length polymorphism (T-RFLP) and clone library analysis. According to correspondence analysis of T-RFLP results and sequencing of cloned 16S rRNA genes, the microbial community structures at three redox depths (179 mV, -64 mV and -337 mV) differed significantly. The bacterial communities in the community DNA differed from those in bromodeoxyuridine (BrdU)-labeled DNA, indicating that the growing members of the community that incorporated BrdU were not necessarily the most dominant members. The structures of the actively growing bacterial communities were most strongly correlated to organic carbon followed by total nitrogen and redox potentials. Bacterial identification by sequencing of 16S rRNA genes from clones of BrdU-labeled DNA and DNA from reverse transcription PCR (rt-PCR) showed that bacterial taxa involved in nitrogen and sulfur cycling were metabolically active along the redox profiles. Several sequences had low similarities to previously detected sequences indicating that novel lineages of bacteria are present in Baltic Sea sediments. Also, a high number of different 16S rRNA gene sequences representing different phyla were detected at all sampling depths.
Salton, S R
1991-09-01
A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.
Gocayne, J; Robinson, D A; FitzGerald, M G; Chung, F Z; Kerlavage, A R; Lentes, K U; Lai, J; Wang, C D; Fraser, C M; Venter, J C
1987-12-01
Two cDNA clones, lambda RHM-MF and lambda RHB-DAR, encoding the muscarinic cholinergic receptor and the beta-adrenergic receptor, respectively, have been isolated from a rat heart cDNA library. The cDNA clones were characterized by restriction mapping and automated DNA sequence analysis utilizing fluorescent dye primers. The rat heart muscarinic receptor consists of 466 amino acids and has a calculated molecular weight of 51,543. The rat heart beta-adrenergic receptor consists of 418 amino acids and has a calculated molecular weight of 46,890. The two cardiac receptors have substantial amino acid homology (27.2% identity, 50.6% with favored substitutions). The rat cardiac beta receptor has 88.0% homology (92.5% with favored substitutions) with the human brain beta receptor and the rat cardiac muscarinic receptor has 94.6% homology (97.6% with favored substitutions) with the porcine cardiac muscarinic receptor. The muscarinic cholinergic and beta-adrenergic receptors appear to be as conserved as hemoglobin and cytochrome c but less conserved than histones and are clearly members of a multigene family. These data support our hypothesis, based upon biochemical and immunological evidence, that suggests considerable structural homology and evolutionary conservation between adrenergic and muscarinic cholinergic receptors. To our knowledge, this is the first report utilizing automated DNA sequence analysis to determine the structure of a gene.
Zhitnikova, M Y; Shestopalova, A V
2017-11-01
The structural adjustments of the sugar-phosphate DNA backbone (switching of the γ angle (O5'-C5'-C4'-C3') from canonical to alternative conformations and/or C2'-endo → C3'-endo transition of deoxyribose) lead to the sequence-specific changes in accessible surface area of both polar and non-polar atoms of the grooves and the polar/hydrophobic profile of the latter ones. The distribution of the minor groove electrostatic potential is likely to be changing as a result of such conformational rearrangements in sugar-phosphate DNA backbone. Our analysis of the crystal structures of the short free DNA fragments and calculation of their electrostatic potentials allowed us to determine: (1) the number of classical and alternative γ angle conformations in the free B-DNA; (2) changes in the minor groove electrostatic potential, depending on the conformation of the sugar-phosphate DNA backbone; (3) the effect of the DNA sequence on the minor groove electrostatic potential. We have demonstrated that the structural adjustments of the DNA double helix (the conformations of the sugar-phosphate backbone and the minor groove dimensions) induce changes in the distribution of the minor groove electrostatic potential and are sequence-specific. Therefore, these features of the minor groove sizes and distribution of minor groove electrostatic potential can be used as a signal for recognition of the target DNA sequence by protein in the implementation of the indirect readout mechanism.
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proposal
1991-11-01
CURRENT TECHNOLOGY 2 3.0 TIME-INTEGRATING CORRELATOR 2 4.0 REPRESENTATIONS OF THE DNA BASES 8 5.0 DNA ANALYSIS STRATEGY 8 6.0 STRATEGY FOR COARSE...1)-correlation peak formed by the AxB term and (2)-pedestal formed by the A + B terms. 7 Figure 4: Short representations of the DNA bases where each...linear scale. 15 x LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long pseudorandom
Duret, Laurent; Cohen, Jean; Jubin, Claire; Dessen, Philippe; Goût, Jean-François; Mousset, Sylvain; Aury, Jean-Marc; Jaillon, Olivier; Noël, Benjamin; Arnaiz, Olivier; Bétermier, Mireille; Wincker, Patrick; Meyer, Eric; Sperling, Linda
2008-01-01
Ciliates are the only unicellular eukaryotes known to separate germinal and somatic functions. Diploid but silent micronuclei transmit the genetic information to the next sexual generation. Polyploid macronuclei express the genetic information from a streamlined version of the genome but are replaced at each sexual generation. The macronuclear genome of Paramecium tetraurelia was recently sequenced by a shotgun approach, providing access to the gene repertoire. The 72-Mb assembly represents a consensus sequence for the somatic DNA, which is produced after sexual events by reproducible rearrangements of the zygotic genome involving elimination of repeated sequences, precise excision of unique-copy internal eliminated sequences (IES), and amplification of the cellular genes to high copy number. We report use of the shotgun sequencing data (>10^6 reads representing 13× coverage of a completely homozygous clone) to evaluate variability in the somatic DNA produced by these developmental genome rearrangements. Although DNA amplification appears uniform, both of the DNA elimination processes produce sequence heterogeneity. The variability that arises from IES excision allowed identification of hundreds of putative new IESs, compared to 42 that were previously known, and revealed cases of erroneous excision of segments of coding sequences. We demonstrate that IESs in coding regions are under selective pressure to introduce premature termination of translation in case of excision failure. PMID:18256234
Santini, A C; Santos, H R M; Gross, E; Corrêa, R X
2013-03-11
The genus Burkholderia (β-Proteobacteria) currently comprises more than 60 species, including parasites, symbionts and free-living organisms. Several new species of Burkholderia have recently been described showing a great diversity of phenotypes. We examined the diversity of Burkholderia spp in environmental samples collected from Caatinga and Atlantic rainforest biomes of Bahia, Brazil. Legume nodules were collected from five locations, and 16S rDNA and recA genes of the isolated microorganisms were analyzed. Thirty-three contigs of 16S rRNA genes and four contigs of the recA gene related to the genus Burkholderia were obtained. The genetic dissimilarity of the strains ranged from 0 to 2.5% based on 16S rDNA analysis, indicating two main branches: one distinct branch of the dendrogram for the B. cepacia complex and another branch that rendered three major groups, partially reflecting host plants and locations. A dendrogram designed with sequences of this research and those designed with sequences of Burkholderia-type strains and the first hit BLAST had similar topologies. A dendrogram similar to that constructed by analysis of 16S rDNA was obtained using sequences of the fragment of the recA gene. The 16S rDNA sequences enabled sufficient identification of relevant similarities and groupings amongst isolates and the sequences that we obtained. Only 6 of the 33 isolates analyzed via 16S rDNA sequencing showed high similarity with the B. cepacia complex. Thus, over 3/4 of the isolates have potential for biotechnological applications.
Chitty, Lyn S.; Lo, Y. M. Dennis
2015-01-01
The identification of cell-free fetal DNA (cffDNA) in maternal plasma in 1997 heralded the most significant change in obstetric care for decades, with the advent of safer screening and diagnosis based on analysis of maternal blood. Here, we describe how the technological advances offered by next-generation sequencing have allowed for the development of a highly sensitive screening test for aneuploidies as well as definitive prenatal molecular diagnosis for some monogenic disorders. PMID:26187875
Bernsen, M R; Dijkman, H B; de Vries, E; Figdor, C G; Ruiter, D J; Adema, G J; van Muijen, G N
1998-10-01
Molecular analysis of small tissue samples has become increasingly important in biomedical studies. Using a laser dissection microscope and modified nucleic acid isolation protocols, we demonstrate that multiple mRNA as well as DNA sequences can be identified from a single-cell sample. In addition, we show that the specificity of procurement of tissue samples is not compromised by smear contamination resulting from scraping of the microtome knife during sectioning of lesions. The procedures described herein thus allow for efficient RT-PCR or PCR analysis of multiple nucleic acid sequences from small tissue samples obtained by laser-assisted microdissection.
Sequence analysis of Leukemia DNA
NASA Astrophysics Data System (ADS)
Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa
2018-03-01
Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by analysing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm. The Smith-Waterman algorithm was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the most similar region shared by a pair of sequences by giving a negative score to unequal base pairs (mismatch) and a positive score to equal base pairs (match). The maximum positive score marks the end of the alignment, and the minimum value (zero in the standard formulation) marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
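The scoring and traceback described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smith_waterman`, the score values (match +2, mismatch -1, gap -2) and the linear gap penalty are assumptions chosen only for illustration, and the abstract itself does not mention gap scoring.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scores are illustrative assumptions; a linear gap penalty is used.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at zero lets an alignment restart anywhere (local alignment)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("GGACGTCC", "TTACGTAA")
print(score, top, bottom)
```

The traceback starts at the highest-scoring cell and stops at the first zero, which is the "maximum as the end, minimum as the start" behaviour the abstract refers to.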
NASA Astrophysics Data System (ADS)
Narayanaswamy, Nagarjun; Kumar, Manoj; Das, Sadhan; Sharma, Rahul; Samanta, Pralok K.; Pati, Swapan K.; Dhar, Suman K.; Kundu, Tapas K.; Govindaraju, T.
2014-09-01
Sequence-specific recognition of DNA by small turn-on fluorescence probes is a promising tool for bioimaging, bioanalytical and biomedical applications. Here, the authors report a novel cell-permeable and red fluorescent hemicyanine-based thiazole coumarin (TC) probe for DNA recognition, nuclear staining and cell cycle analysis. TC exhibited strong fluorescence enhancement in the presence of DNA containing AT-base pairs, but did not fluoresce with GC sequences, single-stranded DNA, RNA and proteins. The fluorescence staining of HeLa S3 and HEK 293 cells by TC followed by DNase and RNase digestion studies depicted the selective staining of DNA in the nucleus over the cytoplasmic region. Fluorescence-activated cell sorting (FACS) analysis by flow cytometry demonstrated the potential application of TC in cell cycle analysis in HEK 293 cells. Metaphase chromosome and malaria parasite DNA imaging studies further confirmed the in vivo diagnostic and therapeutic applications of probe TC. Probe TC may find multiple applications in fluorescence spectroscopy, diagnostics, bioimaging and molecular and cell biology.
Narayanaswamy, Nagarjun; Kumar, Manoj; Das, Sadhan; Sharma, Rahul; Samanta, Pralok K.; Pati, Swapan K.; Dhar, Suman K.; Kundu, Tapas K.; Govindaraju, T.
2014-01-01
Sequence-specific recognition of DNA by small turn-on fluorescence probes is a promising tool for bioimaging, bioanalytical and biomedical applications. Here, the authors report a novel cell-permeable and red fluorescent hemicyanine-based thiazole coumarin (TC) probe for DNA recognition, nuclear staining and cell cycle analysis. TC exhibited strong fluorescence enhancement in the presence of DNA containing AT-base pairs, but did not fluoresce with GC sequences, single-stranded DNA, RNA and proteins. The fluorescence staining of HeLa S3 and HEK 293 cells by TC followed by DNase and RNase digestion studies depicted the selective staining of DNA in the nucleus over the cytoplasmic region. Fluorescence-activated cell sorting (FACS) analysis by flow cytometry demonstrated the potential application of TC in cell cycle analysis in HEK 293 cells. Metaphase chromosome and malaria parasite DNA imaging studies further confirmed the in vivo diagnostic and therapeutic applications of probe TC. Probe TC may find multiple applications in fluorescence spectroscopy, diagnostics, bioimaging and molecular and cell biology. PMID:25252596
First report of the complete sequence of Sida golden yellow vein virus from Jamaica.
Stewart, Cheryl S; Kon, Tatsuya; Gilbertson, Robert L; Roye, Marcia E
2011-08-01
Begomoviruses are phytopathogens that threaten food security [18]. Sida spp. are ubiquitous weed species found in Jamaica. Sida samples were collected island-wide, DNA was extracted via a modified Dellaporta method, and the viral genome was amplified using degenerate and sequence-specific primers [2, 11]. The amplicons were cloned and sequenced. Sequence analysis revealed that a DNA-A molecule isolated from a plant in Liguanea, St. Andrew, was 90.9% similar to Sida golden yellow vein virus-[United States of America:Homestead:A11], making it a strain of SiGYVV. It was named Sida golden yellow vein virus-[Jamaica:Liguanea 2:2008] (SiGYVV-[JM:Lig2:08]). The cognate DNA-B, previously unreported, was successfully cloned and was most similar to that of Malvastrum yellow mosaic Jamaica virus (MaYMJV). Phylogenetic analysis suggested that this virus was most closely related to begomoviruses that infect malvaceous hosts in Jamaica, Cuba and Florida in the United States.
Umetsu, Kazuo; Iwabuchi, Naruki; Yuasa, Isao; Saitou, Naruya; Clark, Paul F; Boxshall, Geoff; Osawa, Motoki; Igarashi, Keiji
2002-12-01
The complete mitochondrial DNA (mtDNA) of the tadpole shrimp Triops cancriformis was sequenced. The sequence consisted of 15,101 bp with an A+T content of 69%. Its gene arrangement was identical to those of the water flea (Daphnia pulex) and giant tiger prawn (Penaeus monodon), whereas it differed from that of the brine shrimp (Artemia franciscana) in the arrangement of its genes for tRNAs. Phylogenetic analysis revealed T. cancriformis to be more closely related to the water flea than to the brine shrimp and giant tiger prawn. We also compared the 16S rRNA sequences of five formalin-fixed tadpole shrimps that had been collected in five different locations and stored in a museum. The sequence divergence was in the range of 0-1.51%, suggesting that those samples were closely related to each other.
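The two summary statistics quoted in this abstract, A+T content and pairwise sequence divergence, can be sketched as follows. The abstract does not say which divergence measure was used, so the uncorrected proportion of differing sites (p-distance) over pre-aligned sequences is an assumption here, as are the function names `at_content` and `p_distance` and the toy sequences.

```python
def at_content(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def p_distance(aligned_a, aligned_b):
    """Uncorrected divergence between two pre-aligned, equal-length sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


print(f"A+T content: {at_content('AATTGC'):.1%}")
print(f"divergence:  {p_distance('ACGTACGT', 'ACGTACGA'):.2%}")
```

Multiplying the p-distance by 100 gives a percentage on the same scale as the 0-1.51% range reported above, although a corrected distance model would give slightly different values.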
High-throughput sequencing: a failure mode analysis.
Yang, George S; Stott, Jeffery M; Smailus, Duane; Barber, Sarah A; Balasundaram, Miruna; Marra, Marco A; Holt, Robert A
2005-01-04
Basic manufacturing principles are becoming increasingly important in high-throughput sequencing facilities where there is a constant drive to increase quality, increase efficiency, and decrease operating costs. While high-throughput centres report failure rates typically on the order of 10%, the causes of sporadic sequencing failures are seldom analyzed in detail and have not, in the past, been formally reported. Here we report the results of a failure mode analysis of our production sequencing facility based on detailed evaluation of 9,216 ESTs generated from two cDNA libraries. Two categories of failures are described; process-related failures (failures due to equipment or sample handling) and template-related failures (failures that are revealed by close inspection of electropherograms and are likely due to properties of the template DNA sequence itself). Preventative action based on a detailed understanding of failure modes is likely to improve the performance of other production sequencing pipelines.
Hartman, Amber L; Riddle, Sean; McPhillips, Timothy; Ludäscher, Bertram; Eisen, Jonathan A
2010-06-12
For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16 S rRNA, is encoded by ribosomal DNA, 16 S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets has grown, the need for flexible automation and maintenance of the core processes of 16 S rDNA sequence analysis has increased correspondingly. We present WATERS, an integrated approach for 16 S rDNA analysis that bundles a suite of publicly available 16 S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, phylogenetic tree construction as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform. By packaging available software tools into a single automated workflow, WATERS simplifies 16 S rDNA analyses, especially for those without specialized bioinformatics or programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system.
Furthermore, new "actors" can be added to the workflow as desired and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.
Escorza-Treviño, S; Dizon, A E
2000-08-01
Mitochondrial DNA (mtDNA) control-region sequences and microsatellite loci length polymorphisms were used to estimate phylogeographical patterns (historical patterns underlying contemporary distribution), intraspecific population structure and gender-biased dispersal of Phocoenoides dalli dalli across its entire range. One-hundred and thirteen animals from several geographical strata were sequenced over 379 bp of mtDNA, resulting in 58 mtDNA haplotypes. Analysis using F(ST) values (based on haplotype frequencies) and phi(ST) values (based on frequencies and genetic distances between haplotypes) yielded statistically significant separation (bootstrap values P < 0.05) among most of the stocks currently used for management purposes. A minimum spanning network of haplotypes showed two very distinctive clusters, differentially occupied by western and eastern populations, with some common widespread haplotypes. This suggests some degree of phyletic radiation from west to east, superimposed on gene flow. Highly male-biased migration was detected for several population comparisons. Nuclear microsatellite DNA markers (119 individuals and six loci) provided additional support for population subdivision and gender-biased dispersal detected in the mtDNA sequences. Analysis using F(ST) values (based on allelic frequencies) yielded statistically significant separation between some, but not all, populations distinguished by mtDNA analysis. R(ST) values (based on frequencies of and genetic distance between alleles) showed no statistically significant subdivision. Again, highly male-biased dispersal was detected for all population comparisons, suggesting, together with morphological and reproductive data, the existence of sexual selection. Our molecular results argue for nine distinct dalli-type populations that should be treated as separate units for management purposes.
Transcription blockage by stable H-DNA analogs in vitro
Pandey, Shristi; Ogloblina, Anna M.; Belotserkovskii, Boris P.; Dolinnaya, Nina G.; Yakubovskaya, Marianna G.; Mirkin, Sergei M.; Hanawalt, Philip C.
2015-01-01
DNA sequences that can form unusual secondary structures are implicated in regulating gene expression and causing genomic instability. H-palindromes are an important class of such DNA sequences that can form an intramolecular triplex structure, H-DNA. Within an H-palindrome, the H-DNA and canonical B-DNA are in a dynamic equilibrium that shifts toward H-DNA with increased negative supercoiling. The interplay between H- and B-DNA and the fact that the process of transcription affects supercoiling makes it difficult to elucidate the effects of H-DNA upon transcription. We constructed a stable structural analog of H-DNA that cannot flip into B-DNA, and studied the effects of this structure on transcription by T7 RNA polymerase in vitro. We found multiple transcription blockage sites adjacent to and within sequences engaged in this triplex structure. Triplex-mediated transcription blockage varied significantly with changes in ambient conditions: it was exacerbated in the presence of Mn2+ or by increased concentrations of K+ and Li+. Analysis of the detailed pattern of the blockage suggests that RNA polymerase is sterically hindered by H-DNA and has difficulties in unwinding triplex DNA. The implications of these findings for the biological roles of triple-stranded DNA structures are discussed. PMID:26101261
Benschop, Corina C G; Quaak, Frederike C A; Boon, Mathilde E; Sijen, Titia; Kuiper, Irene
2012-03-01
Forensic analysis of biological traces generally encompasses the investigation of both the person who contributed to the trace and the body site(s) from which the trace originates. For instance, for sexual assault cases, it can be beneficial to distinguish vaginal samples from skin or saliva samples. In this study, we explored the use of microbial flora to indicate vaginal origin. First, we explored the vaginal microbiome for a large set of clinical vaginal samples (n = 240) by next generation sequencing (n = 338,184 sequence reads) and found 1,619 different sequences. Next, we selected 389 candidate probes targeting genera or species and designed a microarray, with which we analysed a diverse set of samples; 43 DNA extracts from vaginal samples and 25 DNA extracts from samples from other body sites, including sites in close proximity of or in contact with the vagina. Finally, we used the microarray results and next generation sequencing dataset to assess the potential for a future approach that uses microbial markers to indicate vaginal origin. Since no candidate genera/species were found to positively identify all vaginal DNA extracts on their own, while excluding all non-vaginal DNA extracts, we deduce that a reliable statement about the cellular origin of a biological trace should be based on the detection of multiple species within various genera. Microarray analysis of a sample will then render a microbial flora pattern that is probably best analysed in a probabilistic approach.
Organizational heterogeneity of vertebrate genomes.
Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham
2012-01-01
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
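The idea of scoring occurrences of fixed-length oligonucleotides can be illustrated with a simplified sketch. This is not the Compositional Spectra method as published: it counts exact k-mer matches only and compares spectra with a plain L1 distance, both of which are assumptions made for brevity, and the names `kmer_spectrum` and `spectrum_distance` are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k):
    """Relative frequency of every possible k-mer over A, C, G, T in seq.

    Windows containing other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def spectrum_distance(spec_a, spec_b):
    """L1 distance between two spectra built with the same k."""
    return sum(abs(spec_a[w] - spec_b[w]) for w in spec_a)


region_1 = kmer_spectrum("ACGTACGTACGT", 2)
region_2 = kmer_spectrum("AAAATTTTAAAA", 2)
print(round(spectrum_distance(region_1, region_2), 3))
```

Computing such spectra over windows along a chromosome and comparing them pairwise is one simple way to obtain a heterogeneity score that does not depend on sequence length or on the presence of coding DNA.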
Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.
Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry
2016-11-01
Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-02-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-01-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578
[Detection and diversity analysis of rumen methanogens in the co-cultures with anaerobic fungi].
Cheng, Yan-fen; Mao, Sheng-yong; Pei, Cai-xia; Liu, Jian-xin; Zhu, Wei-yun
2006-12-01
Rumen methanogen diversity in co-cultures with anaerobic fungi from goat rumen was analyzed. Mixed cultures of anaerobic fungi and methanogens were obtained from goat rumen using anaerobic fungal medium supplemented with penicillin and streptomycin, and were then subcultured 62 times by transferring cultures every 3-4 d. Total DNA from the original rumen fluid and from the subcultured fungal cultures was used for PCR/DGGE and RFLP analysis. The 16S rDNA of clones corresponding to representative OTUs was sequenced. The diversity index (Shannon index) of the methanogens, calculated from DGGE profiles, decreased from 1.32 in rumen fluid to 0.99 in the fungal culture after 45 subcultures, with the lowest similarity between DGGE profiles at 34.7%. The Shannon index then increased from 0.99 after 45 subcultures to 1.15 after 62 subcultures, with the lowest similarity at 89.2%. A total of 5 OTUs were obtained from 69 clones by RFLP analysis, and six clones representing the 5 OTUs were sequenced. Of the 5 OTUs, three had cloned 16S rDNA sequences most closely related to uncultured archaeal symbiont PA202, each with 95% similarity, but were not closely related to any identified culturable methanogen. The remaining two OTUs shared the same closest relative, uncultured rumen methanogen 956, each with 97% similarity. The 16S rDNA sequences of these two OTUs were also 97% similar to the closest identified culturable methanogen, Methanobrevibacter sp. NT7. In conclusion, diverse yet unidentified rumen methanogen species exist in the co-cultures with anaerobic fungi isolated from the goat rumen.
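For readers unfamiliar with the Shannon index reported above, the following is a minimal sketch of the standard formula H = -sum(p_i * ln p_i). It assumes that each DGGE band is treated as one taxon and that band intensity stands in for relative abundance; the natural logarithm and the function name `shannon_index` are assumptions, since the abstract does not state how the index was computed.

```python
import math

def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over non-zero abundances.

    `abundances` could be, for example, DGGE band intensities (an assumption
    for illustration; the abstract does not specify the input).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return -sum(
        (a / total) * math.log(a / total)
        for a in abundances
        if a > 0
    )

if __name__ == "__main__":
    # Hypothetical band intensities for a single DGGE lane.
    print(round(shannon_index([40, 30, 20, 10]), 3))
```

A lane with four equally intense bands gives H = ln 4, and a lane with a single band gives H = 0, so a drop in H reflects fewer or less evenly distributed bands.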
Bonen, Linda; Boer, Poppo H.; Gray, Michael W.
1984-01-01
We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate, at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565
Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry
2007-01-01
Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Neugebauer, Tomasz; Bordeleau, Eric; Burrus, Vincent; Brzezinski, Ryszard
2015-01-01
Data visualization methods are necessary during the exploration and analysis activities of an increasingly data-intensive scientific process. There are few existing visualization methods for raw nucleotide sequences of a whole genome or chromosome. Software for data visualization should allow the researchers to create accessible data visualization interfaces that can be exported and shared with others on the web. Herein, novel software developed for generating DNA data visualization interfaces is described. The software converts DNA data sets into images that are further processed as multi-scale images to be accessed through a web-based interface that supports zooming, panning and sequence fragment selection. Nucleotide composition frequencies and GC skew of a selected sequence segment can be obtained through the interface. The software was used to generate DNA data visualization of human and bacterial chromosomes. Examples of visually detectable features such as short and long direct repeats, long terminal repeats, mobile genetic elements, heterochromatic segments in microbial and human chromosomes, are presented. The software and its source code are available for download and further development. The visualization interfaces generated with the software allow for the immediate identification and observation of several types of sequence patterns in genomes of various sizes and origins. The visualization interfaces generated with the software are readily accessible through a web browser. This software is a useful research and teaching tool for genetics and structural genomics.
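The two per-segment statistics mentioned above can be sketched as follows. This assumes the conventional definition of GC skew, (G - C) / (G + C), and frequencies computed over unambiguous bases only; the function names are hypothetical and do not come from the described software.

```python
def base_frequencies(seq):
    """Fraction of A, C, G, and T among the unambiguous bases in seq."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}

def gc_skew(seq):
    """GC skew (G - C) / (G + C); returns 0.0 when the segment has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)

if __name__ == "__main__":
    segment = "GGGCATTA"  # toy segment standing in for a user selection
    print(base_frequencies(segment))
    print(gc_skew(segment))
```

In practice these values would be computed over a sliding window along the chromosome, with the window size chosen by the user.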
Fan, Lihua; Shuai, Jiangbing; Zeng, Ruoxue; Mo, Hongfei; Wang, Suhua; Zhang, Xiaofeng; He, Yongqiang
2017-12-01
The genome fragment enrichment (GFE) method was applied to identify host-specific bacterial genetic markers that differ among fecal metagenomes. To enrich for swine-specific DNA fragments, a swine fecal DNA composite (n = 34) was challenged against a DNA composite consisting of cow, human, goat, sheep, chicken, duck and goose fecal DNA extracts (n = 83). Bioinformatic analyses of 384 non-redundant swine-enriched metagenomic sequences indicated a preponderance of Bacteroidales-like regions predicted to encode functions associated with metabolism, cellular processes, and information storage and processing. When challenged against fecal DNA extracted from different animal sources, four sequences from the clone libraries, comprising two Bacteroidales-like sequences (genes 1-38 and 3-53), a Clostridia-like sequence (gene 2-109), and a Bacilli-like sequence (gene 2-95), showed high specificity to swine feces based on PCR analysis. Host-specificity and host-sensitivity analysis confirmed that oligonucleotide primers and probes capable of annealing to select Bacteroidales-like sequences (1-38 and 3-53) exhibited high specificity (>90%) in quantitative PCR assays with 71 fecal DNAs from non-target animal sources. The two assays also demonstrated broad distributions of the corresponding genetic markers (>94% positive) among 72 swine feces. After evaluation with environmental water samples from different areas, swine-targeted assays based on the two Bacteroidales-like GFE sequences appear to be suitable quantitative tracing tools for swine fecal pollution. Copyright © 2017 Elsevier Ltd. All rights reserved.
Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji
2015-12-01
Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal, and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulation of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
NASA Astrophysics Data System (ADS)
Zhang, Yuning; Reisner, Walter
2013-03-01
Nanopore and nanochannel based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with embedded pore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a pore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key draw-back of the standard nanopore configuration. We demonstrate that we can optically detect successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. In particular, we show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore, suggesting that the pore could be used as a nanoscale window through which to interrogate a nanochannel extended DNA molecule. Furthermore, electrical measurements through the nanopore are performed, indicating that DNA sensing is feasible using the nanochannel-nanopore device.
Chernicky, C L; Tan, H; Burfeind, P; Ilan, J; Ilan, J
1996-02-01
There are several cell types within the placenta that produce cytokines which can contribute to the regulatory mechanisms that ensure normal pregnancy. The immunological milieu at the maternofetal interface is considered to be crucial for survival of the fetus. Interleukin-2 (IL-2) is expressed by the syncytiotrophoblast, the cell layer between the mother and the fetus. IL-2 appears to be a key factor in maintenance of pregnancy. Therefore, it was important to determine the sequence of human placental interleukin-2. Direct sequencing of human placental IL-2 cDNA was determined for the coding region. Subclone sequencing was carried out for the 5'- and 3'-untranslated regions (5'-UTR and 3'-UTR). The 5'-UTR for human placental IL-2 cDNA is 294 bp, which is 247 nucleotides longer than that reported for cDNA IL-2 derived from T cells. The sequence of the coding region is identical to that reported for T cell IL-2, while sequence analysis of the polymerase chain reaction (PCR) product showed that the cDNA from the 3' end was the same as that reported for cDNA from T cells. Human placental IL-2 cDNA is 1,028 base pairs (excluding the poly A tail), which is 247 bp longer at the 5' end than that reported for IL-2 T cell cDNA. Therefore, the extended 5'-UTR of the placental IL-2 cDNA may be a consequence of alternative promoter utilization in the placenta.
Analysis of Multiallelic CNVs by Emulsion Haplotype Fusion PCR.
Tyson, Jess; Armour, John A L
2017-01-01
Emulsion-fusion PCR recovers long-range sequence information by combining products in cis from individual genomic DNA molecules. Emulsion droplets act as very numerous small reaction chambers in which different PCR products from a single genomic DNA molecule are condensed into short joint products, to unite sequences in cis from widely separated genomic sites. These products can therefore provide information about the arrangement of sequences and variants at a larger scale than established long-read sequencing methods. The method has been useful in defining the phase of variants in haplotypes, the typing of inversions, and determining the configuration of sequence variants in multiallelic CNVs. In this description we outline the rationale for the application of emulsion-fusion PCR methods to the analysis of multiallelic CNVs, and give practical details for our own implementation of the method in that context.
Integrative Clinical Genomics of Metastatic Cancer
Robinson, Dan R.; Wu, Yi-Mi; Lonigro, Robert J.; Vats, Pankaj; Cobain, Erin; Everett, Jessica; Cao, Xuhong; Rabban, Erica; Kumar-Sinha, Chandan; Raymond, Victoria; Schuetze, Scott; Alva, Ajjai; Siddiqui, Javed; Chugh, Rashmi; Worden, Francis; Zalupski, Mark M.; Innis, Jeffrey; Mody, Rajen J.; Tomlins, Scott A.; Lucas, David; Baker, Laurence H.; Ramnath, Nithya; Schott, Ann F.; Hayes, Daniel F.; Vijai, Joseph; Offit, Kenneth; Stoffel, Elena M.; Roberts, J. Scott; Smith, David C.; Kunju, Lakshmi P.; Talpaz, Moshe; Cieslik, Marcin; Chinnaiyan, Arul M.
2017-01-01
Metastasis is the primary cause of cancer-related deaths. While The Cancer Genome Atlas (TCGA) has sequenced primary tumor types obtained from surgical resections, much less comprehensive molecular analysis is available from clinically acquired metastatic cancers. Here, we perform whole exome and transcriptome sequencing of 500 adult patients with metastatic solid tumors of diverse lineage and biopsy site. The most prevalent genes somatically altered in metastatic cancer included TP53, CDKN2A, PTEN, PIK3CA, and RB1. Putative pathogenic germline variants were present in 12.2% of cases of which 75% were related to defects in DNA repair. RNA sequencing complemented DNA sequencing for the identification of gene fusions, pathway activation, and immune profiling. Integrative sequence analysis provides a clinically relevant, multi-dimensional view of the complex molecular landscape and microenvironment of metastatic cancers. PMID:28783718
van Keulen, H; Campbell, S R; Erlandsen, S L; Jarroll, E L
1991-06-01
In an attempt to study Giardia at the DNA sequence level, the rRNA genes of three species, Giardia duodenalis, Giardia ardeae and Giardia muris were cloned and restriction enzyme maps were constructed. The rDNA repeats of these Giardia show completely different restriction enzyme recognition patterns. The size of the rDNA repeat ranges from approximately 5.6 kb in G. duodenalis to 7.6 kb in both G. muris and G. ardeae. These size differences are mainly attributable to the variation in length of the spacer. Minor differences exist among these Giardia in the sizes of their small subunit rRNA and the internal transcribed spacer between small and large subunit rRNA. The genetic maps were constructed by sequence analysis of the DNA around the 5' and 3' ends of the mature rRNA genes and between the rRNA covering the 5.8S rRNA gene and internal transcribed spacer. Comparison of the 5.8S rDNA and 3' end of large subunit rDNA from these three Giardia species showed considerable sequence variation, but the rDNA sequences of G. duodenalis and G. ardeae appear more closely related to each other than to G. muris.
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.
Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T
1996-10-31
Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified as encoding ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Genetic mutation analysis of human gastric adenocarcinomas using ion torrent sequencing platform.
Xu, Zhi; Huo, Xinying; Ye, Hua; Tang, Chuanning; Nandakumar, Vijayalakshmi; Lou, Feng; Zhang, Dandan; Dong, Haichao; Sun, Hong; Jiang, Shouwen; Zhang, Guangchun; Liu, Zhiyuan; Dong, Zhishou; Guo, Baishuai; He, Yan; Yan, Chaowei; Wang, Lu; Su, Ziyi; Li, Yangyang; Gu, Dongying; Zhang, Xiaojing; Wu, Xiaomin; Wei, Xiaowei; Hong, Lingzhi; Zhang, Yangmei; Yang, Jinsong; Gong, Yonglin; Tang, Cuiju; Jones, Lindsey; Huang, Xue F; Chen, Si-Yi; Chen, Jinfei
2014-01-01
Gastric cancer is one of the major causes of cancer-related death, especially in Asia. Gastric adenocarcinoma, the most common type of gastric cancer, is heterogeneous, and its incidence and cause vary widely with geographical region, gender, ethnicity, and diet. Since unique mutations have been observed in individual human cancer samples, identification and characterization of the molecular alterations underlying individual gastric adenocarcinomas is a critical step for developing more effective, personalized therapies. Until recently, identifying genetic mutations on an individual basis by DNA sequencing remained a daunting task. Recent advances in next-generation DNA sequencing technologies, such as the semiconductor-based Ion Torrent sequencing platform, make DNA sequencing cheaper, faster, and more reliable. In this study, we aim to identify genetic mutations in genes that are targeted by drugs in clinical use or under development in individual human gastric adenocarcinoma samples using Ion Torrent sequencing. We sequenced 737 loci from 45 cancer-related genes in 238 human gastric adenocarcinoma samples using the Ion Torrent Ampliseq Cancer Panel. The sequencing analysis revealed a high occurrence of mutations along the TP53 locus (9.7%) in our sample set. Thus, this study indicates the utility of a cost- and time-efficient tool such as Ion Torrent sequencing to screen cancer mutations for the development of personalized cancer therapy.
Single Nucleobase Identification Using Biophysical Signatures from Nanoelectronic Quantum Tunneling.
Korshoj, Lee E; Afsari, Sepideh; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant
2017-03-01
Nanoelectronic DNA sequencing can provide an important alternative to sequencing-by-synthesis by reducing sample preparation time, cost, and complexity as a high-throughput next-generation technique with accurate single-molecule identification. However, sample noise and signature overlap continue to prevent high-resolution and accurate sequencing results. Probing the molecular orbitals of chemically distinct DNA nucleobases offers a path for facile sequence identification, but molecular entropy (from nucleotide conformations) makes such identification difficult when relying only on the energies of lowest-unoccupied and highest-occupied molecular orbitals (LUMO and HOMO). Here, nine biophysical parameters are developed to better characterize molecular orbitals of individual nucleobases, intended for single-molecule DNA sequencing using quantum tunneling of charges. For this analysis, theoretical models for quantum tunneling are combined with transition voltage spectroscopy to obtain measurable parameters unique to the molecule within an electronic junction. Scanning tunneling spectroscopy is then used to measure these nine biophysical parameters for DNA nucleotides, and a modified machine learning algorithm identified nucleobases. The new parameters significantly improve base calling over merely using LUMO and HOMO frontier orbital energies. Furthermore, high accuracies for identifying DNA nucleobases were observed at different pH conditions. These results have significant implications for developing a robust and accurate high-throughput nanoelectronic DNA sequencing technique. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Nanofluidic Device with Embedded Nanopore
NASA Astrophysics Data System (ADS)
Zhang, Yuning; Reisner, Walter
2014-03-01
Nanofluidic-based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel-based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanopore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key drawback of the standard nanopore configuration. We demonstrate that we can detect, using fluorescence microscopy, successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore until a certain voltage bias is added.
Tam, Annie S; Chu, Jeffrey S C; Rose, Ann M
2015-11-12
Cancer therapy largely depends on chemotherapeutic agents that generate DNA lesions. However, our understanding of the nature of the resulting lesions as well as the mutational profiles of these chemotherapeutic agents is limited. Among these lesions, DNA interstrand crosslinks are among the more toxic types of DNA damage. Here, we have characterized the mutational spectrum of the commonly used DNA interstrand crosslinking agent mitomycin C (MMC). Using a combination of genetic mapping, whole genome sequencing, and genomic analysis, we have identified and confirmed several genomic lesions linked to MMC-induced DNA damage in Caenorhabditis elegans. Our data indicate that MMC predominantly causes deletions, with a 5'-CpG-3' sequence context prevalent in the deleted regions of DNA. Furthermore, we identified microhomology flanking the deletion junctions, indicative of DNA repair via nonhomologous end joining. Based on these results, we propose a general repair mechanism that is likely to be involved in the biological response to this highly toxic agent. In conclusion, the systematic study we have described provides insight into potential sequence specificity of MMC with DNA. Copyright © 2016 Tam et al.
Ancient DNA analysis reveals woolly rhino evolutionary relationships.
Orlando, Ludovic; Leonard, Jennifer A; Thenot, Aurélie; Laudet, Vincent; Guerin, Claude; Hänni, Catherine
2003-09-01
With ancient DNA technology, DNA sequences have been added to the list of characters available to infer the phyletic position of extinct species in evolutionary trees. We have sequenced the entire 12S rRNA and partial cytochrome b (cyt b) genes of one 60-70,000-year-old sample, and partial 12S rRNA and cyt b sequences of two 40-45,000-year-old samples of the extinct woolly rhinoceros (Coelodonta antiquitatis). Based on these two mitochondrial markers, phylogenetic analyses show that C. antiquitatis is most closely related to one of the three extant Asian rhinoceros species, Dicerorhinus sumatrensis. Calculations based on a molecular clock suggest that the lineage leading to C. antiquitatis and D. sumatrensis diverged in the Oligocene, 21-26 MYA. Both results agree with morphological models deduced from palaeontological data. Nuclear inserts of mitochondrial DNA were identified in the ancient specimens. These data should encourage the use of nuclear DNA in future ancient DNA studies. It also further establishes that the degraded nature of ancient DNA does not completely protect ancient DNA studies based on mitochondrial data from the problems associated with nuclear inserts.
Unknown sequence amplification: Application to in vitro genome walking in Chlamydia trachomatis L2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Copley, C.G.; Boot, C.; Bundell, K.
1991-01-01
A recently described technique, Chemical Genetics' unknown sequence amplification method, which requires only one specific oligonucleotide, has broadened the applicability of the polymerase chain reaction to DNA of unknown sequence. The authors have adapted this technique to the study of the genome of Chlamydia trachomatis, an obligate intracellular bacterium, and describe modifications that significantly improve the utility of this approach. These techniques allow for rapid genomic analysis entirely in vitro, using DNA of limited quantity or purity.
Analysis on the DNA Fingerprinting of Aspergillus Oryzae Mutant Induced by High Hydrostatic Pressure
NASA Astrophysics Data System (ADS)
Wang, Hua; Zhang, Jian; Yang, Fan; Wang, Kai; Shen, Si-Le; Liu, Bing-Bing; Zou, Bo; Zou, Guang-Tian
2011-01-01
The mutant strains of Aspergillus oryzae (HP300a) are screened under 300 MPa for 20 min. Compared with the control strains, the screened mutant strains have unique properties such as genetic stability, rapid growth, abundant spores, and high protease activity. Random amplified polymorphic DNA (RAPD) and inter simple sequence repeats (ISSR) are used to analyze the DNA fingerprinting of HP300a and the control strains. These two markers yield 67.9% and 51.3% polymorphic bands, respectively, indicating significant genetic variation between HP300a and the control strains. In addition, comparing HP300a with the control strains, the genetic distances for random sequence and simple sequence repeat DNA are 0.51 and 0.34, respectively.
Toward rules relating zinc finger protein sequences and DNA binding site preferences.
Desjarlais, J R; Berg, J M
1992-08-15
Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
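The recognition pattern quoted above, 5'-GGGGN(G or T)GGG-3', can be written as a regular expression for scanning a sequence. The following is a minimal sketch, assuming N stands for any of A, C, G, or T and that only the given strand is searched; the function name and the example sequence are hypothetical and do not come from the paper.

```python
import re

# Pattern from the abstract: 5'-GGGGN(G or T)GGG-3', with N = any base.
# The lookahead lets overlapping occurrences be reported as well.
SITE = re.compile(r"(?=(GGGG[ACGT][GT]GGG))")

def find_binding_sites(seq):
    """Return (0-based start, matched 9-mer) for each occurrence on the given strand."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq.upper())]

# Made-up test sequence containing two sites that fit the pattern.
example = "ATGGGGCGGGGCTTGGGGATGGGAA"
for start, site in find_binding_sites(example):
    print(start, site)
```

A scan of the reverse complement would be needed to cover both strands; that step is left out here to keep the sketch short.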
Pusch, Carsten M; Bachmann, Lutz
2004-05-01
Proof of authenticity is the greatest challenge in palaeogenetic research, and many safeguards have become standard routine in laboratories specializing in ancient DNA research. Here we describe an as-yet unknown source of artifacts that will require special attention in the future. We show that ancient DNA extracts on their own can have an inhibitory and mutagenic effect under PCR. We have spiked PCR reactions including known human test DNA with 14 selected ancient DNA extracts from human and nonhuman sources. We find that the ancient DNA extracts inhibit the amplification of large fragments to different degrees, suggesting that the usual control against contaminations, i.e., the absence of long amplifiable fragments, is not sufficient. But even more important, we find that the extracts induce mutations in a nonrandom fashion. We have amplified a 148-bp stretch of the mitochondrial HVRI from contemporary human template DNA in spiked PCR reactions. Subsequent analysis of 547 sequences from cloned amplicons revealed that the vast majority (76.97%) differed from the correct sequence by single nucleotide substitutions and/or indels. In total, 34 positions of a 103-bp alignment are affected, and most mutations occur repeatedly in independent PCR amplifications. Several of the induced mutations occur at positions that have previously been detected in studies of ancient hominid sequences, including the Neandertal sequences. Our data imply that PCR-induced mutations are likely to be an intrinsic and general problem of PCR amplifications of ancient templates. Therefore, ancient DNA sequences should be considered with caution, at least as long as the molecular basis for the extract-induced mutations is not understood.
COI (cytochrome oxidase-I) sequence based studies of Carangid fishes from Kakinada coast, India.
Persis, M; Chandra Sekhar Reddy, A; Rao, L M; Khedkar, G D; Ravinder, K; Nasruddin, K
2009-09-01
Mitochondrial DNA, cytochrome oxidase-1 gene sequences were analyzed for species identification and phylogenetic relationship among the very high food value and commercially important Indian carangid fish species. Sequence analysis of COI gene very clearly indicated that all the 28 fish species fell into five distinct groups, which are genetically distant from each other and exhibited identical phylogenetic reservation. All the COI gene sequences from 28 fishes provide sufficient phylogenetic information and evolutionary relationship to distinguish the carangid species unambiguously. This study proves the utility of mtDNA COI gene sequence based approach in identifying fish species at a faster pace.
Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O
2015-10-02
The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.
Molecular Dynamics Simulations of DNA-Free and DNA-Bound TAL Effectors
Wan, Hua; Hu, Jian-ping; Li, Kang-shun; Tian, Xu-hong; Chang, Shan
2013-01-01
TAL (transcriptional activator-like) effectors (TALEs) are DNA-binding proteins, containing a modular central domain that recognizes specific DNA sequences. Recently, the crystallographic studies of TALEs revealed the structure of DNA-recognition domain. In this article, molecular dynamics (MD) simulations are employed to study two crystal structures of an 11.5-repeat TALE, in the presence and absence of DNA, respectively. The simulated results indicate that the specific binding of RVDs (repeat-variable diresidues) with DNA leads to the markedly reduced fluctuations of tandem repeats, especially at the two ends. In the DNA-bound TALE system, the base-specific interaction is formed mainly by the residue at position 13 within a TAL repeat. Tandem repeats with weak RVDs are unfavorable for the TALE-DNA binding. These observations are consistent with experimental studies. By using principal component analysis (PCA), the dominant motions are open-close movements between the two ends of the superhelical structure in both DNA-free and DNA-bound TALE systems. The open-close movements are found to be critical for the recognition and binding of TALE-DNA based on the analysis of free energy landscape (FEL). The conformational analysis of DNA indicates that the 5′ end of DNA target sequence has more remarkable structural deformability than the other sites. Meanwhile, the conformational change of DNA is likely associated with the specific interaction of TALE-DNA. We further suggest that the arrangement of N-terminal repeats with strong RVDs may help in the design of efficient TALEs. This study provides some new insights into the understanding of the TALE-DNA recognition mechanism. PMID:24130757
Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps
NASA Astrophysics Data System (ADS)
Niranjani, G.; Murugan, R.
2016-08-01
The speed of site-specific binding of transcription factor (TF) proteins with genomic DNA seems to be strongly retarded by randomly occurring sequence traps. Traps are those DNA sequences sharing significant similarity with the original specific binding sites (SBSs). It is an intriguing question how the naturally occurring TFs and their SBSs are designed to manage the retarding effects of such randomly occurring traps. We develop a simple random walk model on the site-specific binding of TFs with genomic DNA in the presence of sequence traps. Our dynamical model predicts that (a) the retarding effects of traps will be minimum when the traps are arranged around the SBS such that there is a negative correlation between the binding strength of TFs with traps and the distance of traps from the SBS and (b) the retarding effects of sequence traps can be appeased by the condensed conformational state of DNA. Our computational analysis results on the distribution of sequence traps around the putative binding sites of various TFs in mouse and human genomes agree well with the theoretical predictions. We propose that the distribution of traps can be used as an additional metric to efficiently identify the SBSs of TFs on genomic DNA.
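To make the idea of a trap slowing the search concrete, here is a toy calculation. It is not the authors' model: it assumes an unbiased nearest-neighbour walk on a 1D lattice with a reflecting site 0 and an absorbing target (standing in for the SBS) at site n_sites, and it represents a trap as a site with a longer mean dwell time. All names and numbers are hypothetical. Under these assumptions the mean first-passage time T obeys T[i] - T[i+1] = 2*w[i] + (T[i-1] - T[i]), with T[0] - T[1] = w[0], where w[i] is the mean dwell time at site i.

```python
def mean_first_passage_time(n_sites, dwell=None, start=0):
    """Mean number of steps for the walker to first reach site n_sites.

    dwell maps site index -> mean dwell time in steps (1.0 for an ordinary
    site, larger for a trap). Site 0 reflects; site n_sites absorbs.
    """
    dwell = dwell or {}
    total = 0.0
    diff = 0.0  # holds T[i] - T[i+1] for the current site i
    for i in range(n_sites):
        w = dwell.get(i, 1.0)
        # Reflecting boundary at site 0, interior recurrence elsewhere.
        diff = w if i == 0 else diff + 2.0 * w
        if i >= start:
            total += diff
    return total

n = 10
print(mean_first_passage_time(n))            # no traps
print(mean_first_passage_time(n, {2: 5.0}))  # one trap far from the target
print(mean_first_passage_time(n, {8: 5.0}))  # same trap close to the target
```

In this toy setting a trap of the same strength delays the search less when it sits near the target than when it sits far from it, which points in the same direction as prediction (a), although the sketch says nothing about DNA conformation or about the quantitative results of the paper.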
Droege, Marcus; Hill, Brendon
2008-08-31
The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven application categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research, having recently re-sequenced the genome of an individual human, currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and having detected low-frequency somatic mutations linked to cancer.
Qualitative and quantitative assessment of Illumina's forensic STR and SNP kits on MiSeq FGx™.
Sharma, Vishakha; Chow, Hoi Yan; Siegel, Donald; Wurmbach, Elisa
2017-01-01
Massively parallel sequencing (MPS) is a powerful tool transforming DNA analysis in multiple fields ranging from medicine, to environmental science, to evolutionary biology. In forensic applications, MPS offers the ability to significantly increase the discriminatory power of human identification as well as aid in mixture deconvolution. However, before the benefits of any new technology can be employed, its quality, consistency, sensitivity, and specificity must be rigorously evaluated in order to gain a detailed understanding of the technique including sources of error, error rates, and other restrictions/limitations. This extensive study assessed the performance of Illumina's MiSeq FGx MPS system and ForenSeq™ kit in nine experimental runs including 314 reaction samples. In-depth data analysis evaluated the consequences of different assay conditions on test results. Variables included: sample numbers per run, targets per run, DNA input per sample, and replications. Results are presented as heat maps revealing patterns for each locus. Data analysis focused on read numbers (allele coverage), drop-outs, drop-ins, and sequence analysis. The study revealed that loci with high read numbers performed better and resulted in fewer drop-outs and well balanced heterozygous alleles. Several loci were prone to drop-outs which led to falsely typed homozygotes and therefore to genotype errors. Sequence analysis of allele drop-in typically revealed a single nucleotide change (deletion, insertion, or substitution). Analyses of sequences, no template controls, and spurious alleles suggest no contamination during library preparation, pooling, and sequencing, but indicate that sequencing or PCR errors may have occurred due to DNA polymerase infidelities. Finally, we found that utilizing Illumina's FGx System at recommended conditions does not guarantee 100% outcomes for all samples tested, including the positive control, and that manual editing was required due to low read numbers and/or allele drop-in. These findings are important for progressing towards implementation of MPS in forensic DNA testing.
CAPRRESI: Chimera Assembly by Plasmid Recovery and Restriction Enzyme Site Insertion.
Santillán, Orlando; Ramírez-Romero, Miguel A; Dávila, Guillermo
2017-06-25
Here, we present chimera assembly by plasmid recovery and restriction enzyme site insertion (CAPRRESI). CAPRRESI benefits from many strengths of the original plasmid recovery method and introduces restriction enzyme digestion to ease DNA ligation reactions (required for chimera assembly). For this protocol, users clone wildtype genes into the same plasmid (pUC18 or pUC19). After the in silico selection of amino acid sequence regions where chimeras should be assembled, users obtain all the synonym DNA sequences that encode them. Ad hoc Perl scripts enable users to determine all synonym DNA sequences. After this step, another Perl script searches for restriction enzyme sites on all synonym DNA sequences. This in silico analysis is also performed using the ampicillin resistance gene (ampR) found on pUC18/19 plasmids. Users design oligonucleotides inside synonym regions to disrupt wildtype and ampR genes by PCR. After obtaining and purifying complementary DNA fragments, restriction enzyme digestion is accomplished. Chimera assembly is achieved by ligating appropriate complementary DNA fragments. pUC18/19 vectors are selected for CAPRRESI because they offer technical advantages, such as small size (2,686 base pairs), high copy number, advantageous sequencing reaction features, and commercial availability. The usage of restriction enzymes for chimera assembly eliminates the need for DNA polymerases yielding blunt-ended products. CAPRRESI is a fast and low-cost method for fusing protein-coding genes.
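The in silico step described above (listing every synonymous DNA sequence for a chosen amino acid region, then searching those sequences for restriction enzyme sites) can be sketched as follows. The protocol itself uses ad hoc Perl scripts that are not reproduced here; this Python version is an illustrative stand-in, and the function names, the three-enzyme list, and the example peptides are assumptions made for the sketch.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TO_AA.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

# Illustrative recognition sites only; a real search would use a full enzyme list.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def synonymous_sequences(peptide):
    """Every DNA sequence that encodes the peptide (one-letter amino acid codes)."""
    choices = (CODONS_FOR[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*choices)]

def restriction_sites(dna):
    """Map enzyme name -> 0-based start positions of its site on the given strand."""
    hits = {}
    for name, site in SITES.items():
        positions = [i for i in range(len(dna) - len(site) + 1)
                     if dna.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits

# Two-residue example: Glu-Phe can be encoded so that it carries an EcoRI site.
for dna in synonymous_sequences("EF"):
    print(dna, restriction_sites(dna))
```

The number of synonymous sequences is the product of the codon counts of the residues, so it grows quickly with region length; for anything beyond a short window, iterating over `product` lazily instead of building a list would be the safer choice.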
Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti
2016-08-01
The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving through concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location for this sequence was established for some species, but little molecular information was obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species aiming to identify evolutionary dynamics. For both species, two arrays were identified, a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller (named type-II) with truncated 5S rDNA gene plus short NTS that was considered a pseudogene. For type-I sequences, the gene-corresponding region contained the internal control region and poly-T motif and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From the chromosomal point of view, the type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.
Initial steps towards a production platform for DNA sequence analysis on the grid.
Luyf, Angela C M; van Schaik, Barbera D C; de Vries, Michel; Baas, Frank; van Kampen, Antoine H C; Olabarriaga, Silvia D
2010-12-14
Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis become a problem on local servers, and therefore it is necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/
Tan, L; Wang, H; Li, C; Pan, Y
2014-12-01
Acute exacerbations of chronic obstructive pulmonary disease (AE-COPD) are leading causes of mortality in hospital intensive care units. We sought to determine whether dental plaque biofilms might harbor pathogenic bacteria that can eventually cause lung infections in patients with severe AE-COPD. Paired samples of subgingival plaque biofilm and tracheal aspirate were collected from 53 patients with severe AE-COPD. Total bacterial DNA was extracted from each sample individually for polymerase chain reaction amplification and/or generation of bacterial 16S rDNA sequences and cDNA libraries. We used a metagenomic approach, based on bacterial 16S rDNA sequences, to compare the distribution of species present in dental plaque and lung. Analysis of 1060 sequences (20 clones per patient) revealed a wide range of aerobic, anaerobic, pathogenic, opportunistic, novel and uncultivable bacterial species. Species indistinguishable between the paired subgingival plaque and tracheal aspirate samples (97-100% similarity in 16S rDNA sequence) were dental plaque pathogens (Aggregatibacter actinomycetemcomitans, Capnocytophaga sputigena, Porphyromonas gingivalis, Tannerella forsythia and Treponema denticola) and lung pathogens (Acinetobacter baumannii, Klebsiella pneumoniae, Pseudomonas aeruginosa and Streptococcus pneumoniae). Real-time polymerase chain reaction of 16S rDNA indicated lower levels of Pseudomonas aeruginosa and Porphyromonas gingivalis colonizing the dental plaques compared with the paired tracheal aspirate samples. These results support the hypothesis that dental bacteria may contribute to the pathology of severe AE-COPD. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.