dna sequencing final: Topics by Science.gov

Sample records for dna sequencing final

A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes

ERIC Educational Resources Information Center

Christensen, Doug

2009-01-01

An inexpensive and equipment free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor with a familiarity of DNA sequencing technology but provides a straight forward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…
Development of a Novel Technology for Label Free DNA Sequencing

DTIC Science & Technology

2012-05-21

of the C-H bond stretch vibrations in the planes of the corresponding DNA bases , and in the higher-frequency side, sequence-identifier region is...composed of the N-H bond stretch vibrations in the planes of the corresponding DNA bases . In addition, the sequence-identifier dividing region almost...regions are localized at the corresponding DNA bases and exhibit a definable dependence on the sequence form of the codons under study. Final
Advances in high throughput DNA sequence data compression.

PubMed

Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

2016-06-01

Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
Sequence verification as quality-control step for production of cDNA microarrays.

PubMed

Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W

2001-07-01

To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.
Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

PubMed

Berger, C; Berger, B; Parson, W

2012-01-01

In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.
Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

PubMed

Aigrain, Louise; Gu, Yong; Quail, Michael A

2016-06-13

The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced bases and ease to produce DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for the users to determine which kit is the most appropriate and efficient for their applications; the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol steps using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparations efficiencies of the libraries show important variations between the different kits with the ones combining several steps into a single one exhibiting some final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that it could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs leads to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusion on which kits is more likely to preserve the sample heterogeneity and reduce the need of amplification.
Automated sample-preparation technologies in genome sequencing projects.

PubMed

Hilbert, H; Lauber, J; Lubenow, H; Düsterhöft, A

2000-01-01

A robotic workstation system (BioRobot 96OO, QIAGEN) and a 96-well UV spectrophotometer (Spectramax 250, Molecular Devices) were integrated in to the process of high-throughput automated sequencing of double-stranded plasmid DNA templates. An automated 96-well miniprep kit protocol (QIAprep Turbo, QIAGEN) provided high-quality plasmid DNA from shotgun clones. The DNA prepared by this procedure was used to generate more than two mega bases of final sequence data for two genomic projects (Arabidopsis thaliana and Schizosaccharomyces pombe), three thousand expressed sequence tags (ESTs) plus half a mega base of human full-length cDNA clones, and approximately 53,000 single reads for a whole genome shotgun project (Pseudomonas putida).
Optimized mtDNA Control Region Primer Extension Capture Analysis for Forensically Relevant Samples and Highly Compromised mtDNA of Different Age and Origin

PubMed Central

Eduardoff, Mayra; Xavier, Catarina; Strobl, Christina; Casas-Vargas, Andrea; Parson, Walther

2017-01-01

The analysis of mitochondrial DNA (mtDNA) has proven useful in forensic genetics and ancient DNA (aDNA) studies, where specimens are often highly compromised and DNA quality and quantity are low. In forensic genetics, the mtDNA control region (CR) is commonly sequenced using established Sanger-type Sequencing (STS) protocols involving fragment sizes down to approximately 150 base pairs (bp). Recent developments include Massively Parallel Sequencing (MPS) of (multiplex) PCR-generated libraries using the same amplicon sizes. Molecular genetic studies on archaeological remains that harbor more degraded aDNA have pioneered alternative approaches to target mtDNA, such as capture hybridization and primer extension capture (PEC) methods followed by MPS. These assays target smaller mtDNA fragment sizes (down to 50 bp or less), and have proven to be substantially more successful in obtaining useful mtDNA sequences from these samples compared to electrophoretic methods. Here, we present the modification and optimization of a PEC method, earlier developed for sequencing the Neanderthal mitochondrial genome, with forensic applications in mind. Our approach was designed for a more sensitive enrichment of the mtDNA CR in a single tube assay and short laboratory turnaround times, thus complying with forensic practices. We characterized the method using sheared, high quantity mtDNA (six samples), and tested challenging forensic samples (n = 2) as well as compromised solid tissue samples (n = 15) up to 8 kyrs of age. The PEC MPS method produced reliable and plausible mtDNA haplotypes that were useful in the forensic context. It yielded plausible data in samples that did not provide results with STS and other MPS techniques. We addressed the issue of contamination by including four generations of negative controls, and discuss the results in the forensic context. We finally offer perspectives for future research to enable the validation and accreditation of the PEC MPS method for final implementation in forensic genetic laboratories. PMID:28934125
Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

PubMed

Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

2009-06-01

The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

PubMed

Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

2015-08-01

Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations have been absolute quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Barcode extension for analysis and reconstruction of structures

NASA Astrophysics Data System (ADS)

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng

2017-03-01

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Barcode extension for analysis and reconstruction of structures.

PubMed

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

2017-03-13

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Barcode extension for analysis and reconstruction of structures

PubMed Central

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

2017-01-01

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. PMID:28287117
Combined subtraction hybridization and polymerase chain reaction amplification procedure for isolation of strain-specific Rhizobium DNA sequences.

PubMed Central

Bjourson, A J; Stone, C E; Cooper, J E

1992-01-01

A novel subtraction hybridization procedure, incorporating a combination of four separation strategies, was developed to isolate unique DNA sequences from a strain of Rhizobium leguminosarum bv. trifolii. Sau3A-digested DNA from this strain, i.e., the probe strain, was ligated to a linker and hybridized in solution with an excess of pooled subtracter DNA from seven other strains of the same biovar which had been restricted, ligated to a different, biotinylated, subtracter-specific linker, and amplified by polymerase chain reaction to incorporate dUTP. Subtracter DNA and subtracter-probe hybrids were removed by phenol-chloroform extraction of a streptavidin-biotin-DNA complex. NENSORB chromatography of the sequences remaining in the aqueous layer captured biotinylated subtracter DNA which may have escaped removal by phenol-chloroform treatment. Any traces of contaminating subtracter DNA were removed by digestion with uracil DNA glycosylase. Finally, remaining sequences were amplified by polymerase chain reaction with a probe strain-specific primer, labelled with 32P, and tested for specificity in dot blot hybridizations against total genomic target DNA from each strain in the subtracter pool. Two rounds of subtraction-amplification were sufficient to remove cross-hybridizing sequences and to give a probe which hybridized only with homologous target DNA. The method is applicable to the isolation of DNA and RNA sequences from both procaryotic and eucaryotic cells. Images PMID:1637166
Statistical properties of DNA sequences

NASA Technical Reports Server (NTRS)

Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

1995-01-01

We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
In the Absence of Writhe, DNA Relieves Torsional Stress with Localized, Sequence-Dependent Structural Failure to Preserve B-form

DOE Office of Scientific and Technical Information (OSTI.GOV)

Randall, Graham L.; Zechiedrich, E. L.; Pettitt, Bernard M.

2009-09-01

To understand how underwinding and overwinding the DNA helix affects its structure, we simulated 19 independent DNA systems with fixed degrees of twist using molecular dynamics in a system that does not allow writhe. Underwinding DNA induced spontaneous, sequence-dependent base flipping and local denaturation, while overwinding DNA induced the formation of Pauling-like DNA (P-DNA). The winding resulted in a bimodal state simultaneously including local structural failure and B-form DNA for both underwinding and extreme overwinding. Our simulations suggest that base flipping and local denaturation may provide a landscape influencing protein recognition of DNA sequence to affect, for examples, replication, transcriptionmore » and recombination. Additionally, our findings help explain results from singlemolecule experiments and demonstrate that elastic rod models are strictly valid on average only for unstressed or overwound DNA up to P-DNA formation. Finally, our data support a model in which base flipping can result from torsional stress.« less
Molecular dynamics studies on the DNA-binding process of ERG.

PubMed

Beuerle, Matthias G; Dufton, Neil P; Randi, Anna M; Gould, Ian R

2016-11-15

The ETS family of transcription factors regulate gene targets by binding to a core GGAA DNA-sequence. The ETS factor ERG is required for homeostasis and lineage-specific functions in endothelial cells, some subset of haemopoietic cells and chondrocytes; its ectopic expression is linked to oncogenesis in multiple tissues. To date details of the DNA-binding process of ERG including DNA-sequence recognition outside the core GGAA-sequence are largely unknown. We combined available structural and experimental data to perform molecular dynamics simulations to study the DNA-binding process of ERG. In particular we were able to reproduce the ERG DNA-complex with a DNA-binding simulation starting in an unbound configuration with a final root-mean-square-deviation (RMSD) of 2.1 Å to the core ETS domain DNA-complex crystal structure. This allowed us to elucidate the relevance of amino acids involved in the formation of the ERG DNA-complex and to identify Arg385 as a novel key residue in the DNA-binding process. Moreover we were able to show that water-mediated hydrogen bonds are present between ERG and DNA in our simulations and that those interactions have the potential to achieve sequence recognition outside the GGAA core DNA-sequence. The methodology employed in this study shows the promising capabilities of modern molecular dynamics simulations in the field of protein DNA-interactions.
Enhanced sequencing coverage with digital droplet multiple displacement amplification

PubMed Central

Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

2016-01-01

Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
High-resolution biophysical analysis of the dynamics of nucleosome formation

PubMed Central

Hatakeyama, Akiko; Hartmann, Brigitte; Travers, Andrew; Nogues, Claude; Buckle, Malcolm

2016-01-01

We describe a biophysical approach that enables changes in the structure of DNA to be followed during nucleosome formation in in vitro reconstitution with either the canonical “Widom” sequence or a judiciously mutated sequence. The rapid non-perturbing photochemical analysis presented here provides ‘snapshots’ of the DNA configuration at any given moment in time during nucleosome formation under a very broad range of reaction conditions. Changes in DNA photochemical reactivity upon protein binding are interpreted as being mainly induced by alterations in individual base pair roll angles. The results strengthen the importance of the role of an initial (H3/H4)2 histone tetramer-DNA interaction and highlight the modulation of this early event by the DNA sequence. (H3/H4)2 binding precedes and dictates subsequent H2A/H2B-DNA interactions, which are less affected by the DNA sequence, leading to the final octameric nucleosome. Overall, our results provide a novel, exciting way to investigate those biophysical properties of DNA that constitute a crucial component in nucleosome formation and stabilization. PMID:27263658
HMG-D is an architecture-specific protein that preferentially binds to DNA containing the dinucleotide TG.

PubMed Central

Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A

1995-01-01

The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. Images PMID:7720717

Chemical biology on the genome.

PubMed

Balasubramanian, Shankar

2014-08-15

In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cels with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.
Comparison of DNA Microarray, Loop-Mediated Isothermal Amplification (LAMP) and Real-Time PCR with DNA Sequencing for Identification of Fusarium spp. Obtained from Patients with Hematologic Malignancies.

PubMed

de Souza, Marcela; Matsuzawa, Tetsuhiro; Sakai, Kanae; Muraosa, Yasunori; Lyra, Luzia; Busso-Lopes, Ariane Fidelis; Levin, Anna Sara Shafferman; Schreiber, Angélica Zaninelli; Mikami, Yuzuru; Gonoi, Tohoru; Kamei, Katsuhiko; Moretti, Maria Luiza; Trabasso, Plínio

2017-08-01

The performance of three molecular biology techniques, i.e., DNA microarray, loop-mediated isothermal amplification (LAMP), and real-time PCR were compared with DNA sequencing for properly identification of 20 isolates of Fusarium spp. obtained from blood stream as etiologic agent of invasive infections in patients with hematologic malignancies. DNA microarray, LAMP and real-time PCR identified 16 (80%) out of 20 samples as Fusarium solani species complex (FSSC) and four (20%) as Fusarium spp. The agreement among the techniques was 100%. LAMP exhibited 100% specificity, while DNA microarray, LAMP and real-time PCR showed 100% sensitivity. The three techniques had 100% agreement with DNA sequencing. Sixteen isolates were identified as FSSC by sequencing, being five Fusarium keratoplasticum, nine Fusarium petroliphilum and two Fusarium solani. On the other hand, sequencing identified four isolates as Fusarium non-solani species complex (FNSSC), being three isolates as Fusarium napiforme and one isolate as Fusarium oxysporum. Finally, LAMP proved to be faster and more accessible than DNA microarray and real-time PCR, since it does not require a thermocycler. Therefore, LAMP signalizes as emerging and promising methodology to be used in routine identification of Fusarium spp. among cases of invasive fungal infections.
Performing SELEX experiments in silico

NASA Astrophysics Data System (ADS)

Wondergem, J. A. J.; Schiessel, H.; Tompitak, M.

2017-11-01

Due to the sequence-dependent nature of the elasticity of DNA, many protein-DNA complexes and other systems in which DNA molecules must be deformed have preferences for the type of DNA sequence they interact with. SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments and similar sequence selection experiments have been used extensively to examine the (indirect readout) sequence preferences of, e.g., nucleosomes (protein spools around which DNA is wound for compactification) and DNA rings. We show how recently developed computational and theoretical tools can be used to emulate such experiments in silico. Opening up this possibility comes with several benefits. First, it allows us a better understanding of our models and systems, specifically about the roles played by the simulation temperature and the selection pressure on the sequences. Second, it allows us to compare the predictions made by the model of choice with experimental results. We find agreement on important features between predictions of the rigid base-pair model and experimental results for DNA rings and interesting differences that point out open questions in the field. Finally, our simulations allow application of the SELEX methodology to systems that are experimentally difficult to realize because they come with high energetic costs and are therefore unlikely to form spontaneously, such as very short or overwound DNA rings.
Marine Fungi: Their Ecology and Molecular Diversity

NASA Astrophysics Data System (ADS)

Richards, Thomas A.; Jones, Meredith D. M.; Leonard, Guy; Bass, David

2012-01-01

Fungi appear to be rare in marine environments. There are relatively few marine isolates in culture, and fungal small subunit ribosomal DNA (SSU rDNA) sequences are rarely recovered in marine clone library experiments (i.e., culture-independent sequence surveys of eukaryotic microbial diversity from environmental DNA samples). To explore the diversity of marine fungi, we took a broad selection of SSU rDNA data sets and calculated a summary phylogeny. Bringing these data together identified a diverse collection of marine fungi, including sequences branching close to chytrids (flagellated fungi), filamentous hypha-forming fungi, and multicellular fungi. However, the majority of the sequences branched with ascomycete and basidiomycete yeasts. We discuss evidence for 36 novel marine lineages, the majority and most divergent of which branch with the chytrids. We then investigate what these data mean for the evolutionary history of the Fungi and specifically marine-terrestrial transitions. Finally, we discuss the roles of fungi in marine ecosystems.
The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

PubMed

Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske

2007-02-14

The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
[Influence of PCR cycle number on microbial diversity analysis through next generation sequencing].

PubMed

An, Yunhe; Gao, Lijuan; Li, Junbo; Tian, Yanjie; Wang, Jinlong; Zheng, Xuejuan; Wu, Huijuan

2016-08-25

Using of high throughput sequencing technology to study the microbial diversity in complex samples has become one of the hottest issues in the field of microbial diversity research. In this study, the soil and sheep rumen chyme samples were used to extract DNA, respectively. Then the 25 ng total DNA was used to amplify the 16S rRNA V3 region with 20, 25, 30 PCR cycles, and the final sequencing library was constructed by mixing equal amounts of purified PCR products. Finally, the operational taxonomic unit (OUT) amount, rarefaction curve, microbial number and species were compared through data analysis. It was found that at the same amount of DNA template, the proportion of the community composition was not the best with more numbers of PCR cycle, although the species number was much more. In all, when the PCR cycle number is 25, the number of species and proportion of the community composition were the most optimal both in soil or chyme samples.
Continuous Influx of Genetic Material from Host to Virus Populations

PubMed Central

Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane

2016-01-01

Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors. PMID:26829124
Continuous Influx of Genetic Material from Host to Virus Populations.

PubMed

Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane; Cordaux, Richard; Herniou, Elisabeth A

2016-02-01

Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors.
CpG PatternFinder: a Windows-based utility program for easy and rapid identification of the CpG methylation status of DNA.

PubMed

Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C

2007-09-01

The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status to the order of a single nucleotide. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files as well as adding extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. As a result CpG PatternFinder was developed for this purpose. The main functions of the CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information. (ii) To tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment. (iii) To align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status. And, (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, C.A.

1994-09-01

This Report concludes the DOE Human Genome Program project, ``Identification of Genes in Anonymous DNA Sequence.`` The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.
Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).

PubMed

Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E

2005-12-02

cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.
Origin and composition of cell-free DNA in spent medium from human embryo culture during preimplantation development.

PubMed

Vera-Rodriguez, M; Diez-Juan, A; Jimenez-Almazan, J; Martinez, S; Navarro, R; Peinado, V; Mercader, A; Meseguer, M; Blesa, D; Moreno, I; Valbuena, D; Rubio, C; Simon, C

2018-04-01

What is the origin and composition of cell-free DNA in human embryo spent culture media? Cell-free DNA from human embryo spent culture media represents a mix of maternal and embryonic DNA, and the mixture can be more complex for mosaic embryos. In 2016, ~300 000 human embryos were chromosomally and/or genetically analyzed using preimplantation genetic testing for aneuploidies (PGT-A) or monogenic disorders (PGT-M) before transfer into the uterus. While progress in genetic techniques has enabled analysis of the full karyotype in a single cell with high sensitivity and specificity, these approaches still require an embryo biopsy. Thus, non-invasive techniques are sought as an alternative. This study was based on a total of 113 human embryos undergoing trophectoderm biopsy as part of PGT-A analysis. For each embryo, the spent culture media used between Day 3 and Day 5 of development were collected for cell-free DNA analysis. In addition to the 113 spent culture media samples, 28 media drops without embryo contact were cultured in parallel under the same conditions to use as controls. In total, 141 media samples were collected and divided into two groups: one for direct DNA quantification (53 spent culture media and 17 controls), the other for whole-genome amplification (60 spent culture media and 11 controls) and subsequent quantification. Some samples with amplified DNA (N = 56) were used for aneuploidy testing by next-generation sequencing; of those, 35 samples underwent single-nucleotide polymorphism (SNP) sequencing to detect maternal contamination. Finally, from the 35 spent culture media analyzed by SNP sequencing, 12 whole blastocysts were analyzed by fluorescence in situ hybridization (FISH) to determine the level of mosaicism in each embryo, as a possible origin for discordance between sample types. Trophectoderm biopsies and culture media samples (20 μl) underwent whole-genome amplification, then libraries were generated and sequenced for an aneuploidy study. For SNP sequencing, triads including trophectoderm DNA, cell-free DNA, and follicular fluid DNA were analyzed. In total, 124 SNPs were included with 90 SNPs distributed among all autosomes and 34 SNPs located on chromosome Y. Finally, 12 whole blastocysts were fixed and individual cells were analyzed by FISH using telomeric/centromeric probes for the affected chromosomes. We found a higher quantity of cell-free DNA in spent culture media co-cultured with embryos versus control media samples (P ≤ 0.001). The presence of cell-free DNA in the spent culture media enabled a chromosomal diagnosis, although results differed from those of trophectoderm biopsy analysis in most cases (67%). Discordant results were mainly attributable to a high percentage of maternal DNA in the spent culture media, with a median percentage of embryonic DNA estimated at 8%. Finally, from the discordant cases, 91.7% of whole blastocysts analyzed by FISH were mosaic and 75% of the analyzed chromosomes were concordant with the trophectoderm DNA diagnosis instead of the cell-free DNA result. This study was limited by the sample size and the number of cells analyzed by FISH. This is the first study to combine chromosomal analysis of cell-free DNA, SNP sequencing to identify maternal contamination, and whole-blastocyst analysis for detecting mosaicism. Our results provide a better understanding of the origin of cell-free DNA in spent culture media, offering an important step toward developing future non-invasive karyotyping that must rely on the specific identification of DNA released from human embryos. This work was funded by Igenomix S.L. There are no competing interests.
Chaotic Image Encryption Algorithm Based on Bit Permutation and Dynamic DNA Encoding.

PubMed

Zhang, Xuncai; Han, Feng; Niu, Ying

2017-01-01

With the help of the fact that chaos is sensitive to initial conditions and pseudorandomness, combined with the spatial configurations in the DNA molecule's inherent and unique information processing ability, a novel image encryption algorithm based on bit permutation and dynamic DNA encoding is proposed here. The algorithm first uses Keccak to calculate the hash value for a given DNA sequence as the initial value of a chaotic map; second, it uses a chaotic sequence to scramble the image pixel locations, and the butterfly network is used to implement the bit permutation. Then, the image is coded into a DNA matrix dynamic, and an algebraic operation is performed with the DNA sequence to realize the substitution of the pixels, which further improves the security of the encryption. Finally, the confusion and diffusion properties of the algorithm are further enhanced by the operation of the DNA sequence and the ciphertext feedback. The results of the experiment and security analysis show that the algorithm not only has a large key space and strong sensitivity to the key but can also effectively resist attack operations such as statistical analysis and exhaustive analysis.
Chaotic Image Encryption Algorithm Based on Bit Permutation and Dynamic DNA Encoding

PubMed Central

2017-01-01

With the help of the fact that chaos is sensitive to initial conditions and pseudorandomness, combined with the spatial configurations in the DNA molecule's inherent and unique information processing ability, a novel image encryption algorithm based on bit permutation and dynamic DNA encoding is proposed here. The algorithm first uses Keccak to calculate the hash value for a given DNA sequence as the initial value of a chaotic map; second, it uses a chaotic sequence to scramble the image pixel locations, and the butterfly network is used to implement the bit permutation. Then, the image is coded into a DNA matrix dynamic, and an algebraic operation is performed with the DNA sequence to realize the substitution of the pixels, which further improves the security of the encryption. Finally, the confusion and diffusion properties of the algorithm are further enhanced by the operation of the DNA sequence and the ciphertext feedback. The results of the experiment and security analysis show that the algorithm not only has a large key space and strong sensitivity to the key but can also effectively resist attack operations such as statistical analysis and exhaustive analysis. PMID:28912802
Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

PubMed

Higgins, Sean A; Savage, David F

2018-01-09

A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.
Re-sequencing transgenic plants revealed rearrangements at T-DNA inserts, and integration of a short T-DNA fragment, but no increase of small mutations elsewhere.

PubMed

Schouten, Henk J; Vande Geest, Henri; Papadimitriou, Sofia; Bemer, Marian; Schaart, Jan G; Smulders, Marinus J M; Perez, Gabino Sanchez; Schijlen, Elio

2017-03-01

Transformation resulted in deletions and translocations at T-DNA inserts, but not in genome-wide small mutations. A tiny T-DNA splinter was detected that probably would remain undetected by conventional techniques. We investigated to which extent Agrobacterium tumefaciens-mediated transformation is mutagenic, on top of inserting T-DNA. To prevent mutations due to in vitro propagation, we applied floral dip transformation of Arabidopsis thaliana. We re-sequenced the genomes of five primary transformants, and compared these to genomic sequences derived from a pool of four wild-type plants. By genome-wide comparisons, we identified ten small mutations in the genomes of the five transgenic plants, not correlated to the positions or number of T-DNA inserts. This mutation frequency is within the range of spontaneous mutations occurring during seed propagation in A. thaliana, as determined earlier. In addition, we detected small as well as large deletions specifically at the T-DNA insert sites. Furthermore, we detected partial T-DNA inserts, one of these a tiny 50-bp fragment originating from a central part of the T-DNA construct used, inserted into the plant genome without flanking other T-DNA. Because of its small size, we named this fragment a T-DNA splinter. As far as we know this is the first report of such a small T-DNA fragment insert in absence of any T-DNA border sequence. Finally, we found evidence for translocations from other chromosomes, flanking T-DNA inserts. In this study, we showed that next-generation sequencing (NGS) is a highly sensitive approach to detect T-DNA inserts in transgenic plants.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Initial Characterization of the Pf-Int Recombinase from the Malaria Parasite Plasmodium falciparum

PubMed Central

Ghorbal, Mehdi; Scheidig-Benatar, Christine; Bouizem, Salma; Thomas, Christophe; Paisley, Genevieve; Faltermeier, Claire; Liu, Melanie; Scherf, Artur; Lopez-Rubio, Jose-Juan; Gopaul, Deshmukh N.

2012-01-01

Background Genetic variation is an essential means of evolution and adaptation in many organisms in response to environmental change. Certain DNA alterations can be carried out by site-specific recombinases (SSRs) that fall into two families: the serine and the tyrosine recombinases. SSRs are seldom found in eukaryotes. A gene homologous to a tyrosine site-specific recombinase has been identified in the genome of Plasmodium falciparum. The sequence is highly conserved among five other members of Plasmodia. Methodology/Principal Findings The predicted open reading frame encodes for a ∼57 kDa protein containing a C-terminal domain including the putative tyrosine recombinase conserved active site residues R-H-R-(H/W)-Y. The N-terminus has the typical alpha-helical bundle and potentially a mixed alpha-beta domain resembling that of λ-Int. Pf-Int mRNA is expressed differentially during the P. falciparum erythrocytic life stages, peaking in the schizont stage. Recombinant Pf-Int and affinity chromatography of DNA from genomic or synthetic origin were used to identify potential DNA targets after sequencing or micro-array hybridization. Interestingly, the sequences captured also included highly variable subtelomeric genes such as var, rif, and stevor sequences. Electrophoretic mobility shift assays with DNA were carried out to verify Pf-Int/DNA binding. Finally, Pf-Int knock-out parasites were created in order to investigate the biological role of Pf-Int. Conclusions/Significance Our data identify for the first time a malaria parasite gene with structural and functional features of recombinases. Pf-Int may bind to and alter DNA, either in a sequence specific or in a non-specific fashion, and may contribute to programmed or random DNA rearrangements. Pf-Int is the first molecular player identified with a potential role in genome plasticity in this pathogen. Finally, Pf-Int knock-out parasite is viable showing no detectable impact on blood stage development, which is compatible with such function. PMID:23056326
The twilight zone of cis element alignments.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-02-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
The twilight zone of cis element alignments

PubMed Central

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-01-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451

Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

PubMed Central

Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.

2013-01-01

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Inferring coarse-grain histone-DNA interaction potentials from high-resolution structures of the nucleosome

NASA Astrophysics Data System (ADS)

Meyer, Sam; Everaers, Ralf

2015-02-01

The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
Environmental DNA metabarcoding: Transforming how we survey animal and plant communities.

PubMed

Deiner, Kristy; Bik, Holly M; Mächler, Elvira; Seymour, Mathew; Lacoursière-Roussel, Anaïs; Altermatt, Florian; Creer, Simon; Bista, Iliana; Lodge, David M; de Vere, Natasha; Pfrender, Michael E; Bernatchez, Louis

2017-11-01

The genomic revolution has fundamentally changed how we survey biodiversity on earth. High-throughput sequencing ("HTS") platforms now enable the rapid sequencing of DNA from diverse kinds of environmental samples (termed "environmental DNA" or "eDNA"). Coupling HTS with our ability to associate sequences from eDNA with a taxonomic name is called "eDNA metabarcoding" and offers a powerful molecular tool capable of noninvasively surveying species richness from many ecosystems. Here, we review the use of eDNA metabarcoding for surveying animal and plant richness, and the challenges in using eDNA approaches to estimate relative abundance. We highlight eDNA applications in freshwater, marine and terrestrial environments, and in this broad context, we distill what is known about the ability of different eDNA sample types to approximate richness in space and across time. We provide guiding questions for study design and discuss the eDNA metabarcoding workflow with a focus on primers and library preparation methods. We additionally discuss important criteria for consideration of bioinformatic filtering of data sets, with recommendations for increasing transparency. Finally, looking to the future, we discuss emerging applications of eDNA metabarcoding in ecology, conservation, invasion biology, biomonitoring, and how eDNA metabarcoding can empower citizen science and biodiversity education. © 2017 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Compressing DNA sequence databases with coil.

PubMed

White, W Timothy J; Hendy, Michael D

2008-05-20

Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
Compressing DNA sequence databases with coil

PubMed Central

White, W Timothy J; Hendy, Michael D

2008-01-01

Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
Threading DNA through nanopores for biosensing applications

NASA Astrophysics Data System (ADS)

Fyta, Maria

2015-07-01

This review outlines the recent achievements in the field of nanopore research. Nanopores are typically used in single-molecule experiments and are believed to have a high potential to realize an ultra-fast and very cheap genome sequencer. Here, the various types of nanopore materials, ranging from biological to 2D nanopores are discussed together with their advantages and disadvantages. These nanopores can utilize different protocols to read out the DNA nucleobases. Although, the first nanopore devices have reached the market, many still have issues which do not allow a full realization of a nanopore sequencer able to sequence the human genome in about a day. Ways to control the DNA, its dynamics and speed as the biomolecule translocates the nanopore in order to increase the signal-to-noise ratio in the reading-out process are examined in this review. Finally, the advantages, as well as the drawbacks in distinguishing the DNA nucleotides, i.e., the genetic information, are presented in view of their importance in the field of nanopore sequencing.
Nanowire-nanopore transistor sensor for DNA detection during translocation

NASA Astrophysics Data System (ADS)

Xie, Ping; Xiong, Qihua; Fang, Ying; Qing, Quan; Lieber, Charles

2011-03-01

Nanopore sequencing, as a promising low cost, high throughput sequencing technique, has been proposed more than a decade ago. Due to the incompatibility between small ionic current signal and fast translocation speed and the technical difficulties on large scale integration of nanopore for direct ionic current sequencing, alternative methods rely on integrated DNA sensors have been proposed, such as using capacitive coupling or tunnelling current etc. But none of them have been experimentally demonstrated yet. Here we show that for the first time an amplified sensor signal has been experimentally recorded from a nanowire-nanopore field effect transistor sensor during DNA translocation. Independent multi-channel recording was also demonstrated for the first time. Our results suggest that the signal is from highly localized potential change caused by DNA translocation in none-balanced buffer condition. Given this method may produce larger signal for smaller nanopores, we hope our experiment can be a starting point for a new generation of nanopore sequencing devices with larger signal, higher bandwidth and large-scale multiplexing capability and finally realize the ultimate goal of low cost high throughput sequencing.
Directing folding pathways for multi-component DNA origami nanostructures with complex topology

NASA Astrophysics Data System (ADS)

Marras, A. E.; Zhou, L.; Kolliopoulos, V.; Su, H.-J.; Castro, C. E.

2016-05-01

Molecular self-assembly has become a well-established technique to design complex nanostructures and hierarchical mesoscale assemblies. The typical approach is to design binding complementarity into nucleotide or amino acid sequences to achieve the desired final geometry. However, with an increasing interest in dynamic nanodevices, the need to design structures with motion has necessitated the development of multi-component structures. While this has been achieved through hierarchical assembly of similar structural units, here we focus on the assembly of topologically complex structures, specifically with concentric components, where post-folding assembly is not feasible. We exploit the ability to direct folding pathways to program the sequence of assembly and present a novel approach of designing the strand topology of intermediate folding states to program the topology of the final structure, in this case a DNA origami slider structure that functions much like a piston-cylinder assembly in an engine. The ability to program the sequence and control orientation and topology of multi-component DNA origami nanostructures provides a foundation for a new class of structures with internal and external moving parts and complex scaffold topology. Furthermore, this work provides critical insight to guide the design of intermediate states along a DNA origami folding pathway and to further understand the details of DNA origami self-assembly to more broadly control folding states and landscapes.
Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach

NASA Astrophysics Data System (ADS)

Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan

2013-02-01

Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediments samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in cDNA dataset indicate the limits of our database and lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompt its use in future high-throughput sequencing-based environmental surveys.
High Mitochondrial DNA Stability in B-Cell Chronic Lymphocytic Leukemia

PubMed Central

Cerezo, María; Bandelt, Hans-Jürgen; Martín-Guerrero, Idoia; Ardanaz, Maite; Vega, Ana; Carracedo, Ángel; García-Orad, África; Salas, Antonio

2009-01-01

Background Chronic Lymphocytic Leukemia (CLL) leads to progressive accumulation of lymphocytes in the blood, bone marrow, and lymphatic tissues. Previous findings have suggested that the mtDNA could play an important role in CLL. Methodology/Principal Findings The mitochondrial DNA (mtDNA) control-region was analyzed in lymphocyte cell DNA extracts and compared with their granulocyte counterpart extract of 146 patients suffering from B-Cell CLL; B-CLL (all recruited from the Basque country). Major efforts were undertaken to rule out methodological artefacts that would render a high false positive rate for mtDNA instabilities and thus lead to erroneous interpretation of sequence instabilities. Only twenty instabilities were finally confirmed, most of them affecting the homopolymeric stretch located in the second hypervariable segment (HVS-II) around position 310, which is well known to constitute an extreme mutational hotspot of length polymorphism, as these mutations are frequently observed in the general human population. A critical revision of the findings in previous studies indicates a lack of proper methodological standards, which eventually led to an overinterpretation of the role of the mtDNA in CLL tumorigenesis. Conclusions/Significance Our results suggest that mtDNA instability is not the primary causal factor in B-CLL. A secondary role of mtDNA mutations cannot be fully ruled out under the hypothesis that the progressive accumulation of mtDNA instabilities could finally contribute to the tumoral process. Recommendations are given that would help to minimize erroneous interpretation of sequencing results in mtDNA studies in tumorigenesis. PMID:19924307
Helix Unwinding and Base Flipping Enable Human MTERF1 to Terminate Mitochondrial Transcription

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yakubovskaya, E.; Mejia, E; Byrnes, J

2010-01-01

Defects in mitochondrial gene expression are associated with aging and disease. Mterf proteins have been implicated in modulating transcription, replication and protein synthesis. We have solved the structure of a member of this family, the human mitochondrial transcriptional terminator MTERF1, bound to dsDNA containing the termination sequence. The structure indicates that upon sequence recognition MTERF1 unwinds the DNA molecule, promoting eversion of three nucleotides. Base flipping is critical for stable binding and transcriptional termination. Additional structural and biochemical results provide insight into the DNA binding mechanism and explain how MTERF1 recognizes its target sequence. Finally, we have demonstrated that themore » mitochondrial pathogenic G3249A and G3244A mutations interfere with key interactions for sequence recognition, eliminating termination. Our results provide insight into the role of mterf proteins and suggest a link between mitochondrial disease and the regulation of mitochondrial transcription.« less
Preparation of Low-Input and Ligation-Free ChIP-seq Libraries Using Template-Switching Technology.

PubMed

Bolduc, Nathalie; Lehman, Alisa P; Farmer, Andrew

2016-10-10

Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) has become the gold standard for mapping of transcription factors and histone modifications throughout the genome. However, for ChIP experiments involving few cells or targeting low-abundance transcription factors, the small amount of DNA recovered makes ligation of adapters very challenging. In this unit, we describe a ChIP-seq workflow that can be applied to small cell numbers, including a robust single-tube and ligation-free method for preparation of sequencing libraries from sub-nanogram amounts of ChIP DNA. An example ChIP protocol is first presented, resulting in selective enrichment of DNA-binding proteins and cross-linked DNA fragments immobilized on beads via an antibody bridge. This is followed by a protocol for fast and easy cross-linking reversal and DNA recovery. Finally, we describe a fast, ligation-free library preparation protocol, featuring DNA SMART technology, resulting in samples ready for Illumina sequencing. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.
Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

PubMed Central

Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

2015-01-01

Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407
Statistical and linguistic features of DNA sequences

NASA Technical Reports Server (NTRS)

Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

1995-01-01

We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
The primary structures of two yeast enolase genes. Homology between the 5' noncoding flanking regions of yeast enolase and glyceraldehyde-3-phosphate dehydrogenase genes.

PubMed

Holland, M J; Holland, J P; Thill, G P; Jackson, K A

1981-02-10

Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing messenger RNA. Finally there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5- noncoding portions of these glycolytic genes.
Chromosome evolution in the Thermotogales: large-scale inversions and strain diversification of CRISPR sequences.

PubMed

DeBoy, Robert T; Mongodin, Emmanuel F; Emerson, Joanne B; Nelson, Karen E

2006-04-01

In the present study, the chromosomes of two members of the Thermotogales were compared. A whole-genome alignment of Thermotoga maritima MSB8 and Thermotoga neapolitana NS-E has revealed numerous large-scale DNA rearrangements, most of which are associated with CRISPR DNA repeats and/or tRNA genes. These DNA rearrangements do not include the putative origin of DNA replication but move within the same replichore, i.e., the same replicating half of the chromosome (delimited by the replication origin and terminus). Based on cumulative GC skew analysis, both the T. maritima and T. neapolitana lineages contain one or two major inverted DNA segments. Also, based on PCR amplification and sequence analysis of the DNA joints that are associated with the major rearrangements, the overall chromosome architecture was found to be conserved at most DNA joints for other strains of T. neapolitana. Taken together, the results from this analysis suggest that the observed chromosomal rearrangements in the Thermotogales likely occurred by successive inversions after their divergence from a common ancestor and before strain diversification. Finally, sequence analysis shows that size polymorphisms in the DNA joints associated with CRISPRs can be explained by expansion and possibly contraction of the DNA repeat and spacer unit, providing a tool for discerning the relatedness of strains from different geographic locations.
A novel chaos-based image encryption algorithm using DNA sequence operations

NASA Astrophysics Data System (ADS)

Chai, Xiuli; Chen, Yiran; Broyde, Lucie

2017-01-01

An image encryption algorithm based on chaotic system and deoxyribonucleic acid (DNA) sequence operations is proposed in this paper. First, the plain image is encoded into a DNA matrix, and then a new wave-based permutation scheme is performed on it. The chaotic sequences produced by 2D Logistic chaotic map are employed for row circular permutation (RCP) and column circular permutation (CCP). Initial values and parameters of the chaotic system are calculated by the SHA 256 hash of the plain image and the given values. Then, a row-by-row image diffusion method at DNA level is applied. A key matrix generated from the chaotic map is used to fuse the confused DNA matrix; also the initial values and system parameters of the chaotic system are renewed by the hamming distance of the plain image. Finally, after decoding the diffused DNA matrix, we obtain the cipher image. The DNA encoding/decoding rules of the plain image and the key matrix are determined by the plain image. Experimental results and security analyses both confirm that the proposed algorithm has not only an excellent encryption result but also resists various typical attacks.
Cloud-based adaptive exon prediction for DNA analysis.

PubMed

Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

2018-02-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under conventional flow of gene information, gene sequence laboratories send out raw and inferred information via Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and design drugs. Three base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from National Center for Biotechnology Information genomic sequence database.
Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries.

PubMed

Lam, Kathy N; Charles, Trevor C

2015-01-01

Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ(70) promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ(70) consensus sequences in the genome than with simple GC content. The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ(70) consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.
Plant centromeres: structure and control.

PubMed

Richards, E J; Dawe, R K

1998-04-01

Recent work has led to a better understanding of the molecular components of plant centromeres. Conservation of at least some centromere protein constituents between plant and non-plant systems has been demonstrated. The identity and organization of plant centromeric DNA sequences are also beginning to yield to analysis. While there is little primary DNA sequence conservation among the characterized plant centromeres and their non-plant counterparts, some parallels in centromere genomic organisation can be seen across species. Finally, the emerging idea that centromere activity is controlled epigenetically finds support in an examination of the plant centromere literature.

DNA gel electrophoresis: the reptation model(s).

PubMed

Slater, Gary W

2009-06-01

DNA gel electrophoresis has been the most important experimental tool to separate DNA fragments for several decades. The introduction of PFGE in the 1980s and capillary gel electrophoresis in the 1990s made it possible to study, map and sequence entire genomes. Explaining how very large DNA molecules move in a gel and why PFGE is needed to separate them has been an active field of research ever since the launch of the journal Electrophoresis. This article presents a personal and historical overview of the development of the theory of gel electrophoresis, focusing on the reptation model, the band broadening mechanisms, and finally the factors that limit the read length and the resolution of electrophoresis-based sequencing systems. I conclude with a short discussion of some of the questions that remain unanswered.
Predicting DNA hybridization kinetics from sequence

NASA Astrophysics Data System (ADS)

Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu

2018-01-01

Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
Ultra-low background DNA cloning system.

PubMed

Goto, Kenta; Nagano, Yukio

2013-01-01

Yeast-based in vivo cloning is useful for cloning DNA fragments into plasmid vectors and is based on the ability of yeast to recombine the DNA fragments by homologous recombination. Although this method is efficient, it produces some by-products. We have developed an "ultra-low background DNA cloning system" on the basis of yeast-based in vivo cloning, by almost completely eliminating the generation of by-products and applying the method to commonly used Escherichia coli vectors, particularly those lacking yeast replication origins and carrying an ampicillin resistance gene (Amp(r)). First, we constructed a conversion cassette containing the DNA sequences in the following order: an Amp(r) 5' UTR (untranslated region) and coding region, an autonomous replication sequence and a centromere sequence from yeast, a TRP1 yeast selectable marker, and an Amp(r) 3' UTR. This cassette allowed conversion of the Amp(r)-containing vector into the yeast/E. coli shuttle vector through use of the Amp(r) sequence by homologous recombination. Furthermore, simultaneous transformation of the desired DNA fragment into yeast allowed cloning of this DNA fragment into the same vector. We rescued the plasmid vectors from all yeast transformants, and by-products containing the E. coli replication origin disappeared. Next, the rescued vectors were transformed into E. coli and the by-products containing the yeast replication origin disappeared. Thus, our method used yeast- and E. coli-specific "origins of replication" to eliminate the generation of by-products. Finally, we successfully cloned the DNA fragment into the vector with almost 100% efficiency.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved

PubMed Central

Long, Hannah K.; King, Hamish W.; Patient, Roger K.; Odom, Duncan T.; Klose, Robert J.

2016-01-01

DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. PMID:27084945
Simple diazonium chemistry to develop specific gene sensing platforms.

PubMed

Revenga-Parra, M; García-Mendiola, T; González-Costas, J; González-Romero, E; Marín, A García; Pau, J L; Pariente, F; Lorenzo, E

2014-02-27

A simple strategy for covalent immobilizing DNA sequences, based on the formation of stable diazonized conducting platforms, is described. The electrochemical reduction of 4-nitrobenzenediazonium salt onto screen-printed carbon electrodes (SPCE) in aqueous media gives rise to terminal grafted amino groups. The presence of primary aromatic amines allows the formation of diazonium cations capable to react with the amines present at the DNA capture probe. As a comparison a second strategy based on the binding of aminated DNA capture probes to the developed diazonized conducting platforms through a crosslinking agent was also employed. The resulting DNA sensing platforms were characterized by cyclic voltammetry, electrochemical impedance spectroscopy and spectroscopic ellipsometry. The hybridization event with the complementary sequence was detected using hexaamineruthenium (III) chloride as electrochemical indicator. Finally, they were applied to the analysis of a 145-bp sequence from the human gene MRP3, reaching a detection limit of 210 pg μL(-1). Copyright © 2014 Elsevier B.V. All rights reserved.
mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

PubMed

Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

2016-07-08

Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Novel division level bacterial diversity in a Yellowstone hot spring.

PubMed

Hugenholtz, P; Pitulle, C; Hershberger, K L; Pace, N R

1998-01-01

A culture-independent molecular phylogenetic survey was carried out for the bacterial community in Obsidian Pool (OP), a Yellowstone National Park hot spring previously shown to contain remarkable archaeal diversity (S. M. Barns, R. E. Fundyga, M. W. Jeffries, and N. R. Page, Proc. Natl. Acad. Sci. USA 91:1609-1613, 1994). Small-subunit rRNA genes (rDNA) were amplified directly from OP sediment DNA by PCR with universally conserved or Bacteria-specific rDNA primers and cloned. Unique rDNA types among > 300 clones were identified by restriction fragment length polymorphism, and 122 representative rDNA sequences were determined. These were found to represent 54 distinct bacterial sequence types or clusters (> or = 98% identity) of sequences. A majority (70%) of the sequence types were affiliated with 14 previously recognized bacterial divisions (main phyla; kingdoms); 30% were unaffiliated with recognized bacterial divisions. The unaffiliated sequence types (represented by 38 sequences) nominally comprise 12 novel, division level lineages termed candidate divisions. Several OP sequences were nearly identical to those of cultivated chemolithotrophic thermophiles, including the hydrogen-oxidizing Calderobacterium and the sulfate reducers Thermodesulfovibrio and Thermodesulfobacterium, or belonged to monophyletic assemblages recognized for a particular type of metabolism, such as the hydrogen-oxidizing Aquificales and the sulfate-reducing delta-Proteobacteria. The occurrence of such organisms is consistent with the chemical composition of OP (high in reduced iron and sulfur) and suggests a lithotrophic base for primary productivity in this hot spring, through hydrogen oxidation and sulfate reduction. Unexpectedly, no archaeal sequences were encountered in OP clone libraries made with universal primers. Hybridization analysis of amplified OP DNA with domain-specific probes confirmed that the analyzed community rDNA from OP sediment was predominantly bacterial. These results expand substantially our knowledge of the extent of bacterial diversity and call into question the commonly held notion that Archaea dominate hydrothermal environments. Finally, the currently known extent of division level bacterial phylogenetic diversity is collated and summarized.
CDSbank: taxonomy-aware extraction, selection, renaming and formatting of protein-coding DNA or amino acid sequences.

PubMed

Hazes, Bart

2014-02-28

Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in Genbank. CDSbank also stores Genbank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extract protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.
Simian virus 40 major late promoter: an upstream DNA sequence required for efficient in vitro transcription.

PubMed Central

Brady, J; Radonovich, M; Thoren, M; Das, G; Salzman, N P

1984-01-01

We have previously identified an 11-base DNA sequence, 5'-G-G-T-A-C-C-T-A-A-C-C-3' (simian virus 40 [SV40] map position 294 to 304), which is important in the control of SV40 late RNA expression in vitro and in vivo (Brady et al., Cell 31:625-633, 1982). We report here the identification of another domain of the SV40 late promoter. A series of mutants with deletions extending from SV40 map position 0 to 300 was prepared by nuclease BAL 31 treatment. The cloned templates were then analyzed for efficiency and accuracy of late SV40 RNA expression in the Manley in vitro transcription system. Our studies showed that, in addition to the promoter domain near map position 300, there are essential DNA sequences between nucleotide positions 74 and 95 that are required for efficient expression of late SV40 RNA. Included in this SV40 DNA sequence were two of the six GGGCGG SV40 repeat sequences and an 11-nucleotide segment which showed strong homology with the upstream sequences required for the efficient in vitro and in vivo expression of the histone H2A gene. This upstream promoter sequence supported transcription with the same efficiency even when it was moved 72 nucleotides closer to the major late cap site. In vitro promoter competition analysis demonstrated that the upstream promoter sequence, independent of the 294 to 304 promoter element, is capable of binding polymerase-transcription factors required for SV40 late gene transcription. Finally, we show that DNA sequences which control the specificity of RNA initiation at nucleotide 325 lie downstream of map position 294. Images PMID:6321950
Mitochondrial Mutations in Subjects with Psychiatric Disorders

PubMed Central

Magnan, Christophe; van Oven, Mannis; Baldi, Pierre; Myers, Richard M.; Barchas, Jack D.; Schatzberg, Alan F.; Watson, Stanley J.; Akil, Huda; Bunney, William E.; Vawter, Marquis P.

2015-01-01

A considerable body of evidence supports the role of mitochondrial dysfunction in psychiatric disorders and mitochondrial DNA (mtDNA) mutations are known to alter brain energy metabolism, neurotransmission, and cause neurodegenerative disorders. Genetic studies focusing on common nuclear genome variants associated with these disorders have produced genome wide significant results but those studies have not directly studied mtDNA variants. The purpose of this study is to investigate, using next generation sequencing, the involvement of mtDNA variation in bipolar disorder, schizophrenia, major depressive disorder, and methamphetamine use. MtDNA extracted from multiple brain regions and blood were sequenced (121 mtDNA samples with an average of 8,800x coverage) and compared to an electronic database containing 26,850 mtDNA genomes. We confirmed novel and rare variants, and confirmed next generation sequencing error hotspots by traditional sequencing and genotyping methods. We observed a significant increase of non-synonymous mutations found in individuals with schizophrenia. Novel and rare non-synonymous mutations were found in psychiatric cases in mtDNA genes: ND6, ATP6, CYTB, and ND2. We also observed mtDNA heteroplasmy in brain at a locus previously associated with schizophrenia (T16519C). Large differences in heteroplasmy levels across brain regions within subjects suggest that somatic mutations accumulate differentially in brain regions. Finally, multiplasmy, a heteroplasmic measure of repeat length, was observed in brain from selective cases at a higher frequency than controls. These results offer support for increased rates of mtDNA substitutions in schizophrenia shown in our prior results. The variable levels of heteroplasmic/multiplasmic somatic mutations that occur in brain may be indicators of genetic instability in mtDNA. PMID:26011537
Nullomers and High Order Nullomers in Genomic Sequences

PubMed Central

Vergni, Davide; Santoni, Daniele

2016-01-01

A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or it is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomers sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpGs islands), so that the hypermutability model, also taking into account CpG islands, seems to be not sufficient to explain nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications. PMID:27906971
Nanopore sequencing in microgravity

PubMed Central

McIntyre, Alexa B R; Rizzardi, Lindsay; Yu, Angela M; Alexander, Noah; Rosen, Gail L; Botkin, Douglas J; Stahl, Sarah E; John, Kristen K; Castro-Wallace, Sarah L; McGrath, Ken; Burton, Aaron S; Feinberg, Andrew P; Mason, Christopher E

2016-01-01

Rapid DNA sequencing and analysis has been a long-sought goal in remote research and point-of-care medicine. In microgravity, DNA sequencing can facilitate novel astrobiological research and close monitoring of crew health, but spaceflight places stringent restrictions on the mass and volume of instruments, crew operation time, and instrument functionality. The recent emergence of portable, nanopore-based tools with streamlined sample preparation protocols finally enables DNA sequencing on missions in microgravity. As a first step toward sequencing in space and aboard the International Space Station (ISS), we tested the Oxford Nanopore Technologies MinION during a parabolic flight to understand the effects of variable gravity on the instrument and data. In a successful proof-of-principle experiment, we found that the instrument generated DNA reads over the course of the flight, including the first ever sequenced in microgravity, and additional reads measured after the flight concluded its parabolas. Here we detail modifications to the sample-loading procedures to facilitate nanopore sequencing aboard the ISS and in other microgravity environments. We also evaluate existing analysis methods and outline two new approaches, the first based on a wave-fingerprint method and the second on entropy signal mapping. Computationally light analysis methods offer the potential for in situ species identification, but are limited by the error profiles (stays, skips, and mismatches) of older nanopore data. Higher accuracies attainable with modified sample processing methods and the latest version of flow cells will further enable the use of nanopore sequencers for diagnostics and research in space. PMID:28725742
Decision Tree Algorithm-Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging.

PubMed

Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei

2018-01-01

DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as straightforward species tags, the identification usage of barcode sequences is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies provide us an idea that the SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than the full length of DNA barcode sequences for species discrimination. To address this issue, we tested a r ibulose diphosphate carboxylase ( rbcL ) S NP b arcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were computed to set up the decision rule to assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using RSB method. Taken together, this study provides the rational that the SNP aspect of DNA barcode for rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
VaDiR: an integrated approach to Variant Detection in RNA.

PubMed

Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

2018-02-01

Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA(VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
A Personal Journey of Discovery: Developing Technology and Changing Biology

NASA Astrophysics Data System (ADS)

Hood, Lee

2008-07-01

This autobiographical article describes my experiences in developing chemically based, biological technologies for deciphering biological information: DNA, RNA, proteins, interactions, and networks. The instruments developed include protein and DNA sequencers and synthesizers, as well as ink-jet technology for synthesizing DNA chips. Diverse new strategies for doing biology also arose from novel applications of these instruments. The functioning of these instruments can be integrated to generate powerful new approaches to cloning and characterizing genes from a small amount of protein sequence or to using gene sequences to synthesize peptide fragments so as to characterize various properties of the proteins. I also discuss the five paradigm changes in which I have participated: the development and integration of biological instrumentation; the human genome project; cross-disciplinary biology; systems biology; and predictive, personalized, preventive, and participatory (P4) medicine. Finally, I discuss the origins, the philosophy, some accomplishments, and the future trajectories of the Institute for Systems Biology.
Analysis of methylated patterns and quality-related genes in tobacco (Nicotiana tabacum) cultivars.

PubMed

Jiao, Junna; Jia, Yanlong; Lv, Zhuangwei; Sun, Chuanfei; Gao, Lijie; Yan, Xiaoxiao; Cui, Liusu; Tang, Zongxiang; Yan, Benju

2014-08-01

Methylation-sensitive amplified polymorphism was used in this study to investigate epigenetic information of four tobacco cultivars: Yunyan 85, NC89, K326, and Yunyan 87. The DNA fragments with methylated information were cloned by reamplified PCR and sequenced. The results of Blast alignments showed that the genes with methylation information included chitinase, nitrate reductase, chloroplast DNA, mitochondrial DNA, ornithine decarboxylase, ribulose carboxylase, and promoter sequences. Homologous comparison in three cloned gene sequences (nitrate reductase, ornithine decarboxylase, and ribulose decarboxylase) indicated that geographic factors had significant influence on the whole genome methylation. Introns also contained different information in different tobacco cultivars. These findings suggest that synthetic mechanisms for tobacco aromatic components could be affected by different environmental factors leading to variation of noncoding regions in the genome, which finally results in different fragrance and taste in different tobacco cultivars.
Diversity of halophilic archaea from six hypersaline environments in Turkey.

PubMed

Ozcan, Birgul; Ozcengiz, Gulay; Coleri, Arzu; Cokmus, Cumhur

2007-06-01

The diversity of archaeal strains from six hypersaline environments in Turkey was analyzed by comparing their phenotypic characteristics and 16S rDNA sequences. Thirty-three isolates were characterized in terms of their phenotypic properties including morphological and biochemical characteristics, susceptibility to different antibiotics, and total lipid and plasmid contents, and finally compared by 16S rDNA gene sequences. The results showed that all isolates belong to the family Halobacteriaceae. Phylogenetic analyses using approximately 1,388 bp comparisions of 16S rDNA sequences demonstrated that all isolates clustered closely to species belonging to 9 genera, namely Halorubrum (8 isolates), Natrinema (5 isolates), Haloarcula (4 isolates), Natronococcus (4 isolates), Natrialba (4 isolates), Haloferax (3 isolates), Haloterrigena (3 isolates), Halalkalicoccus (1 isolate), and Halomicrobium (1 isolate). The results revealed a high diversity among the isolated halophilic strains and indicated that some of these strains constitute new taxa of extremely halophilic archaea.
A programmable Cas9-serine recombinase fusion protein that operates on DNA sequences in mammalian cells

PubMed Central

Chaikind, Brian; Bessen, Jeffrey L.; Thompson, David B.; Hu, Johnny H.; Liu, David R.

2016-01-01

We describe the development of ‘recCas9’, an RNA-programmed small serine recombinase that functions in mammalian cells. We fused a catalytically inactive dCas9 to the catalytic domain of Gin recombinase using an optimized fusion architecture. The resulting recCas9 system recombines DNA sites containing a minimal recombinase core site flanked by guide RNA-specified sequences. We show that these recombinases can operate on DNA sites in mammalian cells identical to genomic loci naturally found in the human genome in a manner that is dependent on the guide RNA sequences. DNA sequencing reveals that recCas9 catalyzes guide RNA-dependent recombination in human cells with an efficiency as high as 32% on plasmid substrates. Finally, we demonstrate that recCas9 expressed in human cells can catalyze in situ deletion between two genomic sites. Because recCas9 directly catalyzes recombination, it generates virtually no detectable indels or other stochastic DNA modification products. This work represents a step toward programmable, scarless genome editing in unmodified cells that is independent of endogenous cellular machinery or cell state. Current and future generations of recCas9 may facilitate targeted agricultural breeding, or the study and treatment of human genetic diseases. PMID:27515511
Quantum mechanical calculations related to ionization and charge transfer in DNA

NASA Astrophysics Data System (ADS)

Cauët, E.; Valiev, M.; Weare, J. H.; Liévin, J.

2012-07-01

Ionization and charge migration in DNA play crucial roles in mechanisms of DNA damage caused by ionizing radiation, oxidizing agents and photo-irradiation. Therefore, an evaluation of the ionization properties of the DNA bases is central to the full interpretation and understanding of the elementary reactive processes that occur at the molecular level during the initial exposure and afterwards. Ab initio quantum mechanical (QM) methods have been successful in providing highly accurate evaluations of key parameters, such as ionization energies (IE) of DNA bases. Hence, in this study, we performed high-level QM calculations to characterize the molecular energy levels and potential energy surfaces, which shed light on ionization and charge migration between DNA bases. In particular, we examined the IEs of guanine, the most easily oxidized base, isolated and embedded in base clusters, and investigated the mechanism of charge migration over two and three stacked guanines. The IE of guanine in the human telomere sequence has also been evaluated. We report a simple molecular orbital analysis to explain how modifications in the base sequence are expected to change the efficiency of the sequence as a hole trap. Finally, the application of a hybrid approach combining quantum mechanics with molecular mechanics brings an interesting discussion as to how the native aqueous DNA environment affects the IE threshold of nucleobases.
Genotype Specification Language.

PubMed

Wilson, Erin H; Sagawa, Shiori; Weis, James W; Schubert, Max G; Bissell, Michael; Hawthorne, Brian; Reeves, Christopher D; Dean, Jed; Platt, Darren

2016-06-17

We describe here the Genotype Specification Language (GSL), a language that facilitates the rapid design of large and complex DNA constructs used to engineer genomes. The GSL compiler implements a high-level language based on traditional genetic notation, as well as a set of low-level DNA manipulation primitives. The language allows facile incorporation of parts from a library of cloned DNA constructs and from the "natural" library of parts in fully sequenced and annotated genomes. GSL was designed to engage genetic engineers in their native language while providing a framework for higher level abstract tooling. To this end we define four language levels, Level 0 (literal DNA sequence) through Level 3, with increasing abstraction of part selection and construction paths. GSL targets an intermediate language based on DNA slices that translates efficiently into a wide range of final output formats, such as FASTA and GenBank, and includes formats that specify instructions and materials such as oligonucleotide primers to allow the physical construction of the GSL designs by individual strain engineers or an automated DNA assembly core facility.

Cloud-based adaptive exon prediction for DNA analysis

PubMed Central

Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

2018-01-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under conventional flow of gene information, gene sequence laboratories send out raw and inferred information via Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and design drugs. Three base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from National Center for Biotechnology Information genomic sequence database. PMID:29515813
Second generation noninvasive fetal genome analysis reveals de novo mutations, single-base parental inheritance, and preferred DNA ends

PubMed Central

Chan, K. C. Allen; Jiang, Peiyong; Sun, Kun; Cheng, Yvonne K. Y.; Tong, Yu K.; Cheng, Suk Hang; Wong, Ada I. C.; Hudecova, Irena; Leung, Tak Y.; Chiu, Rossa W. K.; Lo, Yuk Ming Dennis

2016-01-01

Plasma DNA obtained from a pregnant woman was sequenced to a depth of 270× haploid genome coverage. Comparing the maternal plasma DNA sequencing data with the parental genomic DNA data and using a series of bioinformatics filters, fetal de novo mutations were detected at a sensitivity of 85% and a positive predictive value of 74%. These results represent a 169-fold improvement in the positive predictive value over previous attempts. Improvements in the interpretation of the sequence information of every base position in the genome allowed us to interrogate the maternal inheritance of the fetus for 618,271 of 656,676 (94.2%) heterozygous SNPs within the maternal genome. The fetal genotype at each of these sites was deduced individually, unlike previously, where the inheritance was determined for a collection of sites within a haplotype. These results represent a 90-fold enhancement in the resolution in determining the fetus’s maternal inheritance. Selected genomic locations were more likely to be found at the ends of plasma DNA molecules. We found that a subset of such preferred ends exhibited selectivity for fetal- or maternal-derived DNA in maternal plasma. The ratio of the number of maternal plasma DNA molecules with fetal preferred ends to those with maternal preferred ends showed a correlation with the fetal DNA fraction. Finally, this second generation approach for noninvasive fetal whole-genome analysis was validated in a pregnancy diagnosed with cardiofaciocutaneous syndrome with maternal plasma DNA sequenced to 195× coverage. The causative de novo BRAF mutation was successfully detected through the maternal plasma DNA analysis. PMID:27799561
Automated selection of synthetic biology parts for genetic regulatory networks.

PubMed

Yaman, Fusun; Bhatia, Swapnil; Adler, Aaron; Densmore, Douglas; Beal, Jacob

2012-08-17

Raising the level of abstraction for synthetic biology design requires solving several challenging problems, including mapping abstract designs to DNA sequences. In this paper we present the first formalism and algorithms to address this problem. The key steps of this transformation are feature matching, signal matching, and part matching. Feature matching ensures that the mapping satisfies the regulatory relationships in the abstract design. Signal matching ensures that the expression levels of functional units are compatible. Finally, part matching finds a DNA part sequence that can implement the design. Our software tool MatchMaker implements these three steps.
Scaling features of noncoding DNA

NASA Technical Reports Server (NTRS)

Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.

1999-01-01

We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome

PubMed Central

2013-01-01

Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved.

PubMed

Long, Hannah K; King, Hamish W; Patient, Roger K; Odom, Duncan T; Klose, Robert J

2016-08-19

DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Biorecognition by DNA oligonucleotides after Exposure to Photoresists and Resist Removers

PubMed Central

Dean, Stacey L.; Morrow, Thomas J.; Patrick, Sue; Li, Mingwei; Clawson, Gary; Mayer, Theresa S.; Keating, Christine D.

2013-01-01

Combining biological molecules with integrated circuit technology is of considerable interest for next generation sensors and biomedical devices. Current lithographic microfabrication methods, however, were developed for compatibility with silicon technology rather than bioorganic molecules and consequently it cannot be assumed that biomolecules will remain attached and intact during on-chip processing. Here, we evaluate the effects of three common photoresists (Microposit S1800 series, PMGI SF6, and Megaposit SPR 3012) and two photoresist removers (acetone and 1165 remover) on the ability of surface-immobilized DNA oligonucleotides to selectively recognize their reverse-complementary sequence. Two common DNA immobilization methods were compared: adsorption of 5′-thiolated sequences directly to gold nanowires and covalent attachment of 5′-thiolated sequences to surface amines on silica coated nanowires. We found that acetone had deleterious effects on selective hybridization as compared to 1165 remover, presumably due to incomplete resist removal. Use of the PMGI photoresist, which involves a high temperature bake step, was detrimental to the later performance of nanowire-bound DNA in hybridization assays, especially for DNA attached via thiol adsorption. The other three photoresists did not substantially degrade DNA binding capacity or selectivity for complementary DNA sequences. To determine if the lithographic steps caused more subtle damage, we also tested oligonucleotides containing a single base mismatch. Finally, a two-step photolithographic process was developed and used in combination with dielectrophoretic nanowire assembly to produce an array of doubly-contacted, electrically isolated individual nanowire components on a chip. Post-fabrication fluorescence imaging indicated that nanowire-bound DNA was present and able to selectively bind complementary strands. PMID:23952639
Superstatistical model of bacterial DNA architecture

NASA Astrophysics Data System (ADS)

Bogachev, Mikhail I.; Markelov, Oleg A.; Kayumov, Airat R.; Bunde, Armin

2017-02-01

Understanding the physical principles that govern the complex DNA structural organization as well as its mechanical and thermodynamical properties is essential for the advancement in both life sciences and genetic engineering. Recently we have discovered that the complex DNA organization is explicitly reflected in the arrangement of nucleotides depicted by the universal power law tailed internucleotide interval distribution that is valid for complete genomes of various prokaryotic and eukaryotic organisms. Here we suggest a superstatistical model that represents a long DNA molecule by a series of consecutive ~150 bp DNA segments with the alternation of the local nucleotide composition between segments exhibiting long-range correlations. We show that the superstatistical model and the corresponding DNA generation algorithm explicitly reproduce the laws governing the empirical nucleotide arrangement properties of the DNA sequences for various global GC contents and optimal living temperatures. Finally, we discuss the relevance of our model in terms of the DNA mechanical properties. As an outlook, we focus on finding the DNA sequences that encode a given protein while simultaneously reproducing the nucleotide arrangement laws observed from empirical genomes, that may be of interest in the optimization of genetic engineering of long DNA molecules.
Guidelines for whole genome bisulphite sequencing of intact and FFPET DNA on the Illumina HiSeq X Ten.

PubMed

Nair, Shalima S; Luu, Phuc-Loi; Qu, Wenjia; Maddugoda, Madhavi; Huschtscha, Lily; Reddel, Roger; Chenevix-Trench, Georgia; Toso, Martina; Kench, James G; Horvath, Lisa G; Hayes, Vanessa M; Stricker, Phillip D; Hughes, Timothy P; White, Deborah L; Rasko, John E J; Wong, Justin J-L; Clark, Susan J

2018-05-28

Comprehensive genome-wide DNA methylation profiling is critical to gain insights into epigenetic reprogramming during development and disease processes. Among the different genome-wide DNA methylation technologies, whole genome bisulphite sequencing (WGBS) is considered the gold standard for assaying genome-wide DNA methylation at single base resolution. However, the high sequencing cost to achieve the optimal depth of coverage limits its application in both basic and clinical research. To achieve 15× coverage of the human methylome, using WGBS, requires approximately three lanes of 100-bp-paired-end Illumina HiSeq 2500 sequencing. It is important, therefore, for advances in sequencing technologies to be developed to enable cost-effective high-coverage sequencing. In this study, we provide an optimised WGBS methodology, from library preparation to sequencing and data processing, to enable 16-20× genome-wide coverage per single lane of HiSeq X Ten, HCS 3.3.76. To process and analyse the data, we developed a WGBS pipeline (METH10X) that is fast and can call SNPs. We performed WGBS on both high-quality intact DNA and degraded DNA from formalin-fixed paraffin-embedded tissue. First, we compared different library preparation methods on the HiSeq 2500 platform to identify the best method for sequencing on the HiSeq X Ten. Second, we optimised the PhiX and genome spike-ins to achieve higher quality and coverage of WGBS data on the HiSeq X Ten. Third, we performed integrated whole genome sequencing (WGS) and WGBS of the same DNA sample in a single lane of HiSeq X Ten to improve data output. Finally, we compared methylation data from the HiSeq 2500 and HiSeq X Ten and found high concordance (Pearson r > 0.9×). Together we provide a systematic, efficient and complete approach to perform and analyse WGBS on the HiSeq X Ten. Our protocol allows for large-scale WGBS studies at reasonable processing time and cost on the HiSeq X Ten platform.
Sequence verification of synthetic DNA by assembly of sequencing reads

PubMed Central

Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean

2013-01-01

Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248
Gene and genon concept: coding versus regulation

PubMed Central

2007-01-01

We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon. PMID:18087760
Colorimetric and dynamic light scattering detection of DNA sequences by using positively charged gold nanospheres: a comparative study with gold nanorods

NASA Astrophysics Data System (ADS)

Pylaev, T. E.; Khanadeev, V. A.; Khlebtsov, B. N.; Dykman, L. A.; Bogatyrev, V. A.; Khlebtsov, N. G.

2011-07-01

We introduce a new genosensing approach employing CTAB (cetyltrimethylammonium bromide)-coated positively charged colloidal gold nanoparticles (GNPs) to detect target DNA sequences by using absorption spectroscopy and dynamic light scattering. The approach is compared with a previously reported method employing unmodified CTAB-coated gold nanorods (GNRs). Both approaches are based on the observation that whereas the addition of probe and target ssDNA to CTAB-coated particles results in particle aggregation, no aggregation is observed after addition of probe and nontarget DNA sequences. Our goal was to compare the feasibility and sensitivity of both methods. A 21-mer ssDNA from the human immunodeficiency virus type 1 HIV-1 U5 long terminal repeat (LTR) sequence and a 23-mer ssDNA from the Bacillus anthracis cryptic protein and protective antigen precursor (pagA) genes were used as ssDNA models. In the case of GNRs, unexpectedly, the colorimetric test failed with perfect cigar-like particles but could be performed with dumbbell and dog-bone rods. By contrast, our approach with cationic CTAB-coated GNPs is easy to implement and possesses excellent feasibility with retention of comparable sensitivity—a 0.1 nM concentration of target cDNA can be detected with the naked eye and 10 pM by dynamic light scattering (DLS) measurements. The specificity of our method is illustrated by successful DLS detection of one-three base mismatches in cDNA sequences for both DNA models. These results suggest that the cationic GNPs and DLS can be used for genosensing under optimal DNA hybridization conditions without any chemical modifications of the particle surface with ssDNA molecules and signal amplification. Finally, we discuss a more than two-three-order difference in the reported estimations of the detection sensitivity of colorimetric methods (0.1 to 10-100 pM) to show that the existing aggregation models are inconsistent with the detection limits of about 0.1-1 pM DNA and that other explanations should be developed.
Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation

PubMed Central

Mekler, Vladimir; Minakhin, Leonid; Severinov, Konstantin

2017-01-01

The prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR)-associated 9 (Cas9) endonuclease cleaves double-stranded DNA sequences specified by guide RNA molecules and flanked by a protospacer adjacent motif (PAM) and is widely used for genome editing in various organisms. The RNA-programmed Cas9 locates the target site by scanning genomic DNA. We sought to elucidate the mechanism of initial DNA interrogation steps that precede the pairing of target DNA with guide RNA. Using fluorometric and biochemical assays, we studied Cas9/guide RNA complexes with model DNA substrates that mimicked early intermediates on the pathway to the final Cas9/guide RNA–DNA complex. The results show that Cas9/guide RNA binding to PAM favors separation of a few PAM-proximal protospacer base pairs allowing initial target interrogation by guide RNA. The duplex destabilization is mediated, in part, by Cas9/guide RNA affinity for unpaired segments of nontarget strand DNA close to PAM. Furthermore, our data indicate that the entry of double-stranded DNA beyond a short threshold distance from PAM into the Cas9/single-guide RNA (sgRNA) interior is hindered. We suggest that the interactions unfavorable for duplex DNA binding promote DNA bending in the PAM-proximal region during early steps of Cas9/guide RNA–DNA complex formation, thus additionally destabilizing the protospacer duplex. The mechanism that emerges from our analysis explains how the Cas9/sgRNA complex is able to locate the correct target sequence efficiently while interrogating numerous nontarget sequences associated with correct PAMs. PMID:28484024
Mechanism of duplex DNA destabilization by RNA-guided Cas9 nuclease during target interrogation.

PubMed

Mekler, Vladimir; Minakhin, Leonid; Severinov, Konstantin

2017-05-23

The prokaryotic clustered regularly interspaced short palindromic repeats (CRISPR)-associated 9 (Cas9) endonuclease cleaves double-stranded DNA sequences specified by guide RNA molecules and flanked by a protospacer adjacent motif (PAM) and is widely used for genome editing in various organisms. The RNA-programmed Cas9 locates the target site by scanning genomic DNA. We sought to elucidate the mechanism of initial DNA interrogation steps that precede the pairing of target DNA with guide RNA. Using fluorometric and biochemical assays, we studied Cas9/guide RNA complexes with model DNA substrates that mimicked early intermediates on the pathway to the final Cas9/guide RNA-DNA complex. The results show that Cas9/guide RNA binding to PAM favors separation of a few PAM-proximal protospacer base pairs allowing initial target interrogation by guide RNA. The duplex destabilization is mediated, in part, by Cas9/guide RNA affinity for unpaired segments of nontarget strand DNA close to PAM. Furthermore, our data indicate that the entry of double-stranded DNA beyond a short threshold distance from PAM into the Cas9/single-guide RNA (sgRNA) interior is hindered. We suggest that the interactions unfavorable for duplex DNA binding promote DNA bending in the PAM-proximal region during early steps of Cas9/guide RNA-DNA complex formation, thus additionally destabilizing the protospacer duplex. The mechanism that emerges from our analysis explains how the Cas9/sgRNA complex is able to locate the correct target sequence efficiently while interrogating numerous nontarget sequences associated with correct PAMs.
Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

PubMed

Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

2016-03-01

Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or inclusively deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken cumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for use in the taxonomic assignment of sequences obtained by Sanger sequencing or based on third generation sequencing, of which the main goal is to generate larger sequences. Copyright © 2016 Elsevier B.V. All rights reserved.
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

NASA Technical Reports Server (NTRS)

Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

1995-01-01

We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) a n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.

PubMed

Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel

2004-01-21

We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was polymerase chain reaction (PCR) amplified. This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted respectively in the amino and carboxy-terminus of the protein, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase or (and) to a mitochondrial localization of the enzyme.
Sequence requirements of oligonucleotide chiral selectors for the capillary electrophoresis resolution of low-affinity DNA binders.

PubMed

Tohala, Luma; Oukacine, Farid; Ravelet, Corinne; Peyrin, Eric

2017-05-01

We recently reported that a great variety of DNA oligonucleotides (ONs) used as chiral selectors in partial-filling capillary electrophoresis (CE) exhibited interesting enantioresolution properties toward low-affinity DNA binders. Herein, the sequence prerequisites of ONs for the CE enantioseparation process were studied. First, the chiral resolution properties of a series of homopolymeric sequences (Poly-dT) of different lengths (from 5 to 60-mer) were investigated. It was shown that the size increase-dependent random coil-like conformation of Poly-dT favorably acted on the apparent selectivity and resolution. The base-unpairing state constituted also an important factor in the chiral resolution ability of ONs as the switch from the single-stranded to double-stranded structure was responsible for a significant decrease in the analyte selectivity range. Finally, the chemical diversity enhanced the enantioresolution ability of single-stranded ONs. The present work could lay the foundation for the design of performant ON chiral selectors for the CE separation of weak DNA binder enantiomers. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Electrostatic study of Alanine mutational effects on transcription: application to GATA-3:DNA interaction complex.

PubMed

El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges

2015-01-01

Protein-DNA interaction is of fundamental importance in molecular biology, playing roles in functions as diverse as DNA transcription, DNA structure formation, and DNA repair. Protein-DNA association is also important in medicine; understanding Protein-DNA binding kinetics can assist in identifying disease root causes which can contribute to drug development. In this perspective, this work focuses on the transcription process by the GATA Transcription Factor (TF). GATA TF binds to DNA promoter region represented by `G,A,T,A' nucleotides sequence, and initiates transcription of target genes. When proper regulation fails due to some mutations on the GATA TF protein sequence or on the DNA promoter sequence (weak promoter), deregulation of the target genes might lead to various disorders. In this study, we aim to understand the electrostatic mechanism behind GATA TF and DNA promoter interactions, in order to predict Protein-DNA binding in the presence of mutations, while elaborating on non-covalent binding kinetics. To generate a family of mutants for the GATA:DNA complex, we replaced every charged amino acid, one at a time, with a neutral amino acid like Alanine (Ala). We then applied Poisson-Boltzmann electrostatic calculations feeding into free energy calculations, for each mutation. These calculations delineate the contribution to binding from each Ala-replaced amino acid in the GATA:DNA interaction. After analyzing the obtained data in view of a two-step model, we are able to identify potential key amino acids in binding. Finally, we applied the model to GATA-3:DNA (crystal structure with PDB-ID: 3DFV) binding complex and validated it against experimental results from the literature.
Homopolymer tail-mediated ligation PCR: a streamlined and highly efficient method for DNA cloning and library construction.

PubMed

Lazinski, David W; Camilli, Andrew

2013-01-01

The amplification of DNA fragments, cloned between user-defined 5' and 3' end sequences, is a prerequisite step in the use of many current applications including massively parallel sequencing (MPS). Here we describe an improved method, called homopolymer tail-mediated ligation PCR (HTML-PCR), that requires very little starting template, minimal hands-on effort, is cost-effective, and is suited for use in high-throughput and robotic methodologies. HTML-PCR starts with the addition of homopolymer tails of controlled lengths to the 3' termini of a double-stranded genomic template. The homopolymer tails enable the annealing-assisted ligation of a hybrid oligonucleotide to the template's recessed 5' ends. The hybrid oligonucleotide has a user-defined sequence at its 5' end. This primer, together with a second primer composed of a longer region complementary to the homopolymer tail and fused to a second 5' user-defined sequence, are used in a PCR reaction to generate the final product. The user-defined sequences can be varied to enable compatibility with a wide variety of downstream applications. We demonstrate our new method by constructing MPS libraries starting from nanogram and sub-nanogram quantities of Vibrio cholerae and Streptococcus pneumoniae genomic DNA.

CRISPR-Cas systems exploit viral DNA injection to establish and maintain adaptive immunity.

PubMed

Modell, Joshua W; Jiang, Wenyan; Marraffini, Luciano A

2017-04-06

Clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems provide protection against viral and plasmid infection by capturing short DNA sequences from these invaders and integrating them into the CRISPR locus of the prokaryotic host. These sequences, known as spacers, are transcribed into short CRISPR RNA guides that specify the cleavage site of Cas nucleases in the genome of the invader. It is not known when spacer sequences are acquired during viral infection. Here, to investigate this, we tracked spacer acquisition in Staphylococcus aureus cells harbouring a type II CRISPR-Cas9 system after infection with the staphylococcal bacteriophage ϕ12. We found that new spacers were acquired immediately after infection preferentially from the cos site, the viral free DNA end that is first injected into the cell. Analysis of spacer acquisition after infection with mutant phages demonstrated that most spacers are acquired during DNA injection, but not during other stages of the viral cycle that produce free DNA ends, such as DNA replication or packaging. Finally, we showed that spacers acquired from early-injected genomic regions, which direct Cas9 cleavage of the viral DNA immediately after infection, provide better immunity than spacers acquired from late-injected regions. Our results reveal that CRISPR-Cas systems exploit the phage life cycle to generate a pattern of spacer acquisition that ensures a successful CRISPR immune response.
A feasibility study of colorectal cancer diagnosis via circulating tumor DNA derived CNV detection.

PubMed

Molparia, Bhuvan; Oliveira, Glenn; Wagner, Jennifer L; Spencer, Emily G; Torkamani, Ali

2018-01-01

Circulating tumor DNA (ctDNA) has shown great promise as a biomarker for early detection of cancer. However, due to the low abundance of ctDNA, especially at early stages, it is hard to detect at high accuracies while keeping sequencing costs low. Here we present a pilot stage study to detect large scale somatic copy numbers variations (CNVs), which contribute more molecules to ctDNA signal compared to point mutations, via cell free DNA sequencing. We show that it is possible to detect somatic CNVs in early stage colorectal cancer (CRC) patients and subsequently discriminate them from normal patients. With 25 normal and 24 CRC samples, we achieve 100% specificity (lower bound confidence interval: 86%) and ~79% sensitivity (95% confidence interval: 63% - 95%,), though the performance should be considered with caution given the limited sample size. We report a lack of concordance between the CNVs detected via cfDNA sequencing and CNVs identified in parent tissue samples. However, recent findings suggest that a lack of concordance is expected for CNVs in CRC because of their sub-clonal nature. Finally, the CNVs we detect very likely contribute to cancer progression as they lie in functionally important regions, and have been shown to be associated with CRC specifically. This study paves the path for a larger scale exploration of the potential of CNV detection for both diagnoses and prognoses of cancer.
Immobilized-free miniaturized electrochemical sensing system for Pb2+ detection based on dual Pb2+-DNAzyme assistant feedback amplification strategy.

PubMed

Cai, Wei; Xie, Shunbi; Zhang, Jin; Tang, Dianyong; Tang, Ying

2018-06-08

We presented a novel dual-DNAzyme feedback amplification (DDFA) strategy for Pb 2+ detection based on a micropipette tip-based miniaturized homogeneous electrochemical device. The DDFA system involves two rolling circle amplification (RCA) processes in which two circular DNA templates (C1 and C2) have been designed with a Pb 2+ -DNAzyme sequence (8-17 DNAzyme, anti-GR-5 DNAzyme) and an antisense sequence of G-quadruplex. And a linear DNA (L-DNA), which consists of a primer sequence and a Pb 2+ -DNAzyme substrate sequence, could hybridize with C1 and C2 to form two DNA complexes. In presence of Pb 2+ , the Pb 2+ -DNAzyme exhibited excellent cleavage specificity toward the substrate sequence in L-DNA, leaving primer sequence to trigger two paths of RCA process and finally resulting in massive long nanosolo DNA strands with reduplicated G-quadruplex sequences. And then, methylene blue (MB) could selectively intercalate into G-quadruplex to reduce the free MB concentration in the solution. Thereafter, a carbon fiber microelectrode-based miniaturized electrochemical device was constructed to record the decrease of electrochemical signal due to the much lower diffusion rate of MB/G-quadruplex complex than that of free MB. Therefore, the concentration of Pb 2+ could be correctively and sensitively determined in a homogeneous solution by combining DDFA with miniaturized electrochemical device. This protocol not only exhibited high selectivity and sensitivity toward Pb 2+ with a detection limit of 0.048 pM, but also reduced sample volume to 10 µL. In addition, this sensing system has been successfully applied to Pb 2+ detection in Yangtze River with desirable quantitative manners, which matched well with the atomic absorption spectrometry (AAS). Copyright © 2018 Elsevier B.V. All rights reserved.
PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

PubMed Central

Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

2016-01-01

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicate that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is made available for free public accessible to the biological research community. PMID:27282833
Profiling Nematode Communities in Unmanaged Flowerbed and Agricultural Field Soils in Japan by DNA Barcode Sequencing

PubMed Central

Morise, Hisashi; Miyazaki, Erika; Yoshimitsu, Shoko; Eki, Toshihiko

2012-01-01

Soil nematodes play crucial roles in the soil food web and are a suitable indicator for assessing soil environments and ecosystems. Previous nematode community analyses based on nematode morphology classification have been shown to be useful for assessing various soil environments. Here we have conducted DNA barcode analysis for soil nematode community analyses in Japanese soils. We isolated nematodes from two different environmental soils of an unmanaged flowerbed and an agricultural field using the improved flotation-sieving method. Small subunit (SSU) rDNA fragments were directly amplified from each of 68 (flowerbed samples) and 48 (field samples) isolated nematodes to determine the nucleotide sequence. Sixteen and thirteen operational taxonomic units (OTUs) were obtained by multiple sequence alignment from the flowerbed and agricultural field nematodes, respectively. All 29 SSU rDNA-derived OTUs (rOTUs) were further mapped onto a phylogenetic tree with 107 known nematode species. Interestingly, the two nematode communities examined were clearly distinct from each other in terms of trophic groups: Animal predators and plant feeders were markedly abundant in the flowerbed soils, in contrast, bacterial feeders were dominantly observed in the agricultural field soils. The data from the flowerbed nematodes suggests a possible food web among two different trophic nematode groups and plants (weeds) in the closed soil environment. Finally, DNA sequences derived from the mitochondrial cytochrome oxidase c subunit 1 (COI) gene were determined as a DNA barcode from 43 agricultural field soil nematodes. These nematodes were assigned to 13 rDNA-derived OTUs, but in the COI gene analysis were assigned to 23 COI gene-derived OTUs (cOTUs), indicating that COI gene-based barcoding may provide higher taxonomic resolution than conventional SSU rDNA-barcoding in soil nematode community analysis. PMID:23284767
Characterization of mtDNA variation in a cohort of South African paediatric patients with mitochondrial disease.

PubMed

van der Walt, Elizna M; Smuts, Izelle; Taylor, Robert W; Elson, Joanna L; Turnbull, Douglass M; Louw, Roan; van der Westhuizen, Francois H

2012-06-01

Mitochondrial disease can be attributed to both mitochondrial and nuclear gene mutations. It has a heterogeneous clinical and biochemical profile, which is compounded by the diversity of the genetic background. Disease-based epidemiological information has expanded significantly in recent decades, but little information is known that clarifies the aetiology in African patients. The aim of this study was to investigate mitochondrial DNA variation and pathogenic mutations in the muscle of diagnosed paediatric patients from South Africa. A cohort of 71 South African paediatric patients was included and a high-throughput nucleotide sequencing approach was used to sequence full-length muscle mtDNA. The average coverage of the mtDNA genome was 81±26 per position. After assigning haplogroups, it was determined that although the nature of non-haplogroup-defining variants was similar in African and non-African haplogroup patients, the number of substitutions were significantly higher in African patients. We describe previously reported disease-associated and novel variants in this cohort. We observed a general lack of commonly reported syndrome-associated mutations, which supports clinical observations and confirms general observations in African patients when using single mutation screening strategies based on (predominantly non-African) mtDNA disease-based information. It is finally concluded that this first extensive report on muscle mtDNA sequences in African paediatric patients highlights the need for a full-length mtDNA sequencing strategy, which applies to all populations where specific mutations is not present. This, in addition to nuclear DNA gene mutation and pathogenicity evaluations, will be required to better unravel the aetiology of these disorders in African patients.
Solid-phase proximity ligation assays for individual or parallel protein analyses with readout via real-time PCR or sequencing.

PubMed

Nong, Rachel Yuan; Wu, Di; Yan, Junhong; Hammond, Maria; Gu, Gucci Jijuan; Kamali-Moghaddam, Masood; Landegren, Ulf; Darmanis, Spyros

2013-06-01

Solid-phase proximity ligation assays share properties with the classical sandwich immunoassays for protein detection. The proteins captured via antibodies on solid supports are, however, detected not by single antibodies with detectable functions, but by pairs of antibodies with attached DNA strands. Upon recognition by these sets of three antibodies, pairs of DNA strands brought in proximity are joined by ligation. The ligated reporter DNA strands are then detected via methods such as real-time PCR or next-generation sequencing (NGS). We describe how to construct assays that can offer improved detection specificity by virtue of recognition by three antibodies, as well as enhanced sensitivity owing to reduced background and amplified detection. Finally, we also illustrate how the assays can be applied for parallel detection of proteins, taking advantage of the oligonucleotide ligation step to avoid background problems that might arise with multiplexing. The protocol for the singleplex solid-phase proximity ligation assay takes ~5 h. The multiplex version of the assay takes 7-8 h depending on whether quantitative PCR (qPCR) or sequencing is used as the readout. The time for the sequencing-based protocol includes the library preparation but not the actual sequencing, as times may vary based on the choice of sequencing platform.
RNase H-assisted RNA-primed rolling circle amplification for targeted RNA sequence detection.

PubMed

Takahashi, Hirokazu; Ohkawachi, Masahiko; Horio, Kyohei; Kobori, Toshiro; Aki, Tsunehiro; Matsumura, Yukihiko; Nakashimada, Yutaka; Okamura, Yoshiko

2018-05-17

RNA-primed rolling circle amplification (RPRCA) is a useful laboratory method for RNA detection; however, the detection of RNA is limited by the lack of information on 3'-terminal sequences. We uncovered that conventional RPRCA using pre-circularized probes could potentially detect the internal sequence of target RNA molecules in combination with RNase H. However, the specificity for mRNA detection was low, presumably due to non-specific hybridization of non-target RNA with the circular probe. To overcome this technical problem, we developed a method for detecting a sequence of interest in target RNA molecules via RNase H-assisted RPRCA using padlocked probes. When padlock probes are hybridized to the target RNA molecule, they are converted to the circular form by SplintR ligase. Subsequently, RNase H creates nick sites only in the hybridized RNA sequence, and single-stranded DNA is finally synthesized from the nick site by phi29 DNA polymerase. This method could specifically detect at least 10 fmol of the target RNA molecule without reverse transcription. Moreover, this method detected GFP mRNA present in 10 ng of total RNA isolated from Escherichia coli without background DNA amplification. Therefore, this method can potentially detect almost all types of RNA molecules without reverse transcription and reveal full-length sequence information.
Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1 × ) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
SynTrack: DNA Assembly Workflow Management (SynTrack) v2.0.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

MENG, XIANWEI; SIMIRENKO, LISA

2016-12-01

SynTrack is a dynamic, workflow-driven data management system that tracks the DNA build process: Management of the hierarchical relationships of the DNA fragments; Monitoring of process tasks for the assembly of multiple DNA fragments into final constructs; Creations of vendor order forms with selectable building blocks. Organizing plate layouts barcodes for vendor/pcr/fusion/chewback/bioassay/glycerol/master plate maps (default/condensed); Creating or updating Pre-Assembly/Assembly process workflows with selected building blocks; Generating Echo pooling instructions based on plate maps; Tracking of building block orders, received and final assembled for delivering; Bulk updating of colony or PCR amplification information, fusion PCR and chewback results; Updating with QA/QCmore » outcome with .csv & .xlsx template files; Re-work assembly workflow enabled before and after sequencing validation; and Tracking of plate/well data changes and status updates and reporting of master plate status with QC outcomes.« less
Development of biometric DNA ink for authentication security.

PubMed

Hashiyada, Masaki

2004-10-01

Among the various types of biometric personal identification systems, DNA provides the most reliable personal identification. It is intrinsically digital and unchangeable while the person is alive, and even after his/her death. Increasing the number of DNA loci examined can enhance the power of discrimination. This report describes the development of DNA ink, which contains synthetic DNA mixed with printing inks. Single-stranded DNA fragments encoding a personalized set of short tandem repeats (STR) were synthesized. The sequence was defined as follows. First, a decimal DNA personal identification (DNA-ID) was established based on the number of STRs in the locus. Next, this DNA-ID was encrypted using a binary, 160-bit algorithm, using a hashing function to protect privacy. Since this function is irreversible, no one can recover the original information from the encrypted code. Finally, the bit series generated above is transformed into base sequences, and double-stranded DNA fragments are amplified by the polymerase chain reaction (PCR) to protect against physical attacks. Synthesized DNA was detected successfully after samples printed in DNA ink were subjected to several resistance tests used to assess the stability of printing inks. Endurance test results showed that this DNA ink would be suitable for practical use as a printing ink and was resistant to 40 hours of ultraviolet exposure, performance commensurate with that of photogravure ink. Copyright 2004 Tohoku University Medical Press
[Multiplexing mapping of human cDNAs]. Final report, September 1, 1991--February 28, 1994

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

Using PCR with automated product analysis, 329 human brain cDNA sequences have been assigned to individual human chromosomes. Primers were designed from single-pass cDNA sequences expressed sequence tags (ESTs). Primers were used in PCR reactions with DNA from somatic cell hybrid mapping panels as templates, often with multiplexing. Many ESTs mapped match sequence database records. To evaluate of these matches, the position of the primers relative to the matching region (In), the BLAST scores and the Poisson probability values of the EST/sequence record match were determined. In cases where the gene product was stringently identified by the sequence match hadmore » already been mapped, the gene locus determined by EST was consistent with the previous position which strongly supports the validity of assigning unknown genes to human chromosomes based on the EST sequence matches. In the present cases mapping the ESTs to a chromosome can also be considered to have mapped the known gene product: rolipram-sensitive cAMP phosphodiesterase, chromosome 1; protein phosphatase 2A{beta}, chromosome 4; alpha-catenin, chromosome 5; the ELE1 oncogene, chromosome 10q11.2 or q2.1-q23; MXII protein, chromosome l0q24-qter; ribosomal protein L18a homologue, chromosome 14; ribosomal protein L3, chromosome 17; and moesin, Xp11-cen. There were also ESTs mapped that were closely related to non-human sequence records. These matches therefore can be considered to identify human counterparts of known gene products, or members of known gene families. Examples of these include membrane proteins, translation-associated proteins, structural proteins, and enzymes. These data then demonstrate that single pass sequence information is sufficient to design PCR primers useful for assigning cDNA sequences to human chromosomes. When the EST sequence matches previous sequence database records, the chromosome assignments of the EST can be used to make preliminary assignments of the human gene to a chromosome.« less
Converting Panax ginseng DNA and chemical fingerprints into two-dimensional barcode.

PubMed

Cai, Yong; Li, Peng; Li, Xi-Wen; Zhao, Jing; Chen, Hai; Yang, Qing; Hu, Hao

2017-07-01

In this study, we investigated how to convert the Panax ginseng DNA sequence code and chemical fingerprints into a two-dimensional code. In order to improve the compression efficiency, GATC2Bytes and digital merger compression algorithms are proposed. HPLC chemical fingerprint data of 10 groups of P. ginseng from Northeast China and the internal transcribed spacer 2 (ITS2) sequence code as the DNA sequence code were ready for conversion. In order to convert such data into a two-dimensional code, the following six steps were performed: First, the chemical fingerprint characteristic data sets were obtained through the inflection filtering algorithm. Second, precompression processing of such data sets is undertaken. Third, precompression processing was undertaken with the P. ginseng DNA (ITS2) sequence codes. Fourth, the precompressed chemical fingerprint data and the DNA (ITS2) sequence code were combined in accordance with the set data format. Such combined data can be compressed by Zlib, an open source data compression algorithm. Finally, the compressed data generated a two-dimensional code called a quick response code (QR code). Through the abovementioned converting process, it can be found that the number of bytes needed for storing P. ginseng chemical fingerprints and its DNA (ITS2) sequence code can be greatly reduced. After GTCA2Bytes algorithm processing, the ITS2 compression rate reaches 75% and the chemical fingerprint compression rate exceeds 99.65% via filtration and digital merger compression algorithm processing. Therefore, the overall compression ratio even exceeds 99.36%. The capacity of the formed QR code is around 0.5k, which can easily and successfully be read and identified by any smartphone. P. ginseng chemical fingerprints and its DNA (ITS2) sequence code can form a QR code after data processing, and therefore the QR code can be a perfect carrier of the authenticity and quality of P. ginseng information. This study provides a theoretical basis for the development of a quality traceability system of traditional Chinese medicine based on a two-dimensional code.
Targeted DNA demethylation in human cells by fusion of a plant 5-methylcytosine DNA glycosylase to a sequence-specific DNA binding domain

PubMed Central

Parrilla-Doblas, Jara Teresa; Ariza, Rafael R.; Roldán-Arjona, Teresa

2017-01-01

ABSTRACT DNA methylation is a crucial epigenetic mark associated to gene silencing, and its targeted removal is a major goal of epigenetic editing. In animal cells, DNA demethylation involves iterative 5mC oxidation by TET enzymes followed by replication-dependent dilution and/or replication-independent DNA repair of its oxidized derivatives. In contrast, plants use specific DNA glycosylases that directly excise 5mC and initiate its substitution for unmethylated C in a base excision repair process. In this work, we have fused the catalytic domain of Arabidopsis ROS1 5mC DNA glycosylase (ROS1_CD) to the DNA binding domain of yeast GAL4 (GBD). We show that the resultant GBD-ROS1_CD fusion protein binds specifically a GBD-targeted DNA sequence in vitro. We also found that transient in vivo expression of GBD-ROS1_CD in human cells specifically reactivates transcription of a methylation-silenced reporter gene, and that such reactivation requires both ROS1_CD catalytic activity and GBD binding capacity. Finally, we show that reactivation induced by GBD-ROS1_CD is accompanied by decreased methylation levels at several CpG sites of the targeted promoter. All together, these results show that plant 5mC DNA glycosylases can be used for targeted active DNA demethylation in human cells. PMID:28277978
Genome sequence determination and metagenomic characterization of a Dehalococcoides mixed culture grown on cis-1,2-dichloroethene.

PubMed

Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi

2015-07-01

A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of the lotus field. To obtain detailed information of the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There exist twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there exists significant difference in the distribution RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information of a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
DNA methylation dynamics during early plant life.

PubMed

Bouyer, Daniel; Kramdi, Amira; Kassam, Mohamed; Heese, Maren; Schnittger, Arp; Roudier, François; Colot, Vincent

2017-09-25

Cytosine methylation is crucial for gene regulation and silencing of transposable elements in mammals and plants. While this epigenetic mark is extensively reprogrammed in the germline and early embryos of mammals, the extent to which DNA methylation is reset between generations in plants remains largely unknown. Using Arabidopsis as a model, we uncovered distinct DNA methylation dynamics over transposable element sequences during the early stages of plant development. Specifically, transposable elements and their relics show invariably high methylation at CG sites but increasing methylation at CHG and CHH sites. This non-CG methylation culminates in mature embryos, where it reaches saturation for a large fraction of methylated CHH sites, compared to the typical 10-20% methylation level observed in seedlings or adult plants. Moreover, the increase in CHH methylation during embryogenesis matches the hypomethylated state in the early endosperm. Finally, we show that interfering with the embryo-to-seedling transition results in the persistence of high CHH methylation levels after germination, specifically over sequences that are targeted by the RNA-directed DNA methylation (RdDM) machinery. Our findings indicate the absence of extensive resetting of DNA methylation patterns during early plant life and point instead to an important role of RdDM in reinforcing DNA methylation of transposable element sequences in every cell of the mature embryo. Furthermore, we provide evidence that this elevated RdDM activity is a specific property of embryogenesis.
Precise Sequential DNA Ligation on A Solid Substrate: Solid-Based Rapid Sequential Ligation of Multiple DNA Molecules

PubMed Central

Takita, Eiji; Kohda, Katsunori; Tomatsu, Hajime; Hanano, Shigeru; Moriya, Kanami; Hosouchi, Tsutomu; Sakurai, Nozomu; Suzuki, Hideyuki; Shinmyo, Atsuhiko; Shibata, Daisuke

2013-01-01

Ligation, the joining of DNA fragments, is a fundamental procedure in molecular cloning and is indispensable to the production of genetically modified organisms that can be used for basic research, the applied biosciences, or both. Given that many genes cooperate in various pathways, incorporating multiple gene cassettes in tandem in a transgenic DNA construct for the purpose of genetic modification is often necessary when generating organisms that produce multiple foreign gene products. Here, we describe a novel method, designated PRESSO (precise sequential DNA ligation on a solid substrate), for the tandem ligation of multiple DNA fragments. We amplified donor DNA fragments with non-palindromic ends, and ligated the fragment to acceptor DNA fragments on solid beads. After the final donor DNA fragments, which included vector sequences, were joined to the construct that contained the array of fragments, the ligation product (the construct) was thereby released from the beads via digestion with a rare-cut meganuclease; the freed linear construct was circularized via an intra-molecular ligation. PRESSO allowed us to rapidly and efficiently join multiple genes in an optimized order and orientation. This method can overcome many technical challenges in functional genomics during the post-sequencing generation. PMID:23897972
On the roles of repetitive DNA elements in the context of a unified genomic-epigenetic system.

PubMed

von Sternberg, Richard

2002-12-01

Repetitive DNA sequences comprise a substantial portion of most eukaryotic and some prokaryotic chromosomes. Despite nearly forty years of research, the functions of various sequence families as a whole and their monomer units remain largely unknown. The inability to map specific functional roles onto many repetitive DNA elements (REs), coupled with the taxon-specificity of sequence families, have led many to speculate that these genomic components are "selfish" replicators generating genomic "junk." The purpose of this paper is to critically examine the selfishness, evolutionary effects, and functionality of REs. First, a brief overview of the range of ideas pertaining to RE function is presented. Second, the argument is presented that the selfish DNA "hypothesis" is actually a narrative scheme, that it serves to protect neo-Darwinian assumptions from criticism, and that this story is untestable and therefore not a hypothesis. Third, attempts to synthesize the selfish DNA concept with complex systems models of the genome and RE functionality are critiqued. Fourth, the supposed connection between RE-induced mutations and macroevolutionary events are stated to be at variance with empirical evidence and theoretical considerations. Hypotheses that base phylogenetic transitions in repetitive sequence changes thus remain speculative. Fifth and finally, the case is made for viewing REs as integrally functional components of chromosomes, genomes, and cells. It is argued throughout that a new conceptual framework is needed for understanding the roles of repetitive DNA in genomic/epigenetic systems, and that neo-Darwinian "narratives" have been the primary obstacle to elucidating the effects of these enigmatic components of chromosomes.
Fine Dissection of Human Mitochondrial DNA Haplogroup HV Lineages Reveals Paleolithic Signatures from European Glacial Refugia

PubMed Central

Sarno, Stefania; Sevini, Federica; Vianello, Dario; Tamm, Erika; Metspalu, Ene; van Oven, Mannis; Hübner, Alexander; Sazzini, Marco; Franceschi, Claudio; Pettener, Davide; Luiselli, Donata

2015-01-01

Genetic signatures from the Paleolithic inhabitants of Eurasia can be traced from the early divergent mitochondrial DNA lineages still present in contemporary human populations. Previous studies already suggested a pre-Neolithic diffusion of mitochondrial haplogroup HV*(xH,V) lineages, a relatively rare class of mtDNA types that includes parallel branches mainly distributed across Europe and West Asia with a certain degree of structure. Up till now, variation within haplogroup HV was addressed mainly by analyzing sequence data from the mtDNA control region, except for specific sub-branches, such as HV4 or the widely distributed haplogroups H and V. In this study, we present a revised HV topology based on full mtDNA genome data, and we include a comprehensive dataset consisting of 316 complete mtDNA sequences including 60 new samples from the Italian peninsula, a previously underrepresented geographic area. We highlight points of instability in the particular topology of this haplogroup, reconstructed with BEAST-generated trees and networks. We also confirm a major lineage expansion that probably followed the Late Glacial Maximum and preceded Neolithic population movements. We finally observe that Italy harbors a reservoir of mtDNA diversity, with deep-rooting HV lineages often related to sequences present in the Caucasus and the Middle East. The resulting hypothesis of a glacial refugium in Southern Italy has implications for the understanding of late Paleolithic population movements and is discussed within the archaeological cultural shifts occurred over the entire continent. PMID:26640946

DNA SEQUENCE SIMILARITY REQUIREMENTS FOR INTERSPECIFIC RECOMBINATION IN BACILLUS. (R825348)

EPA Science Inventory

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
Homopolymer tail-mediated ligation PCR: a streamlined and highly efficient method for DNA cloning and library construction

PubMed Central

Lazinski, David W.; Camilli, Andrew

2013-01-01

The amplification of DNA fragments, cloned between user-defined 5′ and 3′ end sequences, is a prerequisite step in the use of many current applications including massively parallel sequencing (MPS). Here we describe an improved method, called homopolymer tail-mediated ligation PCR (HTML-PCR), that requires very little starting template, minimal hands-on effort, is cost-effective, and is suited for use in high-throughput and robotic methodologies. HTML-PCR starts with the addition of homopolymer tails of controlled lengths to the 3′ termini of a double-stranded genomic template. The homopolymer tails enable the annealing-assisted ligation of a hybrid oligonucleotide to the template's recessed 5′ ends. The hybrid oligonucleotide has a user-defined sequence at its 5′ end. This primer, together with a second primer composed of a longer region complementary to the homopolymer tail and fused to a second 5′ user-defined sequence, are used in a PCR reaction to generate the final product. The user-defined sequences can be varied to enable compatibility with a wide variety of downstream applications. We demonstrate our new method by constructing MPS libraries starting from nanogram and sub-nanogram quantities of Vibrio cholerae and Streptococcus pneumoniae genomic DNA. PMID:23311318
Systematic analysis and evolution of 5S ribosomal DNA in metazoans.

PubMed

Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M

2013-11-01

Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12,766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades.
Systematic analysis and evolution of 5S ribosomal DNA in metazoans

PubMed Central

Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M

2013-01-01

Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12 766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades. PMID:23838690
Vaginal microbial flora analysis by next generation sequencing and microarrays; can microbes indicate vaginal origin in a forensic context?

PubMed

Benschop, Corina C G; Quaak, Frederike C A; Boon, Mathilde E; Sijen, Titia; Kuiper, Irene

2012-03-01

Forensic analysis of biological traces generally encompasses the investigation of both the person who contributed to the trace and the body site(s) from which the trace originates. For instance, for sexual assault cases, it can be beneficial to distinguish vaginal samples from skin or saliva samples. In this study, we explored the use of microbial flora to indicate vaginal origin. First, we explored the vaginal microbiome for a large set of clinical vaginal samples (n = 240) by next generation sequencing (n = 338,184 sequence reads) and found 1,619 different sequences. Next, we selected 389 candidate probes targeting genera or species and designed a microarray, with which we analysed a diverse set of samples; 43 DNA extracts from vaginal samples and 25 DNA extracts from samples from other body sites, including sites in close proximity of or in contact with the vagina. Finally, we used the microarray results and next generation sequencing dataset to assess the potential for a future approach that uses microbial markers to indicate vaginal origin. Since no candidate genera/species were found to positively identify all vaginal DNA extracts on their own, while excluding all non-vaginal DNA extracts, we deduce that a reliable statement about the cellular origin of a biological trace should be based on the detection of multiple species within various genera. Microarray analysis of a sample will then render a microbial flora pattern that is probably best analysed in a probabilistic approach.
Development and validation of a D-loop mtDNA SNP assay for the screening of specimens in forensic casework.

PubMed

Chemale, Gustavo; Paneto, Greiciane Gaburro; Menezes, Meiga Aurea Mendes; de Freitas, Jorge Marcelo; Jacques, Guilherme Silveira; Cicarelli, Regina Maria Barretto; Fagundes, Paulo Roberto

2013-05-01

Mitochondrial DNA (mtDNA) analysis is usually a last resort in routine forensic DNA casework. However, it has become a powerful tool for the analysis of highly degraded samples or samples containing too little or no nuclear DNA, such as old bones and hair shafts. The gold standard methodology still constitutes the direct sequencing of polymerase chain reaction (PCR) products or cloned amplicons from the HVS-1 and HVS-2 (hypervariable segment) control region segments. Identifications using mtDNA are time consuming, expensive and can be very complex, depending on the amount and nature of the material being tested. The main goal of this work is to develop a less labour-intensive and less expensive screening method for mtDNA analysis, in order to aid in the exclusion of non-matching samples and as a presumptive test prior to final confirmatory DNA sequencing. We have selected 14 highly discriminatory single nucleotide polymorphisms (SNPs) based on simulations performed by Salas and Amigo (2010) to be typed using SNaPShot(TM) (Applied Biosystems, Foster City, CA, USA). The assay was validated by typing more than 100 HVS-1/HVS-2 sequenced samples. No differences were observed between the SNP typing and DNA sequencing when results were compared, with the exception of allelic dropouts observed in a few haplotypes. Haplotype diversity simulations were performed using 172 mtDNA sequences representative of the Brazilian population and a score of 0.9794 was obtained when the 14 SNPs were used, showing that the theoretical prediction approach for the selection of highly discriminatory SNPs suggested by Salas and Amigo (2010) was confirmed in the population studied. As the main goal of the work is to develop a screening assay to skip the sequencing of all samples in a particular case, a pair-wise comparison of the sequences was done using the selected SNPs. When both HVS-1/HVS-2 SNPs were used for simulations, at least two differences were observed in 93.2% of the comparisons performed. The assay was validated with casework samples. Results show that the method is straightforward and can be used for exclusionary purposes, saving time and laboratory resources. The assay confirms the theoretic prediction suggested by Salas and Amigo (2010). All forensic advantages, such as high sensitivity and power of discrimination, as also the disadvantages, such as the occurrence of allele dropouts, are discussed throughout the article. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Whole-comparative genomic hybridization in domestic sheep (Ovis aries) breeds.

PubMed

Dávila-Rodríguez, M I; Cortés-Gutiérrez, E I; López-Fernández, C; Pita, M; Mezzanotte, R; Gosálvez, J

2009-01-01

Whole-comparative genomic hybridization (W-CGH) allows identification of chromosomal polymorphisms related to highly repetitive DNA sequences localized in constitutive heterochromatin. Such polymorphisms are detected establishing competition between genomic DNAs in an in situ hybridization environment without subtraction of highly repetitive DNA sequences, when comparing two species from closely related taxa (same species, sub-species, or breeds) or somewhat related taxa. This experimental approach was applied to investigating differences in highly repetitive sequences of three sheep breeds (Castellana, Ojalada, and Assaf). To this end, W-CGH was carried out using mouflon (sheep ancestor) chromosomes as a common target to co-hybridize equimolar quantities of two genomic DNAs obtained from either Castellana, Ojalada or Assaf sheep breeds. The results showed that the amount of constitutive heterochromatin is greater in all pericentromeric heterochromatin regions of acrocentric chromosomes than in metacentric or sex chromosomes. Additionally, when W-CGH was performed using DNAs from the Iberian breeds Castellana and Ojalada, chromosomal pericentromeric regions revealed quantitatively and qualitatively a presence of DNA families similar to that obtained from any of the above-cited breeds. On the contrary, when the DNA used in W-CGH experiments was obtained from Assaf, as compared to either Castellana or Ojalada, two different pericentromeric DNA families of highly repetitive sequences could be detected. Lastly, sex chromosomes were shown to be homogeneous among all breeds and thus revealed no detectable constitutive heterochromatin. W-CGH results were confirmed using DNA breakage detection-FISH experiments (DBD-FISH) carried out on lymphocytes. As a whole, the results showed that two different repetitive DNA families are present in the pericentromeric heterochromatin of the sheep breeds studied here. Additionally, they suggest a differential presence of these distinct repetitive DNA families in Castellana and Ojalada breeds as compared to the Assaf breed. Finally, the results of W-CGH after using mouflon as the targeted chromosomes also show that the two DNA families are present in the ancestor. Copyright 2009 S. Karger AG, Basel.
New insights into replication origin characteristics in metazoans

PubMed Central

Puy, Aurore; Rialle, Stéphanie; Kaplan, Noam; Segal, Eran

2012-01-01

We recently reported the identification and characterization of DNA replication origins (Oris) in metazoan cell lines. Here, we describe additional bioinformatic analyses showing that the previously identified GC-rich sequence elements form origin G-rich repeated elements (OGREs) that are present in 67% to 90% of the DNA replication origins from Drosophila to human cells, respectively. Our analyses also show that initiation of DNA synthesis takes place precisely at 160 bp (Drosophila) and 280 bp (mouse) from the OGRE. We also found that in most CpG islands, an OGRE is positioned in opposite orientation on each of the two DNA strands and detected two sites of initiation of DNA synthesis upstream or downstream of each OGRE. Conversely, Oris not associated with CpG islands have a single initiation site. OGRE density along chromosomes correlated with previously published replication timing data. Ori sequences centered on the OGRE are also predicted to have high intrinsic nucleosome occupancy. Finally, OGREs predict G-quadruplex structures at Oris that might be structural elements controlling the choice or activation of replication origins. PMID:22373526
Epigenetically-inherited centromere and neocentromere DNA replicates earliest in S-phase.

PubMed

Koren, Amnon; Tsai, Hung-Ji; Tirosh, Itay; Burrack, Laura S; Barkai, Naama; Berman, Judith

2010-08-19

Eukaryotic centromeres are maintained at specific chromosomal sites over many generations. In the budding yeast Saccharomyces cerevisiae, centromeres are genetic elements defined by a DNA sequence that is both necessary and sufficient for function; whereas, in most other eukaryotes, centromeres are maintained by poorly characterized epigenetic mechanisms in which DNA has a less definitive role. Here we use the pathogenic yeast Candida albicans as a model organism to study the DNA replication properties of centromeric DNA. By determining the genome-wide replication timing program of the C. albicans genome, we discovered that each centromere is associated with a replication origin that is the first to fire on its respective chromosome. Importantly, epigenetic formation of new ectopic centromeres (neocentromeres) was accompanied by shifts in replication timing, such that a neocentromere became the first to replicate and became associated with origin recognition complex (ORC) components. Furthermore, changing the level of the centromere-specific histone H3 isoform led to a concomitant change in levels of ORC association with centromere regions, further supporting the idea that centromere proteins determine origin activity. Finally, analysis of centromere-associated DNA revealed a replication-dependent sequence pattern characteristic of constitutively active replication origins. This strand-biased pattern is conserved, together with centromere position, among related strains and species, in a manner independent of primary DNA sequence. Thus, inheritance of centromere position is correlated with a constitutively active origin of replication that fires at a distinct early time. We suggest a model in which the distinct timing of DNA replication serves as an epigenetic mechanism for the inheritance of centromere position.
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

PubMed Central

Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

2009-01-01

Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to be helpful in order to distinguish between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full-length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
DNA typing of ancient parasite eggs from environmental samples identifies human and animal worm infections in Viking-age settlement.

PubMed

Søe, Martin Jensen; Nejsum, Peter; Fredensborg, Brian Lund; Kapel, Christian Moliin Outzen

2015-02-01

Ancient parasite eggs were recovered from environmental samples collected at a Viking-age settlement in Viborg, Denmark, dated 1018-1030 A.D. Morphological examination identified Ascaris sp., Trichuris sp., and Fasciola sp. eggs, but size and shape did not allow species identification. By carefully selecting genetic markers, PCR amplification and sequencing of ancient DNA (aDNA) isolates resulted in identification of: the human whipworm, Trichuris trichiura , using SSUrRNA sequence homology; Ascaris sp. with 100% homology to cox1 haplotype 07; and Fasciola hepatica using ITS1 sequence homology. The identification of T. trichiura eggs indicates that human fecal material is present and, hence, that the Ascaris sp. haplotype 07 was most likely a human variant in Viking-age Denmark. The location of the F. hepatica finding suggests that sheep or cattle are the most likely hosts. Further, we sequenced the Ascaris sp. 18S rRNA gene in recent isolates from humans and pigs of global distribution and show that this is not a suited marker for species-specific identification. Finally, we discuss ancient parasitism in Denmark and the implementation of aDNA analysis methods in paleoparasitological studies. We argue that when employing species-specific identification, soil samples offer excellent opportunities for studies of human parasite infections and of human and animal interactions of the past.
BplI, a new BcgI-like restriction endonuclease, which recognizes a symmetric sequence.

PubMed Central

Vitkute, J; Maneliene, Z; Petrusyte, M; Janulaitis, A

1997-01-01

Bcg I and Bcg I-like restriction endonucleases cleave double stranded DNA specifically on both sides of their asymmetric recognition sequences which are interrupted by several ambiguous base pairs. Their heterosubunit structure, bifunctionality and stimulation by AdoMet make them different from other classified restriction enzymes. Here we report on a new Bcg I-like restriction endonuclease, Bpl I from Bacillus pumilus , which in contrast to all other Bcg I-like enzymes, recognizes a symmetric interrupted sequence, and which, like Bcg I, cleaves double stranded DNA upstream and downstream of its recognition sequence (8/13)GAGN5CTC(13/8). Like Bcg I, Bpl I is a bifunctional enzyme revealing both DNA cleavage and methyltransferase activities. There are two polypeptides in the homogeneous preparation of Bpl I with molecular masses of approximately 74 and 37 kDa. The sizes of the Bpl I subunits are close to those of Bcg I, but the proportion 1:1 in the final preparation is different from that of 2:1 in Bcg I. Low activity observed with Mg2+increases >100-fold in the presence of AdoMet. Even with AdoMet though, specific cleavage is incomplete. S -adenosylhomocysteine (AdoHcy) or sinefungin can replace AdoMet in the cleavage reaction. AdoHcy activated Bpl I yields complete cleavage of DNA. PMID:9358150
Spectrophotometric and ultrasensitive DNA bioassay by circular-strand displacement polymerization reaction.

PubMed

Yu, Luxin; Wu, Wei; Chen, Junhua; Xiao, Zhuo; Ge, Chenchen; Lie, Puchang; Fang, Zhiyuan; Chen, Lingbo; Zhang, Ya; Zeng, Lingwen

2013-12-07

We demonstrated a new spectrophotometric DNA detection approach based on a circular strand-displacement polymerization reaction for the quantitative detection of sequence specific DNA. In this assay, the hybridization of an immobilized hairpin probe on the microtiter plate, to target DNA, results in a conformational change and leads to a stem separation. A short primer thus anneals with the open stem and triggers a polymerization reaction, allowing a cyclic reaction comprising the release of target DNA and hybridization of the target with the remaining immobilized hairpin probe. Through this cyclical process, a large number of duplex DNA complexes are produced. Finally, the biotin modified duplex DNA products can be detected via the HRP catalyzed substrate 3,3',5,5'-tetramethylbenzidine using a spectrophotometer. As a proof of concept, a short DNA sequence (20-nt) related to the South East Asia (SEA) type deletion of α-thalassemia was chosen as the model target. This proposed assay has a very high sensitivity and selectivity with a dynamic response ranging from 0.1 fM to 10 nM and the detection limit was 8 aM. It can be performed within 2 hours, and it can differentiate target SEA DNA from wild-type DNA. By substituting the hairpin probes used in the present work, this assay can be used to detect other subtypes of genetic disorders.
Whole genome DNA methylation: beyond genes silencing.

PubMed

Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati

2017-01-17

The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the major biological consequences of DNA methylation recently discovered. We also discuss the necessity of tuning DNA methylation resolution into an adequate scale to ease the integration of the methylome information with other chromatin features and transcription events such as gene expression, nucleosome positioning, transcription factors binding dynamic, gene splicing and genomic imprinting. Finally, our review sheds light on DNA methylation heterogeneity in cell population and the different approaches used for its assessment, including the contribution of single cell DNA analysis technology.
Whole genome DNA methylation: beyond genes silencing

PubMed Central

Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati

2017-01-01

The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the major biological consequences of DNA methylation recently discovered. We also discuss the necessity of tuning DNA methylation resolution into an adequate scale to ease the integration of the methylome information with other chromatin features and transcription events such as gene expression, nucleosome positioning, transcription factors binding dynamic, gene splicing and genomic imprinting. Finally, our review sheds light on DNA methylation heterogeneity in cell population and the different approaches used for its assessment, including the contribution of single cell DNA analysis technology. PMID:27895318
BioVLAB-mCpG-SNP-EXPRESS: A system for multi-level and multi-perspective analysis and exploration of DNA methylation, sequence variation (SNPs), and gene expression from multi-omics data.

PubMed

Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun

2016-12-01

Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. In order to alleviate analysis complexity, all the process are fully automated, and graphical workflow system is integrated to represent real-time analysis progression. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data as input in the form of the raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for the multi-level, multi-perspective exploration. Multi-level interpretation can be done by either gene, gene set, pathway or network level and multi-perspective exploration can be explored from either gene expression, DNA methylation, sequence variation, or their relationship perspective. The utility of the system is demonstrated by performing analysis of phenotypically distinct 30 breast cancer cell line data set. BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright Â© 2016 Elsevier Inc. All rights reserved.
Characterization of monomeric DNA-binding protein Histone H1 in Leishmania braziliensis.

PubMed

Carmelo, Emma; González, Gloria; Cruz, Teresa; Osuna, Antonio; Hernández, Mariano; Valladares, Basilio

2011-08-01

Histone H1 in Leishmania presents relevant differences compared to higher eukaryote counterparts, such as the lack of a DNA-binding central globular domain. Despite that, it is apparently fully functional since its differential expression levels have been related to changes in chromatin condensation and infectivity, among other features. The localization and the aggregation state of L. braziliensis H1 has been determined by immunolocalization, mass spectrometry, cross-linking and electrophoretic mobility shift assays. Analysis of H1 sequences from the Leishmania Genome Database revealed that our protein is included in a very divergent group of histones H1 that is present only in L. braziliensis. An antibody raised against recombinant L. braziliensis H1 recognized specifically that protein by immunoblot in L. braziliensis extracts, but not in other Leishmania species, a consequence of the sequence divergences observed among Leishmania species. Mass spectrometry analysis and in vitro DNA-binding experiments have also proven that L. braziliensis H1 is monomeric in solution, but oligomerizes upon binding to DNA. Finally, despite the lack of a globular domain, L. braziliensis H1 is able to form complexes with DNA in vitro, with higher affinity for supercoiled compared to linear DNA.
Interaction of Zn(II)bleomycin-A2 and Zn(II)peplomycin with a DNA hairpin containing the 5'-GT-3' binding site in comparison with the 5'-GC-3' binding site studied by NMR spectroscopy.

PubMed

Follett, Shelby E; Ingersoll, Azure D; Murray, Sally A; Reilly, Teresa M; Lehmann, Teresa E

2017-10-01

Bleomycins are a group of glycopeptide antibiotics synthesized by Streptomyces verticillus that are widely used for the treatment of various neoplastic diseases. These antibiotics have the ability to chelate a metal center, mainly Fe(II), and cause site-specific DNA cleavage. Bleomycins are differentiated by their C-terminal regions. Although this antibiotic family is a successful course of treatment for some types of cancers, it is known to cause pulmonary fibrosis. Previous studies have identified that bleomycin-related pulmonary toxicity is linked to the C-terminal region of these drugs. This region has been shown to closely interact with DNA. We examined the binding of Zn(II)peplomycin and Zn(II)bleomycin-A 2 to a DNA hairpin of sequence 5'-CCAGTATTTTTACTGG-3', containing the binding site 5'-GT-3', and compared the results with those obtained from our studies of the same MBLMs bound to a DNA hairpin containing the binding site 5'-GC-3'. We provide evidence that the DNA base sequence has a strong impact in the final structure of the drug-target complex.
DNA Sequencing Using capillary Electrophoresis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dr. Barry Karger

2011-05-09

The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating by capillary electrophoresis double strand oligonucleotides using cross-linkedmore » polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double strand DNA. We note that the method is used even today to assay purity of double stranded DNA fragments. Our focus, of course, was on the separation of single stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other application papers of sequencing up to this level were also published in the mid 1990's. A major interest of the sequencing community has always been read length. The longer the sequence read per run the more efficient the process as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecule weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDA) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to prepare reproducibility linear polyacrylamide polymer with an average MWT of 9MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentration of salts and dideoxynucleotide remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reading read length. In two papers, we examined the role of individual components from the sequencing reaction and then developed a protocol to reduce the deleterious salts. We demonstrated a robust method for achieving long read length DNA sequencing. Continuing our advances, we next demonstrated the achievement of over 1000 bases in less than one hour with a base calling accuracy of between 98 and 99%. In this work, we implemented energy transfer dyes which allowed for cleaner differentiation of the 4 dye labeled terminal nucleotides. In addition, we developed improved base calling software to help read sequencing when the separation was only minimal as occurs at long read lengths. Another critical parameter we studied was column temperature. We demonstrated that read lengths improved as the column temperature was increased from room temperature to 60 C or 70 C. The higher temperature relaxed the DNA chains under the influence of the high electric field.« less
The yeast Pif1 helicase prevents genomic instability caused by G-quadruplex-forming CEB1 sequences in vivo.

PubMed

Ribeyre, Cyril; Lopes, Judith; Boulé, Jean-Baptiste; Piazza, Aurèle; Guédin, Aurore; Zakian, Virginia A; Mergny, Jean-Louis; Nicolas, Alain

2009-05-01

In budding yeast, the Pif1 DNA helicase is involved in the maintenance of both nuclear and mitochondrial genomes, but its role in these processes is still poorly understood. Here, we provide evidence for a new Pif1 function by demonstrating that its absence promotes genetic instability of alleles of the G-rich human minisatellite CEB1 inserted in the Saccharomyces cerevisiae genome, but not of other tandem repeats. Inactivation of other DNA helicases, including Sgs1, had no effect on CEB1 stability. In vitro, we show that CEB1 repeats formed stable G-quadruplex (G4) secondary structures and the Pif1 protein unwinds these structures more efficiently than regular B-DNA. Finally, synthetic CEB1 arrays in which we mutated the potential G4-forming sequences were no longer destabilized in pif1Delta cells. Hence, we conclude that CEB1 instability in pif1Delta cells depends on the potential to form G-quadruplex structures, suggesting that Pif1 could play a role in the metabolism of G4-forming sequences.

Reproducibility and Consistency of In Vitro Nucleosome Reconstitutions Demonstrated by Invitrosome Isolation and Sequencing

PubMed Central

Kempton, Colton E.; Heninger, Justin R.; Johnson, Steven M.

2014-01-01

Nucleosomes and their positions in the eukaryotic genome play an important role in regulating gene expression by influencing accessibility to DNA. Many factors influence a nucleosome's final position in the chromatin landscape including the underlying genomic sequence. One of the primary reasons for performing in vitro nucleosome reconstitution experiments is to identify how the underlying DNA sequence will influence a nucleosome's position in the absence of other compounding cellular factors. However, concerns have been raised about the reproducibility of data generated from these kinds of experiments. Here we present data for in vitro nucleosome reconstitution experiments performed on linear plasmid DNA that demonstrate that, when coverage is deep enough, these reconstitution experiments are exquisitely reproducible and highly consistent. Our data also suggests that a coverage depth of 35X be maintained for maximal confidence when assaying nucleosome positions, but lower coverage levels may be generally sufficient. These coverage depth recommendations are sufficient in the experimental system and conditions used in this study, but may vary depending on the exact parameters used in other systems. PMID:25093869
A novel low energy electron microscope for DNA sequencing and surface analysis

DOE PAGES

Mankos, M.; Shadman, K.; Persson, H. H. J.; ...

2014-01-31

Monochromatic, aberration-corrected, dual-beam low energy electron microscopy (MAD-LEEM) is a novel technique that is directed towards imaging nanostructures and surfaces with sub-nanometer resolution. The technique combines a monochromator, a mirror aberration corrector, an energy filter, and dual beam illumination in a single instrument. The monochromator reduces the energy spread of the illuminating electron beam, which significantly improves spectroscopic and spatial resolution. Simulation results predict that the novel aberration corrector design will eliminate the second rank chromatic and third and fifth order spherical aberrations, thereby improving the resolution into the sub-nanometer regime at landing energies as low as one hundred electron-Volts.more » The energy filter produces a beam that can extract detailed information about the chemical composition and local electronic states of non-periodic objects such as nanoparticles, interfaces, defects, and macromolecules. The dual flood illumination eliminates charging effects that are generated when a conventional LEEM is used to image insulating specimens. A potential application for MAD-LEEM is in DNA sequencing, which requires high resolution to distinguish the individual bases and high speed to reduce the cost. The MAD-LEEM approach images the DNA with low electron impact energies, which provides nucleobase contrast mechanisms without organometallic labels. Furthermore, the micron-size field of view when combined with imaging on the fly provides long read lengths, thereby reducing the demand on assembling the sequence. Finally, experimental results from bulk specimens with immobilized single-base oligonucleotides demonstrate that base specific contrast is available with reflected, photo-emitted, and Auger electrons. Image contrast simulations of model rectangular features mimicking the individual nucleotides in a DNA strand have been developed to translate measurements of contrast on bulk DNA to the detectability of individual DNA bases in a sequence.« less
THE PHYLOGENETIC RELATIONSHIPS OF WHALE-FALL VESICOMYID CLAMS BASED ON MITOCHONDRIAL COI DNA SEQUENCES. (U915626)

EPA Science Inventory

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
IDENTICAL RIBOSOMAL DNA SEQUENCE DATA FROM PFIESTERIA PISCICIDA (DINOPHYCEAE) ISOLATES WITH DIFFERENT TOXICITY PHENOTYPES. (R827084)

EPA Science Inventory

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
Bisulfite Conversion of DNA: Performance Comparison of Different Kits and Methylation Quantitation of Epigenetic Biomarkers that Have the Potential to Be Used in Non-Invasive Prenatal Testing

PubMed Central

Leontiou, Chrysanthia A.; Hadjidaniel, Michael D.; Mina, Petros; Antoniou, Pavlos; Ioannides, Marios; Patsalis, Philippos C.

2015-01-01

Introduction Epigenetic alterations, including DNA methylation, play an important role in the regulation of gene expression. Several methods exist for evaluating DNA methylation, but bisulfite sequencing remains the gold standard by which base-pair resolution of CpG methylation is achieved. The challenge of the method is that the desired outcome (conversion of unmethylated cytosines) positively correlates with the undesired side effects (DNA degradation and inappropriate conversion), thus several commercial kits try to adjust a balance between the two. The aim of this study was to compare the performance of four bisulfite conversion kits [Premium Bisulfite kit (Diagenode), EpiTect Bisulfite kit (Qiagen), MethylEdge Bisulfite Conversion System (Promega) and BisulFlash DNA Modification kit (Epigentek)] regarding conversion efficiency, DNA degradation and conversion specificity. Methods Performance was tested by combining fully methylated and fully unmethylated λ-DNA controls in a series of spikes by means of Sanger sequencing (0%, 25%, 50% and 100% methylated spikes) and Next-Generation Sequencing (0%, 3%, 5%, 7%, 10%, 25%, 50% and 100% methylated spikes). We also studied the methylation status of two of our previously published differentially methylated regions (DMRs) at base resolution by using spikes of chorionic villus sample in whole blood. Results The kits studied showed different but comparable results regarding DNA degradation, conversion efficiency and conversion specificity. However, the best performance was observed with the MethylEdge Bisulfite Conversion System (Promega) followed by the Premium Bisulfite kit (Diagenode). The DMRs, EP6 and EP10, were confirmed to be hypermethylated in the CVS and hypomethylated in whole blood. Conclusion Our findings indicate that the MethylEdge Bisulfite Conversion System (Promega) was shown to have the best performance among the kits. In addition, the methylation level of two of our DMRs, EP6 and EP10, was confirmed. Finally, we showed that bisulfite amplicon sequencing is a suitable approach for methylation analysis of targeted regions. PMID:26247357
Bisulfite Conversion of DNA: Performance Comparison of Different Kits and Methylation Quantitation of Epigenetic Biomarkers that Have the Potential to Be Used in Non-Invasive Prenatal Testing.

PubMed

Leontiou, Chrysanthia A; Hadjidaniel, Michael D; Mina, Petros; Antoniou, Pavlos; Ioannides, Marios; Patsalis, Philippos C

2015-01-01

Epigenetic alterations, including DNA methylation, play an important role in the regulation of gene expression. Several methods exist for evaluating DNA methylation, but bisulfite sequencing remains the gold standard by which base-pair resolution of CpG methylation is achieved. The challenge of the method is that the desired outcome (conversion of unmethylated cytosines) positively correlates with the undesired side effects (DNA degradation and inappropriate conversion), thus several commercial kits try to adjust a balance between the two. The aim of this study was to compare the performance of four bisulfite conversion kits [Premium Bisulfite kit (Diagenode), EpiTect Bisulfite kit (Qiagen), MethylEdge Bisulfite Conversion System (Promega) and BisulFlash DNA Modification kit (Epigentek)] regarding conversion efficiency, DNA degradation and conversion specificity. Performance was tested by combining fully methylated and fully unmethylated λ-DNA controls in a series of spikes by means of Sanger sequencing (0%, 25%, 50% and 100% methylated spikes) and Next-Generation Sequencing (0%, 3%, 5%, 7%, 10%, 25%, 50% and 100% methylated spikes). We also studied the methylation status of two of our previously published differentially methylated regions (DMRs) at base resolution by using spikes of chorionic villus sample in whole blood. The kits studied showed different but comparable results regarding DNA degradation, conversion efficiency and conversion specificity. However, the best performance was observed with the MethylEdge Bisulfite Conversion System (Promega) followed by the Premium Bisulfite kit (Diagenode). The DMRs, EP6 and EP10, were confirmed to be hypermethylated in the CVS and hypomethylated in whole blood. Our findings indicate that the MethylEdge Bisulfite Conversion System (Promega) was shown to have the best performance among the kits. In addition, the methylation level of two of our DMRs, EP6 and EP10, was confirmed. Finally, we showed that bisulfite amplicon sequencing is a suitable approach for methylation analysis of targeted regions.
Zinc finger nuclease technology: advances and obstacles in modelling and treating genetic disorders.

PubMed

Jabalameli, Hamid Reza; Zahednasab, Hamid; Karimi-Moghaddam, Amin; Jabalameli, Mohammad Reza

2015-03-01

Zinc finger nucleases (ZFNs) are engineered restriction enzymes designed to target specific DNA sequences within the genome. Assembly of zinc finger DNA-binding domain to a DNA-cleavage domain enables the enzyme machinery to target unique locus in the genome and invoke endogenous DNA repair mechanisms. This machinery offers a versatile approach in allele editing and gene therapy. Here we discuss the architecture of ZFNs and strategies for generating targeted modifications within the genome. We review advances in gene therapy and modelling of the disease using these enzymes and finally, discuss the practical obstacles in using this technology. Copyright © 2014 Elsevier B.V. All rights reserved.
Structure and Dynamics of DNA and RNA Double Helices Obtained from the CCG and GGC Trinucleotide Repeats.

PubMed

Pan, Feng; Man, Viet Hoang; Roland, Christopher; Sagui, Celeste

2018-04-26

Expansions of both GGC and CCG sequences lead to a number of expandable, trinucleotide repeat (TR) neurodegenerative diseases. Understanding of these diseases involves, among other things, the structural characterization of the atypical DNA and RNA secondary structures. We have performed molecular dynamics simulations of (GCC) n and (GGC) n homoduplexes in order to characterize their conformations, stability, and dynamics. Each TR has two reading frames, which results in eight nonequivalent RNA/DNA homoduplexes, characterized by CpG or GpC steps between the Watson-Crick base pairs. Free energy maps for the eight homoduplexes indicate that the C-mismatches prefer anti-anti conformations, while G-mismatches prefer anti-syn conformations. Comparison between three modifications of the DNA AMBER force field shows good agreement for the mismatch free energy maps. The mismatches in DNA-GCC (but not CCG) are extrahelical, forming an extended e-motif. The mismatched duplexes exhibit characteristic sequence-dependent step twist, with strong variations in the G-rich sequences and the e-motif. The distribution of Na + is highly localized around the mismatches, especially G-mismatches. In the e-motif, there is strong Na + binding by two G(N7) atoms belonging to the pseudo GpC step created when cytosines are extruded and by extrahelical cytosines. Finally, we used a novel technique based on fast melting by means of an infrared laser pulse to classify the relative stability of the different DNA-CCG and -GGC homoduplexes.
Contribution of the first K-homology domain of poly(C)-binding protein 1 to its affinity and specificity for C-rich oligonucleotides

PubMed Central

Yoga, Yano M. K.; Traore, Daouda A. K.; Sidiqi, Mahjooba; Szeto, Chris; Pendini, Nicole R.; Barker, Andrew; Leedman, Peter J.; Wilce, Jacqueline A.; Wilce, Matthew C. J.

2012-01-01

Poly-C-binding proteins are triple KH (hnRNP K homology) domain proteins with specificity for single stranded C-rich RNA and DNA. They play diverse roles in the regulation of protein expression at both transcriptional and translational levels. Here, we analyse the contributions of individual αCP1 KH domains to binding C-rich oligonucleotides using biophysical and structural methods. Using surface plasmon resonance (SPR), we demonstrate that KH1 makes the most stable interactions with both RNA and DNA, KH3 binds with intermediate affinity and KH2 only interacts detectibly with DNA. The crystal structure of KH1 bound to a 5′-CCCTCCCT-3′ DNA sequence shows a 2:1 protein:DNA stoichiometry and demonstrates a molecular arrangement of KH domains bound to immediately adjacent oligonucleotide target sites. SPR experiments, with a series of poly-C-sequences reveals that cytosine is preferred at all four positions in the oligonucleotide binding cleft and that a C-tetrad binds KH1 with 10 times higher affinity than a C-triplet. The basis for this high affinity interaction is finally detailed with the structure determination of a KH1.W.C54S mutant bound to 5′-ACCCCA-3′ DNA sequence. Together, these data establish the lead role of KH1 in oligonucleotide binding by αCP1 and reveal the molecular basis of its specificity for a C-rich tetrad. PMID:22344691
Contribution of the first K-homology domain of poly(C)-binding protein 1 to its affinity and specificity for C-rich oligonucleotides.

PubMed

Yoga, Yano M K; Traore, Daouda A K; Sidiqi, Mahjooba; Szeto, Chris; Pendini, Nicole R; Barker, Andrew; Leedman, Peter J; Wilce, Jacqueline A; Wilce, Matthew C J

2012-06-01

Poly-C-binding proteins are triple KH (hnRNP K homology) domain proteins with specificity for single stranded C-rich RNA and DNA. They play diverse roles in the regulation of protein expression at both transcriptional and translational levels. Here, we analyse the contributions of individual αCP1 KH domains to binding C-rich oligonucleotides using biophysical and structural methods. Using surface plasmon resonance (SPR), we demonstrate that KH1 makes the most stable interactions with both RNA and DNA, KH3 binds with intermediate affinity and KH2 only interacts detectibly with DNA. The crystal structure of KH1 bound to a 5'-CCCTCCCT-3' DNA sequence shows a 2:1 protein:DNA stoichiometry and demonstrates a molecular arrangement of KH domains bound to immediately adjacent oligonucleotide target sites. SPR experiments, with a series of poly-C-sequences reveals that cytosine is preferred at all four positions in the oligonucleotide binding cleft and that a C-tetrad binds KH1 with 10 times higher affinity than a C-triplet. The basis for this high affinity interaction is finally detailed with the structure determination of a KH1.W.C54S mutant bound to 5'-ACCCCA-3' DNA sequence. Together, these data establish the lead role of KH1 in oligonucleotide binding by αCP1 and reveal the molecular basis of its specificity for a C-rich tetrad.
Lactobacillus hayakitensis sp. nov., isolated from intestines of healthy thoroughbreds.

PubMed

Morita, Hidetoshi; Shiratori, Chiharu; Murakami, Masaru; Takami, Hideto; Kato, Yukio; Endo, Akihito; Nakajima, Fumihiko; Takagi, Misako; Akita, Hiroaki; Okada, Sanae; Masaoka, Toshio

2007-12-01

Two strains, KBL13(T) and GBL13, were isolated as one of intestinal lactobacilli from the faecal specimens from different thoroughbreds of the same farm where they were born in Hokkaido, Japan. They were Gram-positive, facultatively anaerobic, catalase-negative, non-spore-forming and non-motile rods. KBL13(T) and GBL13 homofermentatively metabolize glucose, and produce lactate as the sole final product from glucose. The 16S rRNA gene sequence, DNA-DNA hybridization, DNA G+C content and biochemical characterization indicated that these two strains, KBL13(T) and GBL13, belong to the same species. In the representative strain, KBL13(T), the DNA G+C content was 34.3 mol%. Lactobacillus salivarius JCM 1231(T) (=ATCC 11741(T); AF089108) is the type strain most closely related to the strain KBL13(T) as shown in the phylogenetic tree, and the 16S rRNA gene sequence identity showed 96.0 % (1425/1484 bp). Comparative 16S rRNA gene sequence analysis of this strain indicated that the two isolated strains belong to the genus Lactobacillus and that they formed a branch distinct from their closest relatives, L. salivarius, Lactobacillus aviarius, Lactobacillus saerimneri and Lactobacillus acidipiscis. DNA-DNA reassociation experiments with L. salivarius and L. aviarius confirmed that KBL13(T) represents a novel species, for which the name Lactobacillus hayakitensis sp. nov. is proposed. The type strain is KBL13(T) (=JCM 14209(T)=DSM 18933(T)).
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
The dynamics of genome replication using deep sequencing

PubMed Central

Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

2014-01-01

Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142
Scar-less multi-part DNA assembly design automation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan J.

The present invention provides a method of a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which tomore » assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.« less
The multi-zinc finger protein ZNF217 contacts DNA through a two-finger domain.

PubMed

Nunez, Noelia; Clifton, Molly M K; Funnell, Alister P W; Artuz, Crisbel; Hallal, Samantha; Quinlan, Kate G R; Font, Josep; Vandevenne, Marylène; Setiyaputra, Surya; Pearson, Richard C M; Mackay, Joel P; Crossley, Merlin

2011-11-04

Classical C2H2 zinc finger proteins are among the most abundant transcription factors found in eukaryotes, and the mechanisms through which they recognize their target genes have been extensively investigated. In general, a tandem array of three fingers separated by characteristic TGERP links is required for sequence-specific DNA recognition. Nevertheless, a significant number of zinc finger proteins do not contain a hallmark three-finger array of this type, raising the question of whether and how they contact DNA. We have examined the multi-finger protein ZNF217, which contains eight classical zinc fingers. ZNF217 is implicated as an oncogene and in repressing the E-cadherin gene. We show that two of its zinc fingers, 6 and 7, can mediate contacts with DNA. We examine its putative recognition site in the E-cadherin promoter and demonstrate that this is a suboptimal site. NMR analysis and mutagenesis is used to define the DNA binding surface of ZNF217, and we examine the specificity of the DNA binding activity using fluorescence anisotropy titrations. Finally, sequence analysis reveals that a variety of multi-finger proteins also contain two-finger units, and our data support the idea that these may constitute a distinct subclass of DNA recognition motif.
The Multi-zinc Finger Protein ZNF217 Contacts DNA through a Two-finger Domain*

PubMed Central

Nunez, Noelia; Clifton, Molly M. K.; Funnell, Alister P. W.; Artuz, Crisbel; Hallal, Samantha; Quinlan, Kate G. R.; Font, Josep; Vandevenne, Marylène; Setiyaputra, Surya; Pearson, Richard C. M.; Mackay, Joel P.; Crossley, Merlin

2011-01-01

Classical C2H2 zinc finger proteins are among the most abundant transcription factors found in eukaryotes, and the mechanisms through which they recognize their target genes have been extensively investigated. In general, a tandem array of three fingers separated by characteristic TGERP links is required for sequence-specific DNA recognition. Nevertheless, a significant number of zinc finger proteins do not contain a hallmark three-finger array of this type, raising the question of whether and how they contact DNA. We have examined the multi-finger protein ZNF217, which contains eight classical zinc fingers. ZNF217 is implicated as an oncogene and in repressing the E-cadherin gene. We show that two of its zinc fingers, 6 and 7, can mediate contacts with DNA. We examine its putative recognition site in the E-cadherin promoter and demonstrate that this is a suboptimal site. NMR analysis and mutagenesis is used to define the DNA binding surface of ZNF217, and we examine the specificity of the DNA binding activity using fluorescence anisotropy titrations. Finally, sequence analysis reveals that a variety of multi-finger proteins also contain two-finger units, and our data support the idea that these may constitute a distinct subclass of DNA recognition motif. PMID:21908891
Toehold-mediated strand displacement reaction triggered isothermal DNA amplification for highly sensitive and selective fluorescent detection of single-base mutation.

PubMed

Zhu, Jing; Ding, Yongshun; Liu, Xingti; Wang, Lei; Jiang, Wei

2014-09-15

Highly sensitive and selective detection strategy for single-base mutations is essential for risk assessment of malignancy and disease prognosis. In this work, a fluorescent detection method for single-base mutation was proposed based on high selectivity of toehold-mediated strand displacement reaction (TSDR) and powerful signal amplification capability of isothermal DNA amplification. A discrimination probe was specially designed with a stem-loop structure and an overhanging toehold domain. Hybridization between the toehold domain and the perfect matched target initiated the TSDR along with the unfolding of the discrimination probe. Subsequently, the target sequence acted as a primer to initiate the polymerization and nicking reactions, which released a great abundant of short sequences. Finally, the released strands were annealed with the reporter probe, launching another polymerization and nicking reaction to produce lots of G-quadruplex DNA, which could bind the N-methyl mesoporphyrin IX to yield an enhanced fluorescence response. However, when there was even a single base mismatch in the target DNA, the TSDR was suppressed and so subsequent isothermal DNA amplification and fluorescence response process could not occur. The proposed approach has been successfully implemented for the identification of the single-base mutant sequences in the human KRAS gene with a detection limit of 1.8 pM. Furthermore, a recovery of 90% was obtained when detecting the target sequence in spiked HeLa cells lysate, demonstrating the feasibility of this detection strategy for single-base mutations in biological samples. Copyright © 2014 Elsevier B.V. All rights reserved.
Cardiovascular genetics: technological advancements and applicability for dilated cardiomyopathy.

PubMed

Kummeling, G J M; Baas, A F; Harakalova, M; van der Smagt, J J; Asselbergs, F W

2015-07-01

Genetics plays an important role in the pathophysiology of cardiovascular diseases, and is increasingly being integrated into clinical practice. Since 2008, both capacity and cost-efficiency of mutation screening of DNA have been increased magnificently due to the technological advancement obtained by next-generation sequencing. Hence, the discovery rate of genetic defects in cardiovascular genetics has grown rapidly and the financial threshold for gene diagnostics has been lowered, making large-scale DNA sequencing broadly accessible. In this review, the genetic variants, mutations and inheritance models are briefly introduced, after which an overview is provided of current clinical and technological applications in gene diagnostics and research for cardiovascular disease and in particular, dilated cardiomyopathy. Finally, a reflection on the future perspectives in cardiogenetics is given.
Genomics: The Science and Technology Behind the Human Genome Project (by Charles R. Cantor and Cassandra L. Smith)

NASA Astrophysics Data System (ADS)

Serra, Reviewed By Martin J.

2000-01-01

Genomics is one of the most rapidly expanding areas of science. This book is an outgrowth of a series of lectures given by one of the former heads (CRC) of the Human Genome Initiative. The book is designed to reach a wide audience, from biologists with little chemical or physical science background through engineers, computer scientists, and physicists with little current exposure to the chemical or biological principles of genetics. The text starts with a basic review of the chemical and biological properties of DNA. However, without either a biochemistry background or a supplemental biochemistry text, this chapter and much of the rest of the text would be difficult to digest. The second chapter is designed to put DNA into the context of the larger chromosomal unit. Specialized chromosomal structures and sequences (centromeres, telomeres) are introduced, leading to a section on chromosome organization and purification. The next 4 chapters cover the physical (hybridization, electrophoresis), chemical (polymerase chain reaction), and biological (genetic) techniques that provide the backbone of genomic analysis. These chapters cover in significant detail the fundamental principles underlying each technique and provide a firm background for the remainder of the text. Chapters 79 consider the need and methods for the development of physical maps. Chapter 7 primarily discusses chromosomal localization techniques, including in situ hybridization, FISH, and chromosome paintings. The next two chapters focus on the development of libraries and clones. In particular, Chapter 9 considers the limitations of current mapping and clone production. The current state and future of DNA sequencing is covered in the next three chapters. The first considers the current methods of DNA sequencing - especially gel-based methods of analysis, although other possible approaches (mass spectrometry) are introduced. Much of the chapter addresses the limitations of current methods, including analysis of error in sequencing and current bottlenecks in the sequencing effort. The next chapter describes the steps necessary to scale current technologies for the sequencing of entire genomes. Chapter 12 examines alternate methods for DNA sequencing. Initially, methods of single-molecule sequencing and sequencing by microscopy are introduced; the majority of the chapter is devoted to the development of DNA sequencing methods using chip microarrays and hybridization. The remaining chapters (13-15) consider the uses and analysis of DNA sequence information. The initial focus is on the identification of genes. Several examples are given of the use of DNA sequence information for diagnosis of inherited or infectious diseases. The sequence-specific manipulation of DNA is discussed in Chapter 14. The final chapter deals with the implications of large-scale sequencing, including methods for identifying genes and finding errors in DNA sequences, to the development of computer algorithms for the interpretation of DNA sequence information. The text figures are black and white line drawings that, although clearly done, seem a bit primitive for 1999. While I appreciated the simplicity of the drawings, many students accustomed to more colorful presentations will find them wanting. The four color figures in the center of the text seem an afterthought and add little to the text's clarity. Each chapter has a set of additional reading sources, mostly primary sources. Often, specialized topics are offset into boxes that provide clarification and amplification without cluttering the text. An appendix includes a list of the Web-based database resources. As an undergraduate instructor who has previously taught biochemistry, molecular biology, and a course on the human genome, I found many interesting tidbits and amplifications throughout the text. I would recommend this book as a text for an advanced undergraduate or beginning graduate course in genomics. Although the text works though several examples of genetic and genome analysis, additional problem/homework sets would need to be developed to ensure student comprehension. The text steers clear of the ethical implications of the Human Genome Initiative and remains true to its subtitle The Science and Technology .
Cloning of human prourokinase cDNA without the signal peptide and expression in Escherichia coli.

PubMed

Hu, B; Li, J; Yu, W; Fang, J

1993-01-01

Human prourokinase (pro-UK) cDNA without the signal peptide was obtained using synthetic oligonucleotide and DNA recombination techniques and was successfully expressed in E. coli. The plasmid pMMUK which contained pro-UK cDNA (including both the entire coding sequence and the sequence for signal peptide) was digested with Hind III and PstI, so that the N-terminal 371-bp fragment could be recovered. A 304-bp fragment was collected from the 371-bp fragment after partial digestion with Fnu4HI in order to remove the signal peptide sequence. An intermediate plasmid was formed after this 304-bp fragment and the synthetic oligonucleotide was ligated with pUC18. Correctness of the ligation was confirmed by enzyme digestion and sequencing. By joining the PstI-PstI fragment of pro-UK to the plasmid we obtained the final plasmid which contained the entire coding sequence of pro-UK without the signal peptide. The coding sequence with correct orientation was inserted into pBV220 under the control of the temperature-induced promoter PRPL, and mature pro-UK was expressed in E. coli at 42 degrees C. Both sonicated supernatant and inclusion bodies of the bacterial host JM101 showed positive results by ELISA and FAPA assays. After renaturation, the biological activity of the expressed product was increased from 500-1000IU/L to about 60,000IU/L. The bacterial pro-UK showed a molecular weight of about 47,000 daltons by Western blot analysis. It can be completely inhibited by UK antiserum but not by t-PA antiserum nor by normal rabbit serum.

A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

PubMed

Bansal, Vikas

2017-03-14

PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from "natural" read duplicates that represent independent DNA fragments and therefore, over-estimate the PCR duplication rate for DNA-seq and RNA-seq experiments. In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45-50% of read duplicates correspond to natural read duplicates likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70-95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression and identified outlier samples with a 2-fold greater PCR duplication rate than other samples. The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates .
Recovery of infectious virus from full-length cowpox virus (CPXV) DNA cloned as a bacterial artificial chromosome (BAC)

PubMed Central

2011-01-01

Transmission from pet rats and cats to humans as well as severe infection in felids and other animal species have recently drawn increasing attention to cowpox virus (CPXV). We report the cloning of the entire genome of cowpox virus strain Brighton Red (BR) as a bacterial artificial chromosome (BAC) in Escherichia coli and the recovery of infectious virus from cloned DNA. Generation of a full-length CPXV DNA clone was achieved by first introducing a mini-F vector, which allows maintenance of large circular DNA in E. coli, into the thymidine kinase locus of CPXV by homologous recombination. Circular replication intermediates were then electroporated into E. coli DH10B cells. Upon successful establishment of the infectious BR clone, we modified the full-length clone such that recombination-mediated excision of bacterial sequences can occur upon transfection in eukaryotic cells. This self-excision of the bacterial replicon is made possible by a sequence duplication within mini-F sequences and allows recovery of recombinant virus progeny without remaining marker or vector sequences. The in vitro growth properties of viruses derived from both BAC clones were determined and found to be virtually indistinguishable from those of parental, wild-type BR. Finally, the complete genomic sequence of the infectious clone was determined and the cloned viral genome was shown to be identical to that of the parental virus. In summary, the generated infectious clone will greatly facilitate studies on individual genes and pathogenesis of CPXV. Moreover, the vector potential of CPXV can now be more systematically explored using this newly generated tool. PMID:21314965
Structural Transformation of Wireframe DNA Origami via DNA Polymerase Assisted Gap-Filling.

PubMed

Agarwal, Nayan P; Matthies, Michael; Joffroy, Bastian; Schmidt, Thorsten L

2018-03-27

The programmability of DNA enables constructing nanostructures with almost any arbitrary shape, which can be decorated with many functional materials. Moreover, dynamic structures can be realized such as molecular motors and walkers. In this work, we have explored the possibility to synthesize the complementary sequences to single-stranded gap regions in the DNA origami scaffold cost effectively by a DNA polymerase rather than by a DNA synthesizer. For this purpose, four different wireframe DNA origami structures were designed to have single-stranded gap regions. This reduced the number of staple strands needed to determine the shape and size of the final structure after gap filling. For this, several DNA polymerases and single-stranded binding (SSB) proteins were tested, with T4 DNA polymerase being the best fit. The structures could be folded in as little as 6 min, and the subsequent optimized gap-filling reaction was completed in less than 3 min. The introduction of flexible gap regions results in fully collapsed or partially bent structures due to entropic spring effects. Finally, we demonstrated structural transformations of such deformed wireframe DNA origami structures with DNA polymerases including the expansion of collapsed structures and the straightening of curved tubes. We anticipate that this approach will become a powerful tool to build DNA wireframe structures more material-efficiently, and to quickly prototype and test new wireframe designs that can be expanded, rigidified, or mechanically switched. Mechanical force generation and structural transitions will enable applications in structural DNA nanotechnology, plasmonics, or single-molecule biophysics.
COMBINING DNA SEQUENCES AND MORPHOLOGY IN SYSTEMATICS: TESTING THE VALIDITY OF THE DRAGONFLY SPECIES CORDULEGASTER BILINEATA. (R826599)

EPA Science Inventory

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

2013-06-25

A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA

2011-01-18

A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

PubMed Central

Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.

2016-01-01

Abstract Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149
DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing.

PubMed

Parson, W; Gusmão, L; Hares, D R; Irwin, J A; Mayr, W R; Morling, N; Pokorak, E; Prinz, M; Salas, A; Schneider, P M; Parsons, T J

2014-11-01

The DNA Commission of the International Society of Forensic Genetics (ISFG) regularly publishes guidelines and recommendations concerning the application of DNA polymorphisms to the question of human identification. Previous recommendations published in 2000 addressed the analysis and interpretation of mitochondrial DNA (mtDNA) in forensic casework. While the foundations set forth in the earlier recommendations still apply, new approaches to the quality control, alignment and nomenclature of mitochondrial sequences, as well as the establishment of mtDNA reference population databases, have been developed. Here, we describe these developments and discuss their application to both mtDNA casework and mtDNA reference population databasing applications. While the generation of mtDNA for forensic casework has always been guided by specific standards, it is now well-established that data of the same quality are required for the mtDNA reference population data used to assess the statistical weight of the evidence. As a result, we introduce guidelines regarding sequence generation, as well as quality control measures based on the known worldwide mtDNA phylogeny, that can be applied to ensure the highest quality population data possible. For both casework and reference population databasing applications, the alignment and nomenclature of haplotypes is revised here and the phylogenetic alignment proffered as acceptable standard. In addition, the interpretation of heteroplasmy in the forensic context is updated, and the utility of alignment-free database searches for unbiased probability estimates is highlighted. Finally, we discuss statistical issues and define minimal standards for mtDNA database searches. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights

PubMed Central

2011-01-01

Background Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. PMID:22189060
Shuffle Optimizer: A Program to Optimize DNA Shuffling for Protein Engineering.

PubMed

Milligan, John N; Garry, Daniel J

2017-01-01

DNA shuffling is a powerful tool to develop libraries of variants for protein engineering. Here, we present a protocol to use our freely available and easy-to-use computer program, Shuffle Optimizer. Shuffle Optimizer is written in the Python computer language and increases the nucleotide homology between two pieces of DNA desired to be shuffled together without changing the amino acid sequence. In addition we also include sections on optimal primer design for DNA shuffling and library construction, a small-volume ultrasonicator method to create sheared DNA, and finally a method to reassemble the sheared fragments and recover and clone the library. The Shuffle Optimizer program and these protocols will be useful to anyone desiring to perform any of the nucleotide homology-dependent shuffling methods.
Contrasting morphological and DNA barcode-suggested species boundaries among shallow-water amphipod fauna from the southern European Atlantic coast.

PubMed

Lobo, Jorge; Ferreira, Maria S; Antunes, Ilisa C; Teixeira, Marcos A L; Borges, Luisa M S; Sousa, Ronaldo; Gomes, Pedro A; Costa, Maria Helena; Cunha, Marina R; Costa, Filipe O

2017-02-01

In this study we compared DNA barcode-suggested species boundaries with morphology-based species identifications in the amphipod fauna of the southern European Atlantic coast. DNA sequences of the cytochrome c oxidase subunit I barcode region (COI-5P) were generated for 43 morphospecies (178 specimens) collected along the Portuguese coast which, together with publicly available COI-5P sequences, produced a final dataset comprising 68 morphospecies and 295 sequences. Seventy-five BINs (Barcode Index Numbers) were assigned to these morphospecies, of which 48 were concordant (i.e., 1 BIN = 1 species), 8 were taxonomically discordant, and 19 were singletons. Twelve species had matching sequences (<2% distance) with conspecifics from distant locations (e.g., North Sea). Seven morphospecies were assigned to multiple, and highly divergent, BINs, including specimens of Corophium multisetosum (18% divergence) and Dexamine spiniventris (16% divergence), which originated from sampling locations on the west coast of Portugal (only about 36 and 250 km apart, respectively). We also found deep divergence (4%-22%) among specimens of seven species from Portugal compared to those from the North Sea and Italy. The detection of evolutionarily meaningful divergence among populations of several amphipod species from southern Europe reinforces the need for a comprehensive re-assessment of the diversity of this faunal group.
One recognition sequence, seven restriction enzymes, five reaction mechanisms

PubMed Central

Gowers, Darren M.; Bellamy, Stuart R.W.; Halford, Stephen E.

2004-01-01

The diversity of reaction mechanisms employed by Type II restriction enzymes was investigated by analysing the reactions of seven endonucleases at the same DNA sequence. NarI, KasI, Mly113I, SfoI, EgeI, EheI and BbeI cleave DNA at several different positions in the sequence 5′-GGCGCC-3′. Their reactions on plasmids with one or two copies of this sequence revealed five distinct mechanisms. These differ in terms of the number of sites the enzyme binds, and the number of phosphodiester bonds cleaved per turnover. NarI binds two sites, but cleaves only one bond per DNA-binding event. KasI also cuts only one bond per turnover but acts at individual sites, preferring intact to nicked sites. Mly113I cuts both strands of its recognition sites, but shows full activity only when bound to two sites, which are then cleaved concertedly. SfoI, EgeI and EheI cut both strands at individual sites, in the manner historically considered as normal for Type II enzymes. Finally, BbeI displays an absolute requirement for two sites in close physical proximity, which are cleaved concertedly. The range of reaction mechanisms for restriction enzymes is thus larger than commonly imagined, as is the number of enzymes needing two recognition sites. PMID:15226412
Graphene oxide-DNA based sensors.

PubMed

Gao, Li; Lian, Chaoqun; Zhou, Yang; Yan, Lirong; Li, Qin; Zhang, Chunxia; Chen, Liang; Chen, Keping

2014-10-15

Since graphene oxide (GO) is readily available and exhibits exceptional optical, electrical, mechanical and chemical properties, it has attracted increasing interests for use in GO-DNA based sensors. This paper reviews the advances in GO-DNA based sensors using DNA as recognition elements. In solution, GO is as an excellent acceptor of fluorescence resonance energy transfer (FRET) to quench the fluorescence in dye labeled DNA sequences. This review discusses the emerging GO-DNA based sensors related to FRET for use in the detection of DNA, proteins, metal ions, cysteine (Cys), and others. The application of the electrochemical GO-DNA based sensors is also summarized because GO possesses exceptional electrochemical properties. The detection mechanisms and the advantages of GO are also revealed and discussed. GO-DNA based sensors perform well at low cost, and high sensitivity, and provide low detection limits. Additionally, GO-DNA based sensors should appear in the near future as scientists explore their usefulness and properties. Finally, future perspectives and possible challenges in this area are outlined. Copyright © 2014 Elsevier B.V. All rights reserved.
Interactions of the EcoRV restriction endonuclease with fluorescent oligodeoxynucleotides.

PubMed

Erskine, S G; Halford, S E

1995-05-19

A self-complementary dodecadeoxyribonucleotide that contains the recognition sequence for the R.EcoRV ENase was synthesised with a primary amino group at its 5' terminus. The 5' amino function was labeled with the fluorescent dye 5-[dimethylamino] napthalene-1-sulfonyl chloride. The labeled oligodeoxyribonucleotide in its duplex form was shown to be a suitable substrate for kinetic studies on the ENase and that no significant dye-DNA or dye-protein interactions occurred. Finally, the binding of R.EcoRV to the labeled DNA was followed by detecting the fluorescence resonance energy transfer between the tryptophans of the protein and the fluorescent labels of the DNA.
Oligo-DNA Custom Macroarray for Monitoring Major Pathogenic and Non-Pathogenic Fungi and Bacteria in the Phyllosphere of Apple Trees

PubMed Central

He, Ying-Hong; Isono, Sayaka; Shibuya, Makoto; Tsuji, Masaharu; Adkar Purushothama, Charith-Raj; Tanaka, Kazuaki; Sano, Teruo

2012-01-01

Background To monitor the richness in microbial inhabitants in the phyllosphere of apple trees cultivated under various cultural and environmental conditions, we developed an oligo-DNA macroarray for major pathogenic and non-pathogenic fungi and bacteria inhabiting the phyllosphere of apple trees. Methods and Findings First, we isolated culturable fungi and bacteria from apple orchards by an agar-plate culture method, and detected 32 fungal and 34 bacterial species. Alternaria, Aureobasidium, Cladosporium, Rhodotorula, Cystofilobasidium, and Epicoccum genera were predominant among the fungi, and Bacillus, Pseudomonas, Sphingomonas, Methylobacterium, and Pantoea genera were predominant among the bacteria. Based on the data, we selected 29 major non-pathogenic and 12 phytopathogenic fungi and bacteria as the targets of macroarray. Forty-one species-specific 40-base pair long oligo-DNA sequences were selected from the nucleotide sequences of rDNA-internal transcribed spacer region for fungi and 16S rDNA for bacteria. The oligo-DNAs were fixed on nylon membrane and hybridized with digoxigenin-labeled cRNA probes prepared for each species. All arrays except those for Alternaria, Bacillus, and their related species, were specifically hybridized. The array was sensitive enough to detect 103 CFU for Aureobasidium pullulans and Bacillus cereus. Nucleotide sequencing of 100 each of independent fungal rDNA-ITS and bacterial 16S-rDNA sequences from apple tree was in agreement with the macroarray data obtained using the same sample. Finally, we analyzed the richness in the microbial inhabitants in the samples collected from apple trees in four orchards. Major apple pathogens that cause scab, Alternaria blotch, and Marssonina blotch were detected along with several non-phytopathogenic fungal and bacterial inhabitants. Conclusions The macroarray technique presented here is a strong tool to monitor the major microbial species and the community structures in the phyllosphere of apple trees and identify key species antagonistic, supportive or co-operative to specific pathogens in the orchard managed under different environmental conditions. PMID:22479577
Oligo-DNA custom macroarray for monitoring major pathogenic and non-pathogenic fungi and bacteria in the phyllosphere of apple trees.

PubMed

He, Ying-Hong; Isono, Sayaka; Shibuya, Makoto; Tsuji, Masaharu; Adkar Purushothama, Charith-Raj; Tanaka, Kazuaki; Sano, Teruo

2012-01-01

To monitor the richness in microbial inhabitants in the phyllosphere of apple trees cultivated under various cultural and environmental conditions, we developed an oligo-DNA macroarray for major pathogenic and non-pathogenic fungi and bacteria inhabiting the phyllosphere of apple trees. First, we isolated culturable fungi and bacteria from apple orchards by an agar-plate culture method, and detected 32 fungal and 34 bacterial species. Alternaria, Aureobasidium, Cladosporium, Rhodotorula, Cystofilobasidium, and Epicoccum genera were predominant among the fungi, and Bacillus, Pseudomonas, Sphingomonas, Methylobacterium, and Pantoea genera were predominant among the bacteria. Based on the data, we selected 29 major non-pathogenic and 12 phytopathogenic fungi and bacteria as the targets of macroarray. Forty-one species-specific 40-base pair long oligo-DNA sequences were selected from the nucleotide sequences of rDNA-internal transcribed spacer region for fungi and 16S rDNA for bacteria. The oligo-DNAs were fixed on nylon membrane and hybridized with digoxigenin-labeled cRNA probes prepared for each species. All arrays except those for Alternaria, Bacillus, and their related species, were specifically hybridized. The array was sensitive enough to detect 10(3) CFU for Aureobasidium pullulans and Bacillus cereus. Nucleotide sequencing of 100 each of independent fungal rDNA-ITS and bacterial 16S-rDNA sequences from apple tree was in agreement with the macroarray data obtained using the same sample. Finally, we analyzed the richness in the microbial inhabitants in the samples collected from apple trees in four orchards. Major apple pathogens that cause scab, Alternaria blotch, and Marssonina blotch were detected along with several non-phytopathogenic fungal and bacterial inhabitants. The macroarray technique presented here is a strong tool to monitor the major microbial species and the community structures in the phyllosphere of apple trees and identify key species antagonistic, supportive or co-operative to specific pathogens in the orchard managed under different environmental conditions.
Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

NASA Astrophysics Data System (ADS)

Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

2017-07-01

DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Synthesis of DNA

DOEpatents

Mariella, Jr., Raymond P.

2008-11-18

A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, importance of DNA sequencing has been fully established. Sanger's Capillary Electrophoresis DNA sequencing methodology is time consuming, cumbersome, hence become more expensive. Lately, because of its versatility DNA sequencing became house hold name, and therefore, there is an urgent need of simple, fast, inexpensive, DNA sequencing technology. In the beginning of this century efforts were made, and Nanopore DNA sequencing technology was developed; still it is infancy, nevertheless, it is the futuristic technology.

The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

PubMed

Murray, Vincent; Chen, Jon K; Tanaka, Mark M

2016-07-01

The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
Characterizing and controlling intrinsic biases of lambda exonuclease in nascent strand sequencing reveals phasing between nucleosomes and G-quadruplex motifs around a subset of human replication origins

PubMed Central

Foulk, Michael S.; Urban, John M.; Casella, Cinzia; Gerbi, Susan A.

2015-01-01

Nascent strand sequencing (NS-seq) is used to discover DNA replication origins genome-wide, allowing identification of features for their specification. NS-seq depends on the ability of lambda exonuclease (λ-exo) to efficiently digest parental DNA while leaving RNA-primer protected nascent strands intact. We used genomics and biochemical approaches to determine if λ-exo digests all parental DNA sequences equally. We report that λ-exo does not efficiently digest G-quadruplex (G4) structures in a plasmid. Moreover, λ-exo digestion of nonreplicating genomic DNA (LexoG0) enriches GC-rich DNA and G4 motifs genome-wide. We used LexoG0 data to control for nascent strand–independent λ-exo biases in NS-seq and validated this approach at the rDNA locus. The λ-exo–controlled NS-seq peaks are not GC-rich, and only 35.5% overlap with 6.8% of all G4s, suggesting that G4s are not general determinants for origin specification but may play a role for a subset. Interestingly, we observed a periodic spacing of G4 motifs and nucleosomes around the peak summits, suggesting that G4s may position nucleosomes at this subset of origins. Finally, we demonstrate that use of Na+ instead of K+ in the λ-exo digestion buffer reduced the effect of G4s on λ-exo digestion and discuss ways to increase both the sensitivity and specificity of NS-seq. PMID:25695952
Characterizing and controlling intrinsic biases of lambda exonuclease in nascent strand sequencing reveals phasing between nucleosomes and G-quadruplex motifs around a subset of human replication origins.

PubMed

Foulk, Michael S; Urban, John M; Casella, Cinzia; Gerbi, Susan A

2015-05-01

Nascent strand sequencing (NS-seq) is used to discover DNA replication origins genome-wide, allowing identification of features for their specification. NS-seq depends on the ability of lambda exonuclease (λ-exo) to efficiently digest parental DNA while leaving RNA-primer protected nascent strands intact. We used genomics and biochemical approaches to determine if λ-exo digests all parental DNA sequences equally. We report that λ-exo does not efficiently digest G-quadruplex (G4) structures in a plasmid. Moreover, λ-exo digestion of nonreplicating genomic DNA (LexoG0) enriches GC-rich DNA and G4 motifs genome-wide. We used LexoG0 data to control for nascent strand-independent λ-exo biases in NS-seq and validated this approach at the rDNA locus. The λ-exo-controlled NS-seq peaks are not GC-rich, and only 35.5% overlap with 6.8% of all G4s, suggesting that G4s are not general determinants for origin specification but may play a role for a subset. Interestingly, we observed a periodic spacing of G4 motifs and nucleosomes around the peak summits, suggesting that G4s may position nucleosomes at this subset of origins. Finally, we demonstrate that use of Na(+) instead of K(+) in the λ-exo digestion buffer reduced the effect of G4s on λ-exo digestion and discuss ways to increase both the sensitivity and specificity of NS-seq. © 2015 Foulk et al.; Published by Cold Spring Harbor Laboratory Press.
Foreign Plastid Sequences in Plant Mitochondria are Frequently Acquired Via Mitochondrion-to-Mitochondrion Horizontal Transfer

PubMed Central

Gandini, C. L.; Sanchez-Puerta, M. V.

2017-01-01

Angiosperm mitochondrial genomes (mtDNA) exhibit variable quantities of alien sequences. Many of these sequences are acquired by intracellular gene transfer (IGT) from the plastid. In addition, frequent events of horizontal gene transfer (HGT) between mitochondria of different species also contribute to their expanded genomes. In contrast, alien sequences are rarely found in plastid genomes. Most of the plant-to-plant HGT events involve mitochondrion-to-mitochondrion transfers. Occasionally, foreign sequences in mtDNAs are plastid-derived (MTPT), raising questions about their origin, frequency, and mechanism of transfer. The rising number of complete mtDNAs allowed us to address these questions. We identified 15 new foreign MTPTs, increasing significantly the number of those previously reported. One out of five of the angiosperm species analyzed contained at least one foreign MTPT, suggesting a remarkable frequency of HGT among plants. By analyzing the flanking regions of the foreign MTPTs, we found strong evidence for mt-to-mt transfers in 65% of the cases. We hypothesize that plastid sequences were initially acquired by the native mtDNA via IGT and then transferred to a distantly-related plant via mitochondrial HGT, rather than directly from a foreign plastid to the mitochondrial genome. Finally, we describe three novel putative cases of mitochondrial-derived sequences among angiosperm plastomes. PMID:28262720
Ancient DNA from marine mammals: studying long-lived species over ecological and evolutionary timescales.

PubMed

Foote, Andrew D; Hofreiter, Michael; Morin, Phillip A

2012-01-20

Marine mammals have long generation times and broad, difficult to sample distributions, which makes inferring evolutionary and demographic changes using field studies of extant populations challenging. However, molecular analyses from sub-fossil or historical materials of marine mammals such as bone, tooth, baleen, skin, fur, whiskers and scrimshaw using ancient DNA (aDNA) approaches provide an opportunity for investigating such changes over evolutionary and ecological timescales. Here, we review the application of aDNA techniques to the study of marine mammals. Most of the studies have focused on detecting changes in genetic diversity following periods of exploitation and environmental change. To date, these studies have shown that even small sample sizes can provide useful information on historical genetic diversity. Ancient DNA has also been used in investigations of changes in distribution and range of marine mammal species; we review these studies and discuss the limitations of such 'presence only' studies. Combining aDNA data with stable isotopes can provide further insights into changes in ecology and we review past studies and suggest future potential applications. We also discuss studies reconstructing inter- and intra-specific phylogenies from aDNA sequences and discuss how aDNA sequences could be used to estimate mutation rates. Finally, we highlight some of the problems of aDNA studies on marine mammals, such as obtaining sufficient sample sizes and calibrating for the marine reservoir effect when radiocarbon-dating such wide-ranging species. Copyright © 2011 Elsevier GmbH. All rights reserved.
Use of amplicon sequencing to improve sensitivity in PCR-based detection of microbial pathogen in environmental samples.

PubMed

Saingam, Prakit; Li, Bo; Yan, Tao

2018-06-01

DNA-based molecular detection of microbial pathogens in complex environments is still plagued by sensitivity, specificity and robustness issues. We propose to address these issues by viewing them as inadvertent consequences of requiring specific and adequate amplification (SAA) of target DNA molecules by current PCR methods. Using the invA gene of Salmonella as the model system, we investigated if next generation sequencing (NGS) can be used to directly detect target sequences in false-negative PCR reaction (PCR-NGS) in order to remove the SAA requirement from PCR. False-negative PCR and qPCR reactions were first created using serial dilutions of laboratory-prepared Salmonella genomic DNA and then analyzed directly by NGS. Target invA sequences were detected in all false-negative PCR and qPCR reactions, which lowered the method detection limits near the theoretical minimum of single gene copy detection. The capability of the PCR-NGS approach in correcting false negativity was further tested and confirmed under more environmentally relevant conditions using Salmonella-spiked stream water and sediment samples. Finally, the PCR-NGS approach was applied to ten urban stream water samples and detected invA sequences in eight samples that would be otherwise deemed Salmonella negative. Analysis of the non-target sequences in the false-negative reactions helped to identify primer dime-like short sequences as the main cause of the false negativity. Together, the results demonstrated that the PCR-NGS approach can significantly improve method sensitivity, correct false-negative detections, and enable sequence-based analysis for failure diagnostics in complex environmental samples. Copyright © 2018 Elsevier B.V. All rights reserved.
Molecular simulations of assembly of functionalized spherical nanoparticles

NASA Astrophysics Data System (ADS)

Seifpour, Arezou

Precise assembly of nanoparticles is crucial for creating spatially engineered materials that can be used for photonics, photovoltaic, and metamaterials applications. One way to control nanoparticle assembly is by functionalizing the nanoparticle with ligands, such as polymers, DNA, and proteins, that can manipulate the interactions between the nanoparticles in the medium the particles are placed in. This thesis research aims to design ligands to provide a new route to the programmable assembly of nanoparticles. We first investigate using Monte Carlo simulation the effect of copolymer ligands on nanoparticle assembly. We first study a single nanoparticle grafted with many copolymer chains to understand how monomer sequence (e.g. alternating ABAB, or diblock AxBx) and chemistry of the copolymers affect the grafted chain conformation at various particle diameters, grafting densities, copolymer chain lengths, and monomer-monomer interactions in an implicit small molecule solvent. We find that the size of the grafted chain varies non-monotonically with increasing blockiness of the monomer sequence for a small particle diameter. From this first study, we selected the two sequences with the most different chain conformations---alternating and diblock---and studied the effect of the sequence and a range of monomer chemistries of the copolymer on the characteristics of assembly of multiple copolymer-functionalized nanoparticles. We find that the alternating sequence produces nanoclusters that are relatively isotropic, whereas diblock sequence tends to form anisotropic structures that are smaller and more compact when the block closer to the surface is attractive and larger loosely held together clusters when the outer block is attractive. Next, we conduct molecular dynamics simulations to study the effect of DNA ligands on nanoparticle assembly. Specifically we investigate the effect of grafted DNA strand composition (e.g. G/C content, placement and sequence) and bidispersity in DNA strand lengths on the thermodynamics and structure of assembly of functionalized nanoparticles. We find that higher G/C content increases cluster dissociation temperature for smaller particles. Placement of G/C block inward along the strand decreases number of neighbors within the assembled cluster. Finally, increased bidispersity in DNA strand lengths leads a distribution of inter-particle distances in the assembled cluster.
LINE-1 Elements in Structural Variation and Disease

PubMed Central

Beck, Christine R.; Garcia-Perez, José Luis; Badge, Richard M.; Moran, John V.

2014-01-01

The completion of the human genome reference sequence ushered in a new era for the study and discovery of human transposable elements. It now is undeniable that transposable elements, historically dismissed as junk DNA, have had an instrumental role in sculpting the structure and function of our genomes. In particular, long interspersed element-1 (LINE-1 or L1) and short interspersed elements (SINEs) continue to affect our genome, and their movement can lead to sporadic cases of disease. Here, we briefly review the types of transposable elements present in the human genome and their mechanisms of mobility. We next highlight how advances in DNA sequencing and genomic technologies have enabled the discovery of novel retrotransposons in individual genomes. Finally, we discuss how L1-mediated retrotransposition events impact human genomes. PMID:21801021
Problems with DNA

ERIC Educational Resources Information Center

Erickson, Keith A.; Franciszkowicz, Marc J.

2010-01-01

A modified version of this project was used during the final seven days of a year-long calculus sequence at the United States Military Academy to introduce students to the nature of integrative learning. Students from different majors were brought together in groups and spent the first few days going over the mathematics material presented here.…
Sequence and Structure Dependent DNA-DNA Interactions

NASA Astrophysics Data System (ADS)

Kopchick, Benjamin; Qiu, Xiangyun

Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and ``random'' DNA sequences are typically used in biophysical studies. However, ~50% of human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate the causalities in terms of the differences in hydration shell and DNA surface structures.
First case of Plasmodium knowlesi infection in a Japanese traveller returning from Malaysia.

PubMed

Tanizaki, Ryutaro; Ujiie, Mugen; Kato, Yasuyuki; Iwagami, Moritoshi; Hashimoto, Aki; Kutsuna, Satoshi; Takeshita, Nozomi; Hayakawa, Kyoko; Kanagawa, Shuzo; Kano, Shigeyuki; Ohmagari, Norio

2013-04-15

This is the first case of Plasmodium knowlesi infection in a Japanese traveller returning from Malaysia. In September 2012, a previously healthy 35-year-old Japanese man presented to National Center for Global Health and Medicine in Tokyo with a two-day history of daily fever, mild headaches and mild arthralgia. Malaria parasites were found in the Giemsa-stained thin blood smear, which showed band forms similar to Plasmodium malariae. Although a nested PCR showed the amplification of the primer of Plasmodium vivax and Plasmodium knowlesi, he was finally diagnosed with P. knowlesi mono-infection by DNA sequencing. He was treated with mefloquine, and recovered without any complications. DNA sequencing of the PCR products is indispensable to confirm P. knowlesi infection, however there is limited access to DNA sequencing procedures in endemic areas. The extent of P. knowlesi transmission in Asia has not been clearly defined. There is limited availability of diagnostic tests and routine surveillance system for reporting an accurate diagnosis in the Asian endemic regions. Thus, reporting accurately diagnosed cases of P. knowlesi infection in travellers would be important for assessing the true nature of this emerging human infection.
Conformational and mechanical changes of DNA upon transcription factor binding detected by a QCM and transmission line model.

PubMed

de-Carvalho, Jorge; Rodrigues, Rogério M M; Tomé, Brigitte; Henriques, Sílvia F; Mira, Nuno P; Sá-Correia, Isabel; Ferreira, Guilherme N M

2014-04-21

A novel quartz crystal microbalance (QCM) analytical method is developed based on the transmission line model (TLM) algorithm to analyze the binding of transcription factors (TFs) to immobilized DNA oligoduplexes. The method is used to characterize the mechanical properties of biological films through the estimation of the film dynamic shear moduli, G and G, and the film thickness. Using the Saccharomyces cerevisiae transcription factor Haa1 (Haa1DBD) as a biological model two sensors were prepared by immobilizing DNA oligoduplexes, one containing the Haa1 recognition element (HRE(wt)) and another with a random sequence (HRE(neg)) used as a negative control. The immobilization of DNA oligoduplexes was followed in real time and we show that DNA strands initially adsorb with low or non-tilting, laying flat close to the surface, which then lift-off the surface leading to final film tilting angles of 62.9° and 46.7° for HRE(wt) and HRE(neg), respectively. Furthermore we show that the binding of Haa1DBD to HRE(wt) leads to a more ordered and compact film, and forces a 31.7° bending of the immobilized HRE(wt) oligoduplex. This work demonstrates the suitability of the QCM to monitor the specific binding of TFs to immobilized DNA sequences and provides an analytical methodology to study protein-DNA biophysics and kinetics.
Theory and applications of the polymerase chain reaction.

PubMed

Remick, D G; Kunkel, S L; Holbrook, E A; Hanson, C A

1990-04-01

The polymerase chain reaction (PCR) is a newly developed molecular biology technique that can significantly amplify DNA or RNA. The process consists of repetitive cycles of specific DNA synthesis, defined by short stretches of preselected DNA. With each cycle, there is a doubling of the final, desired DNA product such that a million-fold amplification is possible. This powerful method has numerous applications in diagnostic pathology, especially in the fields of microbiology, forensic science, and hematology. The PCR may be used to directly detect viral DNA, which may facilitate the diagnosis of acquired immune deficiency syndrome (AIDS) or other viral diseases. PCR amplification of DNA allows detection of specific sequences from extremely small samples, such as with forensic material. In hematology, PCR may help in the diagnosis of hemoglobinopathies or of neoplastic disorders by documenting chromosomal translocations. The new PCR opens exciting new avenues for diagnostic pathology using this new technology.
Multiplex electrochemical DNA platform for femtomolar-level quantification of genetically modified soybean.

PubMed

Manzanares-Palenzuela, C Lorena; de-Los-Santos-Álvarez, Noemí; Lobo-Castañón, María Jesús; López-Ruiz, Beatriz

2015-06-15

Current EU regulations on the mandatory labeling of genetically modified organisms (GMOs) with a minimum content of 0.9% would benefit from the availability of reliable and rapid methods to detect and quantify DNA sequences specific for GMOs. Different genosensors have been developed to this aim, mainly intended for GMO screening. A remaining challenge, however, is the development of genosensing platforms for GMO quantification, which should be expressed as the number of event-specific DNA sequences per taxon-specific sequences. Here we report a simple and sensitive multiplexed electrochemical approach for the quantification of Roundup-Ready Soybean (RRS). Two DNA sequences, taxon (lectin) and event-specific (RR), are targeted via hybridization onto magnetic beads. Both sequences are simultaneously detected by performing the immobilization, hybridization and labeling steps in a single tube and parallel electrochemical readout. Hybridization is performed in a sandwich format using signaling probes labeled with fluorescein isothiocyanate (FITC) or digoxigenin (Dig), followed by dual enzymatic labeling using Fab fragments of anti-Dig and anti-FITC conjugated to peroxidase or alkaline phosphatase, respectively. Electrochemical measurement of the enzyme activity is finally performed on screen-printed carbon electrodes. The assay gave a linear range of 2-250 pM for both targets, with LOD values of 650 fM (160 amol) and 190 fM (50 amol) for the event-specific and the taxon-specific targets, respectively. Results indicate that the method could be applied for GMO quantification below the European labeling threshold level (0.9%), offering a general approach for the rapid quantification of specific GMO events in foods. Copyright © 2015 Elsevier B.V. All rights reserved.
Functionalization of optical nanotip arrays with an electrochemical microcantilever for multiplexed DNA detection.

PubMed

Descamps, Emeline; Duroure, Nathalie; Deiss, Frédérique; Leichlé, Thierry; Adam, Catherine; Mailley, Pascal; Aït-Ikhlef, Ali; Livache, Thierry; Nicu, Liviu; Sojic, Neso

2013-08-07

Optical nanotip arrays fabricated on etched fiber bundles were functionalized with DNA spots. Such unconventional substrates (3D and non-planar) are difficult to pattern with standard microfabrication techniques but, using an electrochemical cantilever, up to 400 spots were electrodeposited on the nanostructured optical surface in 5 min. This approach allows each spot to be addressed individually and multiplexed fluorescence detection is demonstrated. Finally, remote fluorescence detection was performed by imaging through the optical fiber bundle itself after hybridisation with the complementary sequence.
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences

PubMed Central

Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.

2017-01-01

An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Analytical Framework for Identifying and Differentiating Recent Hitchhiking and Severe Bottleneck Effects from Multi-Locus DNA Sequence Data

DOE PAGES

Sargsyan, Ori

2012-05-25

Hitchhiking and severe bottleneck effects have impact on the dynamics of genetic diversity of a population by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identification and differentiation of the signatures of such events from DNA sequence data at a single locus is challenging. This study develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event is modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction withmore » constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes for inferring evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50000 or greater in contrast to 10000, and the estimates of the recent homogenization events are agree with the “Out of Africa” hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversions. The estimates are contrasted with other estimates derived as the mid-time point between the last HIV-negative and first HIV-positive screening tests. Finally, the results show that significant discrepancies can exist between the estimates.« less
An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences undergone rearrangements as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, and thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Authentication of Herbal Supplements Using Next-Generation Sequencing

PubMed Central

Braukmann, Thomas W. A.; Borisenko, Alex V.; Zakharov, Evgeny V.

2016-01-01

Background DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven–by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an interdisciplinary approach taking into account the processes involved in production of herbal supplements, as well as biocomplexity of plant-plant and plant-fungal biological interactions. PMID:27227830
Authentication of Herbal Supplements Using Next-Generation Sequencing.

PubMed

Ivanova, Natalia V; Kuzmina, Maria L; Braukmann, Thomas W A; Borisenko, Alex V; Zakharov, Evgeny V

2016-01-01

DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven-by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an interdisciplinary approach taking into account the processes involved in production of herbal supplements, as well as biocomplexity of plant-plant and plant-fungal biological interactions.

Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

PubMed Central

Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu

2007-01-01

Background Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061
Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

PubMed

Yin, Changchuan

2015-04-01

To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulation annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
Single-cell genomic sequencing using Multiple Displacement Amplification.

PubMed

Lasken, Roger S

2007-10-01

Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Laser mass spectrometry for DNA fingerprinting for forensic applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, C.H.; Tang, K.; Taranenko, N.I.

The application of DNA fingerprinting has become very broad in forensic analysis, patient identification, diagnostic medicine, and wildlife poaching, since every individual`s DNA structure is identical within all tissues of their body. DNA fingerprinting was initiated by the use of restriction fragment length polymorphisms (RFLP). In 1987, Nakamura et al. found that a variable number of tandem repeats (VNTR) often occurred in the alleles. The probability of different individuals having the same number of tandem repeats in several different alleles is very low. Thus, the identification of VNTR from genomic DNA became a very reliable method for identification of individuals.more » DNA fingerprinting is a reliable tool for forensic analysis. In DNA fingerprinting, knowledge of the sequence of tandem repeats and restriction endonuclease sites can provide the basis for identification. The major steps for conventional DNA fingerprinting include (1) specimen processing (2) amplification of selected DNA segments by PCR, and (3) gel electrophoresis to do the final DNA analysis. In this work we propose to use laser desorption mass spectrometry for fast DNA fingerprinting. The process and advantages are discussed.« less
Freshly excavated fossil bones are best for amplification of ancient DNA.

PubMed

Pruvost, Mélanie; Schwarz, Reinhard; Correia, Virginia Bessa; Champlot, Sophie; Braguier, Séverine; Morel, Nicolas; Fernandez-Jalvo, Yolanda; Grange, Thierry; Geigl, Eva-Maria

2007-01-16

Despite the enormous potential of analyses of ancient DNA for phylogeographic studies of past populations, the impact these analyses, most of which are performed with fossil samples from natural history museum collections, has been limited to some extent by the inefficient recovery of ancient genetic material. Here we show that the standard storage conditions and/or treatments of fossil bones in these collections can be detrimental to DNA survival. Using a quantitative paleogenetic analysis of 247 herbivore fossil bones up to 50,000 years old and originating from 60 different archeological and paleontological contexts, we demonstrate that freshly excavated and nontreated unwashed bones contain six times more DNA and yield twice as many authentic DNA sequences as bones treated with standard procedures. This effect was even more pronounced with bones from one Neolithic site, where only freshly excavated bones yielded results. Finally, we compared the DNA content in the fossil bones of one animal, a approximately 3,200-year-old aurochs, excavated in two separate seasons 57 years apart. Whereas the washed museum-stored fossil bones did not permit any DNA amplification, all recently excavated bones yielded authentic aurochs sequences. We established that during the 57 years when the aurochs bones were stored in a collection, at least as much amplifiable DNA was lost as during the previous 3,200 years of burial. This result calls for a revision of the postexcavation treatment of fossil bones to better preserve the genetic heritage of past life forms.
Freshly excavated fossil bones are best for amplification of ancient DNA

PubMed Central

Pruvost, Mélanie; Schwarz, Reinhard; Correia, Virginia Bessa; Champlot, Sophie; Braguier, Séverine; Morel, Nicolas; Fernandez-Jalvo, Yolanda; Grange, Thierry; Geigl, Eva-Maria

2007-01-01

Despite the enormous potential of analyses of ancient DNA for phylogeographic studies of past populations, the impact these analyses, most of which are performed with fossil samples from natural history museum collections, has been limited to some extent by the inefficient recovery of ancient genetic material. Here we show that the standard storage conditions and/or treatments of fossil bones in these collections can be detrimental to DNA survival. Using a quantitative paleogenetic analysis of 247 herbivore fossil bones up to 50,000 years old and originating from 60 different archeological and paleontological contexts, we demonstrate that freshly excavated and nontreated unwashed bones contain six times more DNA and yield twice as many authentic DNA sequences as bones treated with standard procedures. This effect was even more pronounced with bones from one Neolithic site, where only freshly excavated bones yielded results. Finally, we compared the DNA content in the fossil bones of one animal, a ≈3,200-year-old aurochs, excavated in two separate seasons 57 years apart. Whereas the washed museum-stored fossil bones did not permit any DNA amplification, all recently excavated bones yielded authentic aurochs sequences. We established that during the 57 years when the aurochs bones were stored in a collection, at least as much amplifiable DNA was lost as during the previous 3,200 years of burial. This result calls for a revision of the postexcavation treatment of fossil bones to better preserve the genetic heritage of past life forms. PMID:17210911
Assessment of DNA degradation induced by thermal and UV radiation processing: implications for quantification of genetically modified organisms.

PubMed

Ballari, Rajashekhar V; Martin, Asha

2013-12-01

DNA quality is an important parameter for the detection and quantification of genetically modified organisms (GMO's) using the polymerase chain reaction (PCR). Food processing leads to degradation of DNA, which may impair GMO detection and quantification. This study evaluated the effect of various processing treatments such as heating, baking, microwaving, autoclaving and ultraviolet (UV) irradiation on the relative transgenic content of MON 810 maize using pRSETMON-02, a dual target plasmid as a model system. Amongst all the processing treatments examined, autoclaving and UV irradiation resulted in the least recovery of the transgenic (CaMV 35S promoter) and taxon-specific (zein) target DNA sequences. Although a profound impact on DNA degradation was seen during the processing, DNA could still be reliably quantified by Real-time PCR. The measured mean DNA copy number ratios of the processed samples were in agreement with the expected values. Our study confirms the premise that the final analytical value assigned to a particular sample is independent of the degree of DNA degradation since the transgenic and the taxon-specific target sequences possessing approximately similar lengths degrade in parallel. The results of our study demonstrate that food processing does not alter the relative quantification of the transgenic content provided the quantitative assays target shorter amplicons and the difference in the amplicon size between the transgenic and taxon-specific genes is minimal. Copyright © 2013 Elsevier Ltd. All rights reserved.
XX/XY System of Sex Determination in the Geophilomorph Centipede Strigamia maritima

PubMed Central

Green, Jack E.; Dalíková, Martina; Sahara, Ken; Marec, František; Akam, Michael

2016-01-01

We show that the geophilomorph centipede Strigamia maritima possesses an XX/XY system of sex chromosomes, with males being the heterogametic sex. This is, to our knowledge, the first report of sex chromosomes in any geophilomorph centipede. Using the recently assembled Strigamia genome sequence, we identified a set of scaffolds differentially represented in male and female DNA sequence. Using quantitative real-time PCR, we confirmed that three candidate X chromosome-derived scaffolds are present at approximately twice the copy number in females as in males. Furthermore, we confirmed that six candidate Y chromosome-derived scaffolds contain male-specific sequences. Finally, using this molecular information, we designed an X chromosome-specific DNA probe and performed fluorescent in situ hybridization against mitotic and meiotic chromosome spreads to identify the Strigamia XY sex-chromosome pair cytologically. We found that the X and Y chromosomes are recognizably different in size during the early pachytene stage of meiosis, and exhibit incomplete and delayed pairing. PMID:26919730
Analysis of protein-coding genetic variation in 60,706 humans.

PubMed

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G

2016-08-18

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
Pyrrole-Imidazole Polyamides: Manual Solid-Phase Synthesis.

PubMed

Pauff, Steven M; Fallows, Andrew J; Mackay, Simon P; Su, Wu; Cullis, Paul M; Burley, Glenn A

2015-12-01

Pyrrole-imidazole polyamides (PAs) are a family of DNA-binding peptides that bind in the minor groove of double-stranded DNA (dsDNA) in a sequence-selective, programmable fashion. This protocol describes a detailed manual procedure for the solid-phase synthesis of this family of compounds. The protocol entails solution-phase synthesis of the Boc-protected pyrrole (Py) and imidazole (Im) carboxylic acid building blocks. This unit also describes the importance of choosing the appropriate condensing agent to form the amide linkages between each building block. Finally, a monomeric coupling protocol and a fragment-based approach are described that delivers PAs in 13% to 30% yield in 8 days. Copyright © 2015 John Wiley & Sons, Inc.
Product analysis illuminates the final steps of IES deletion in Tetrahymena thermophila

PubMed Central

Saveliev, Sergei V.; Cox, Michael M.

2001-01-01

DNA sequences (IES elements) eliminated from the developing macronucleus in the ciliate Tetrahymena thermophila are released as linear fragments, which have now been detected and isolated. A PCR-mediated examination of fragment end structures reveals three types of strand scission events, reflecting three steps in the deletion process. New evidence is provided for two steps proposed previously: an initiating double-stranded cleavage, and strand transfer to create a branched deletion intermediate. The fragment ends provide evidence for a previously uncharacterized third step: the branched DNA strand is cleaved at one of several defined sites located within 15–16 nucleotides of the IES boundary, liberating the deleted DNA in a linear form. PMID:11406601
Product analysis illuminates the final steps of IES deletion in Tetrahymena thermophila.

PubMed

Saveliev, S V; Cox, M M

2001-06-15

DNA sequences (IES elements) eliminated from the developing macronucleus in the ciliate Tetrahymena thermophila are released as linear fragments, which have now been detected and isolated. A PCR-mediated examination of fragment end structures reveals three types of strand scission events, reflecting three steps in the deletion process. New evidence is provided for two steps proposed previously: an initiating double-stranded cleavage, and strand transfer to create a branched deletion intermediate. The fragment ends provide evidence for a previously uncharacterized third step: the branched DNA strand is cleaved at one of several defined sites located within 15-16 nucleotides of the IES boundary, liberating the deleted DNA in a linear form.
Rapid identification of fungal pathogens in BacT/ALERT, BACTEC, and BBL MGIT media using polymerase chain reaction and DNA sequencing of the internal transcribed spacer regions.

PubMed

Pryce, Todd M; Palladino, Silvano; Price, Diane M; Gardam, Dianne J; Campbell, Peter B; Christiansen, Keryn J; Murray, Ronan J

2006-04-01

We report a direct polymerase chain reaction/sequence (d-PCRS)-based method for the rapid identification of clinically significant fungi from 5 different types of commercial broth enrichment media inoculated with clinical specimens. Media including BacT/ALERT FA (BioMérieux, Marcy l'Etoile, France) (n = 87), BACTEC Plus Aerobic/F (Becton Dickinson, Microbiology Systems, Sparks, MD) (n = 16), BACTEC Peds Plus/F (Becton Dickinson) (n = 15), BACTEC Lytic/10 Anaerobic/F (Becton Dickinson) (n = 11) bottles, and BBL MGIT (Becton Dickinson) (n = 11) were inoculated with specimens from 138 patients. A universal DNA extraction method was used combining a novel pretreatment step to remove PCR inhibitors with a column-based DNA extraction kit. Target sequences in the noncoding internal transcribed spacer regions of the rRNA gene were amplified by PCR and sequenced using a rapid (24 h) automated capillary electrophoresis system. Using sequence alignment software, fungi were identified by sequence similarity with sequences derived from isolates identified by upper-level reference laboratories or isolates defined as ex-type strains. We identified Candida albicans (n = 14), Candida parapsilosis (n = 8), Candida glabrata (n = 7), Candida krusei (n = 2), Scedosporium prolificans (n = 4), and 1 each of Candida orthopsilosis, Candida dubliniensis, Candida kefyr, Candida tropicalis, Candida guilliermondii, Saccharomyces cerevisiae, Cryptococcus neoformans, Aspergillus fumigatus, Histoplasma capsulatum, and Malassezia pachydermatis by d-PCRS analysis. All d-PCRS identifications from positive broths were in agreement with the final species identification of the isolates grown from subculture. Earlier identification of fungi using d-PCRS may facilitate prompt and more appropriate antifungal therapy.
Sequence variations in RepMP2/3 and RepMP4 elements reveal intragenomic homologous DNA recombination events in Mycoplasma pneumoniae.

PubMed

Spuesens, Emiel B M; Oduber, Minoushka; Hoogenboezem, Theo; Sluijter, Marcel; Hartwig, Nico G; van Rossum, Annemarie M C; Vink, Cornelis

2009-07-01

The gene encoding major adhesin protein P1 of Mycoplasma pneumoniae, MPN141, contains two DNA sequence stretches, designated RepMP2/3 and RepMP4, which display variation among strains. This variation allows strains to be differentiated into two major P1 genotypes (1 and 2) and several variants. Interestingly, multiple versions of the RepMP2/3 and RepMP4 elements exist at other sites within the bacterial genome. Because these versions are closely related in sequence, but not identical, it has been hypothesized that they have the capacity to recombine with their counterparts within MPN141, and thereby serve as a source of sequence variation of the P1 protein. In order to determine the variation within the RepMP2/3 and RepMP4 elements, both within the bacterial genome and among strains, we analysed the DNA sequences of all RepMP2/3 and RepMP4 elements within the genomes of 23 M. pneumoniae strains. Our data demonstrate that: (i) recombination is likely to have occurred between two RepMP2/3 elements in four of the strains, and (ii) all previously described P1 genotypes can be explained by inter-RepMP recombination events. Moreover, the difference between the two major P1 genotypes was reflected in all RepMP elements, such that subtype 1 and 2 strains can be differentiated on the basis of sequence variation in each RepMP element. This implies that subtype 1 and subtype 2 strains represent evolutionarily diverged strain lineages. Finally, a classification scheme is proposed in which the P1 genotype of M. pneumoniae isolates can be described in a sequence-based, universal fashion.
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus

PubMed Central

Shoyab, M.; Baluda, M. A.; Evans, R.

1974-01-01

DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Accuracy of the high-throughput amplicon sequencing to identify species within the genus Aspergillus.

PubMed

Lee, Seungeun; Yamamoto, Naomichi

2015-12-01

This study characterized the accuracy of high-throughput amplicon sequencing to identify species within the genus Aspergillus. To this end, we sequenced the internal transcribed spacer 1 (ITS1), β-tubulin (BenA), and calmodulin (CaM) gene encoding sequences as DNA markers from eight reference Aspergillus strains with known identities using 300-bp sequencing on the Illumina MiSeq platform, and compared them with the BLASTn outputs. The identifications with the sequences longer than 250 bp were accurate at the section rank, with some ambiguities observed at the species rank due to mostly cross detection of sibling species. Additionally, in silico analysis was performed to predict the identification accuracy for all species in the genus Aspergillus, where 107, 210, and 187 species were predicted to be identifiable down to the species rank based on ITS1, BenA, and CaM, respectively. Finally, air filter samples were analysed to quantify the relative abundances of Aspergillus species in outdoor air. The results were reproducible across biological duplicates both at the species and section ranks, but not strongly correlated between ITS1 and BenA, suggesting the Aspergillus detection can be taxonomically biased depending on the selection of the DNA markers and/or primers. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
cuRRBS: simple and robust evaluation of enzyme combinations for reduced representation approaches.

PubMed

Martin-Herranz, Daniel E; Ribeiro, António J M; Krueger, Felix; Thornton, Janet M; Reik, Wolf; Stubbs, Thomas M

2017-11-16

DNA methylation is an important epigenetic modification in many species that is critical for development, and implicated in ageing and many complex diseases, such as cancer. Many cost-effective genome-wide analyses of DNA modifications rely on restriction enzymes capable of digesting genomic DNA at defined sequence motifs. There are hundreds of restriction enzyme families but few are used to date, because no tool is available for the systematic evaluation of restriction enzyme combinations that can enrich for certain sites of interest in a genome. Herein, we present customised Reduced Representation Bisulfite Sequencing (cuRRBS), a novel and easy-to-use computational method that solves this problem. By computing the optimal enzymatic digestions and size selection steps required, cuRRBS generalises the traditional MspI-based Reduced Representation Bisulfite Sequencing (RRBS) protocol to all restriction enzyme combinations. In addition, cuRRBS estimates the fold-reduction in sequencing costs and provides a robustness value for the personalised RRBS protocol, allowing users to tailor the protocol to their experimental needs. Moreover, we show in silico that cuRRBS-defined restriction enzymes consistently out-perform MspI digestion in many biological systems, considering both CpG and CHG contexts. Finally, we have validated the accuracy of cuRRBS predictions for single and double enzyme digestions using two independent experimental datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Effects of a Transposable Element Insertion on Alcohol Dehydrogenase Expression in Drosophila Melanogaster

PubMed Central

Dunn, R. C.; Laurie, C. C.

1995-01-01

Variation in the DNA sequence and level of alcohol dehydrogenase (Adh) gene expression in Drosophila melanogaster have been studied to determine what types of DNA polymorphisms contribute to phenotypic variation in natural populations. The Adh gene, like many others, shows a high level of variability in both DNA sequence and quantitative level of expression. A number of transposable element insertions occur in the Adh region and one of these, a copia insertion in the 5' flanking region, is associated with unusually low Adh expression. To determine whether this insertion (called RI42) causes the low expression level, the insertion was excised from the cloned RI42 Adh gene and the effect was assessed by P-element transformation. Removal of this insertion causes a threefold increase in the level of ADH, clearly showing that it contributes to the naturally occurring variation in expression at this locus. Removal of all but one LTR also causes a threefold increase, indicating that the mechanism is not a simple sequence disruption. Furthermore, this copia insertion, which is located between the two Adh promoters and their upstream enhancer sequences, has differential effects on the levels of proximal and distal transcripts. Finally, a test for the possible modifying effects of two suppressor loci, su(w(a)) and su(f), on this insertional mutation was negative, in contrast to a previous report in the literature. PMID:7498745
The consequences of sequence erosion in the evolution of recombination hotspots.

PubMed

Tiemann-Boege, Irene; Schwarz, Theresa; Striedner, Yasmin; Heissl, Angelika

2017-12-19

Meiosis is initiated by a double-strand break (DSB) introduced in the DNA by a highly controlled process that is repaired by recombination. In many organisms, recombination occurs at specific and narrow regions of the genome, known as recombination hotspots, which overlap with regions enriched for DSBs. In recent years, it has been demonstrated that conversions and mutations resulting from the repair of DSBs lead to a rapid sequence evolution at recombination hotspots eroding target sites for DSBs. We still do not fully understand the effect of this erosion in the recombination activity, but evidence has shown that the binding of trans -acting factors like PRDM9 is affected. PRDM9 is a meiosis-specific, multi-domain protein that recognizes DNA target motifs by its zinc finger domain and directs DSBs to these target sites. Here we discuss the changes in affinity of PRDM9 to eroded recognition sequences, and explain how these changes in affinity of PRDM9 can affect recombination, leading sometimes to sterility in the context of hybrid crosses. We also present experimental data showing that DNA methylation reduces PRDM9 binding in vitro Finally, we discuss PRDM9-independent hotspots, posing the question how these hotspots evolve and change with sequence erosion.This article is part of the themed issue 'Evolutionary causes and consequences of recombination rate variation in sexual organisms'. © 2017 The Authors.

The consequences of sequence erosion in the evolution of recombination hotspots

PubMed Central

Schwarz, Theresa; Heissl, Angelika

2017-01-01

Meiosis is initiated by a double-strand break (DSB) introduced in the DNA by a highly controlled process that is repaired by recombination. In many organisms, recombination occurs at specific and narrow regions of the genome, known as recombination hotspots, which overlap with regions enriched for DSBs. In recent years, it has been demonstrated that conversions and mutations resulting from the repair of DSBs lead to a rapid sequence evolution at recombination hotspots eroding target sites for DSBs. We still do not fully understand the effect of this erosion in the recombination activity, but evidence has shown that the binding of trans-acting factors like PRDM9 is affected. PRDM9 is a meiosis-specific, multi-domain protein that recognizes DNA target motifs by its zinc finger domain and directs DSBs to these target sites. Here we discuss the changes in affinity of PRDM9 to eroded recognition sequences, and explain how these changes in affinity of PRDM9 can affect recombination, leading sometimes to sterility in the context of hybrid crosses. We also present experimental data showing that DNA methylation reduces PRDM9 binding in vitro. Finally, we discuss PRDM9-independent hotspots, posing the question how these hotspots evolve and change with sequence erosion. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’. PMID:29109225
Lindnera (Pichia) fabianii blood infection after mesenteric ischemia.

PubMed

Gabriel, Frederic; Noel, Thierry; Accoceberry, Isabelle

2012-04-01

Lindnera (Pichia) fabianii (teleomorph of Candida fabianii) is a yeast species rarely involved in human infections. This report describes the first known human case of a Lindnera fabianii blood infection after mesenteric ischemia. The 53-year-old patient was hospitalized in the intensive care unit after a suicide attempt and was suffering from a mesenteric ischemia and acute renal failure. Lindnera fabianii was recovered from an oropharyngeal swab, then isolated from stool and urine samples before the diagnosis of the blood infection. Caspofungin intravenous treatment was associated with a successful outcome. Final unequivocal identification of the strain was done by sequencing the internal transcribed spacer (ITS) region, and regions of 18S rDNA gene and of the translation elongation factor-1α gene. Until our work, the genomic databases did not contain the complete ITS region of L. fabianii as a single nucleotide sequence (encompassing ITS1, the 5.8S rDNA and ITS2), and misidentification with other yeast species, e.g., Lindnera (Pichia) mississippiensis, could have occurred. Our work demonstrates that the usual DNA barcoding method based on sequencing of the ITS region may fail to provide the correct identification of some taxa, and that partial sequencing of the EF1α gene may be much more effective for the accurate delineation and molecular identification of new emerging opportunistic yeast pathogens.
Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

PubMed Central

Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

2006-01-01

Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

DOEpatents

McCutchen-Maloney, Sandra L.

2002-01-01

DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dips sticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such at TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

PubMed

Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

1984-03-26

The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
An alternative method for cDNA cloning from surrogate eukaryotic cells transfected with the corresponding genomic DNA.

PubMed

Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong

2012-07-01

cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.
Process of labeling specific chromosomes using recombinant repetitive DNA

DOEpatents

Moyzis, R.K.; Meyne, J.

1988-02-12

Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

PubMed

Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

2014-11-01

As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A phylogenetic hypothesis for passerine birds: taxonomic and biogeographic implications of an analysis of nuclear DNA sequence data.

PubMed Central

Barker, F Keith; Barrowclough, George F; Groth, Jeff G

2002-01-01

Passerine birds comprise over half of avian diversity, but have proved difficult to classify. Despite a long history of work on this group, no comprehensive hypothesis of passerine family-level relationships was available until recent analyses of DNA-DNA hybridization data. Unfortunately, given the value of such a hypothesis in comparative studies of passerine ecology and behaviour, the DNA-hybridization results have not been well tested using independent data and analytical approaches. Therefore, we analysed nucleotide sequence variation at the nuclear RAG-1 and c-mos genes from 69 passerine taxa, including representatives of most currently recognized families. In contradiction to previous DNA-hybridization studies, our analyses suggest paraphyly of suboscine passerines because the suboscine New Zealand wren Acanthisitta was found to be sister to all other passerines. Additionally, we reconstructed the parvorder Corvida as a basal paraphyletic grade within the oscine passerines. Finally, we found strong evidence that several family-level taxa are misplaced in the hybridization results, including the Alaudidae, Irenidae, and Melanocharitidae. The hypothesis of relationships we present here suggests that the oscine passerines arose on the Australian continental plate while it was isolated by oceanic barriers and that a major northern radiation of oscines (i.e. the parvorder Passerida) originated subsequent to dispersal from the south. PMID:11839199
A phylogenetic hypothesis for passerine birds: taxonomic and biogeographic implications of an analysis of nuclear DNA sequence data.

PubMed

Barker, F Keith; Barrowclough, George F; Groth, Jeff G

2002-02-07

Passerine birds comprise over half of avian diversity, but have proved difficult to classify. Despite a long history of work on this group, no comprehensive hypothesis of passerine family-level relationships was available until recent analyses of DNA-DNA hybridization data. Unfortunately, given the value of such a hypothesis in comparative studies of passerine ecology and behaviour, the DNA-hybridization results have not been well tested using independent data and analytical approaches. Therefore, we analysed nucleotide sequence variation at the nuclear RAG-1 and c-mos genes from 69 passerine taxa, including representatives of most currently recognized families. In contradiction to previous DNA-hybridization studies, our analyses suggest paraphyly of suboscine passerines because the suboscine New Zealand wren Acanthisitta was found to be sister to all other passerines. Additionally, we reconstructed the parvorder Corvida as a basal paraphyletic grade within the oscine passerines. Finally, we found strong evidence that several family-level taxa are misplaced in the hybridization results, including the Alaudidae, Irenidae, and Melanocharitidae. The hypothesis of relationships we present here suggests that the oscine passerines arose on the Australian continental plate while it was isolated by oceanic barriers and that a major northern radiation of oscines (i.e. the parvorder Passerida) originated subsequent to dispersal from the south.
A Robust Framework for Microbial Archaeology

PubMed Central

Warinner, Christina; Herbig, Alexander; Mann, Allison; Yates, James A. Fellows; Weiβ, Clemens L.; Burbano, Hernán A.; Orlando, Ludovic; Krause, Johannes

2017-01-01

Microbial archaeology is flourishing in the era of high-throughput sequencing, revealing the agents behind devastating historical plagues, identifying the cryptic movements of pathogens in prehistory, and reconstructing the ancestral microbiota of humans. Here, we introduce the fundamental concepts and theoretical framework of the discipline, then discuss applied methodologies for pathogen identification and microbiome characterization from archaeological samples. We give special attention to the process of identifying, validating, and authenticating ancient microbes using high-throughput DNA sequencing data. Finally, we outline standards and precautions to guide future research in the field. PMID:28460196
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion

PubMed Central

Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko

2011-01-01

Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
Novel electrochemiluminescence of silver nanoclusters fabricated on triplex DNA scaffolds for label-free detection of biothiols.

PubMed

Feng, Lingyan; Wu, Li; Xing, Feifei; Hu, Lianzhe; Ren, Jinsong; Qu, Xiaogang

2017-12-15

Electrochemiluminescence (ECL) of metal nanoclusters and their application have been widely reported due to the good biocompatibility, fascinating electrocatalytic activity and so on. Using DNA as synthesis template opens new opportunities to modulate the physical properties of AgNCs. Triplex DNA has been reported for the site-specific, homogeneous and highly stable silver nanoclusters (AgNCs) fabrication from our recent research. Here we further explore their extraordinary ECL properties and applications in biosensor utilization. By reasonable design of DNA sequence, AgNCs were obtained in the predefined position of CG.C + sites of triplex DNA, and the ECL emission at a low potential was observed with this novel DNA template. Finally, a simple and label-free method was developed for biothiols detection based on the enhanced catalytic reaction and a robust interaction between the triplex-AgNCs and cysteine, by influencing the microenvironment provided by DNA template. Copyright © 2017 Elsevier B.V. All rights reserved.
Qualitative and quantitative assessment of Illumina's forensic STR and SNP kits on MiSeq FGx™.

PubMed

Sharma, Vishakha; Chow, Hoi Yan; Siegel, Donald; Wurmbach, Elisa

2017-01-01

Massively parallel sequencing (MPS) is a powerful tool transforming DNA analysis in multiple fields ranging from medicine, to environmental science, to evolutionary biology. In forensic applications, MPS offers the ability to significantly increase the discriminatory power of human identification as well as aid in mixture deconvolution. However, before the benefits of any new technology can be employed, a thorough evaluation of its quality, consistency, sensitivity, and specificity must be rigorously evaluated in order to gain a detailed understanding of the technique including sources of error, error rates, and other restrictions/limitations. This extensive study assessed the performance of Illumina's MiSeq FGx MPS system and ForenSeq™ kit in nine experimental runs including 314 reaction samples. In-depth data analysis evaluated the consequences of different assay conditions on test results. Variables included: sample numbers per run, targets per run, DNA input per sample, and replications. Results are presented as heat maps revealing patterns for each locus. Data analysis focused on read numbers (allele coverage), drop-outs, drop-ins, and sequence analysis. The study revealed that loci with high read numbers performed better and resulted in fewer drop-outs and well balanced heterozygous alleles. Several loci were prone to drop-outs which led to falsely typed homozygotes and therefore to genotype errors. Sequence analysis of allele drop-in typically revealed a single nucleotide change (deletion, insertion, or substitution). Analyses of sequences, no template controls, and spurious alleles suggest no contamination during library preparation, pooling, and sequencing, but indicate that sequencing or PCR errors may have occurred due to DNA polymerase infidelities. Finally, we found utilizing Illumina's FGx System at recommended conditions does not guarantee 100% outcomes for all samples tested, including the positive control, and required manual editing due to low read numbers and/or allele drop-in. These findings are important for progressing towards implementation of MPS in forensic DNA testing.
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
DNA methylation analysis of phenotype specific stratified Indian population.

PubMed

Rotti, Harish; Mallya, Sandeep; Kabekkodu, Shama Prasada; Chakrabarty, Sanjiban; Bhale, Sameer; Bharadwaj, Ramachandra; Bhat, Balakrishna K; Dedge, Amrish P; Dhumal, Vikram Ram; Gangadharan, G G; Gopinath, Puthiya M; Govindaraj, Periyasamy; Joshi, Kalpana S; Kondaiah, Paturu; Nair, Sreekumaran; Nair, S N Venugopalan; Nayak, Jayakrishna; Prasanna, B V; Shintre, Pooja; Sule, Mayura; Thangaraj, Kumarasamy; Patwardhan, Bhushan; Valiathan, Marthanda Varma Sankaran; Satyamoorthy, Kapaettu

2015-05-08

DNA methylation and its perturbations are an established attribute to a wide spectrum of phenotypic variations and disease conditions. Indian traditional system practices personalized medicine through indigenous concept of distinctly descriptive physiological, psychological and anatomical features known as prakriti. Here we attempted to establish DNA methylation differences in these three prakriti phenotypes. Following structured and objective measurement of 3416 subjects, whole blood DNA of 147 healthy male individuals belonging to defined prakriti (Vata, Pitta and Kapha) between the age group of 20-30years were subjected to methylated DNA immunoprecipitation (MeDIP) and microarray analysis. After data analysis, prakriti specific signatures were validated through bisulfite DNA sequencing. Differentially methylated regions in CpG islands and shores were significantly enriched in promoters/UTRs and gene body regions. Phenotypes characterized by higher metabolism (Pitta prakriti) in individuals showed distinct promoter (34) and gene body methylation (204), followed by Vata prakriti which correlates to motion showed DNA methylation in 52 promoters and 139 CpG islands and finally individuals with structural attributes (Kapha prakriti) with 23 and 19 promoters and CpG islands respectively. Bisulfite DNA sequencing of prakriti specific multiple CpG sites in promoters and 5'-UTR such as; LHX1 (Vata prakriti), SOX11 (Pitta prakriti) and CDH22 (Kapha prakriti) were validated. Kapha prakriti specific CDH22 5'-UTR CpG methylation was also found to be associated with higher body mass index (BMI). Differential DNA methylation signatures in three distinct prakriti phenotypes demonstrate the epigenetic basis of Indian traditional human classification which may have relevance to personalized medicine.
Yeast Sub1 and human PC4 are G-quadruplex binding proteins that suppress genome instability at co-transcriptionally formed G4 DNA.

PubMed

Lopez, Christopher R; Singh, Shivani; Hambarde, Shashank; Griffin, Wezley C; Gao, Jun; Chib, Shubeena; Yu, Yang; Ira, Grzegorz; Raney, Kevin D; Kim, Nayun

2017-06-02

G-quadruplex or G4 DNA is a non-B secondary DNA structure consisting of a stacked array of guanine-quartets that can disrupt critical cellular functions such as replication and transcription. When sequences that can adopt Non-B structures including G4 DNA are located within actively transcribed genes, the reshaping of DNA topology necessary for transcription process stimulates secondary structure-formation thereby amplifying the potential for genome instability. Using a reporter assay designed to study G4-induced recombination in the context of an actively transcribed locus in Saccharomyces cerevisiae, we tested whether co-transcriptional activator Sub1, recently identified as a G4-binding factor, contributes to genome maintenance at G4-forming sequences. Our data indicate that, upon Sub1-disruption, genome instability linked to co-transcriptionally formed G4 DNA in Top1-deficient cells is significantly augmented and that its highly conserved DNA binding domain or the human homolog PC4 is sufficient to suppress G4-associated genome instability. We also show that Sub1 interacts specifically with co-transcriptionally formed G4 DNA in vivo and that yeast cells become highly sensitivity to G4-stabilizing chemical ligands by the loss of Sub1. Finally, we demonstrate the physical and genetic interaction of Sub1 with the G4-resolving helicase Pif1, suggesting a possible mechanism by which Sub1 suppresses instability at G4 DNA. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ancient DNA analysis identifies marine mollusc shells as new metagenomic archives of the past.

PubMed

Der Sarkissian, Clio; Pichereau, Vianney; Dupont, Catherine; Ilsøe, Peter C; Perrigault, Mickael; Butler, Paul; Chauvaud, Laurent; Eiríksson, Jón; Scourse, James; Paillard, Christine; Orlando, Ludovic

2017-09-01

Marine mollusc shells enclose a wealth of information on coastal organisms and their environment. Their life history traits as well as (palaeo-) environmental conditions, including temperature, food availability, salinity and pollution, can be traced through the analysis of their shell (micro-) structure and biogeochemical composition. Adding to this list, the DNA entrapped in shell carbonate biominerals potentially offers a novel and complementary proxy both for reconstructing palaeoenvironments and tracking mollusc evolutionary trajectories. Here, we assess this potential by applying DNA extraction, high-throughput shotgun DNA sequencing and metagenomic analyses to marine mollusc shells spanning the last ~7,000 years. We report successful DNA extraction from shells, including a variety of ancient specimens, and find that DNA recovery is highly dependent on their biomineral structure, carbonate layer preservation and disease state. We demonstrate positive taxonomic identification of mollusc species using a combination of mitochondrial DNA genomes, barcodes, genome-scale data and metagenomic approaches. We also find shell biominerals to contain a diversity of microbial DNA from the marine environment. Finally, we reconstruct genomic sequences of organisms closely related to the Vibrio tapetis bacteria from Manila clam shells previously diagnosed with Brown Ring Disease. Our results reveal marine mollusc shells as novel genetic archives of the past, which opens new perspectives in ancient DNA research, with the potential to reconstruct the evolutionary history of molluscs, microbial communities and pathogens in the face of environmental changes. Other future applications include conservation of endangered mollusc species and aquaculture management. © 2017 John Wiley & Sons Ltd.
Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

PubMed Central

Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn; Knight, Rob; Cole, James R; Amaral-Zettler, Linda; Gilbert, Jack A; Karsch-Mizrachi, Ilene; Johnston, Anjanette; Cochrane, Guy; Vaughan, Robert; Hunter, Christopher; Park, Joonhong; Morrison, Norman; Rocca-Serra, Philippe; Sterk, Peter; Arumugam, Manimozhiyan; Bailey, Mark; Baumgartner, Laura; Birren, Bruce W; Blaser, Martin J; Bonazzi, Vivien; Booth, Tim; Bork, Peer; Bushman, Frederic D; Buttigieg, Pier Luigi; Chain, Patrick S G; Charlson, Emily; Costello, Elizabeth K; Huot-Creasy, Heather; Dawyndt, Peter; DeSantis, Todd; Fierer, Noah; Fuhrman, Jed A; Gallery, Rachel E; Gevers, Dirk; Gibbs, Richard A; Gil, Inigo San; Gonzalez, Antonio; Gordon, Jeffrey I; Guralnick, Robert; Hankeln, Wolfgang; Highlander, Sarah; Hugenholtz, Philip; Jansson, Janet; Kau, Andrew L; Kelley, Scott T; Kennedy, Jerry; Knights, Dan; Koren, Omry; Kuczynski, Justin; Kyrpides, Nikos; Larsen, Robert; Lauber, Christian L; Legg, Teresa; Ley, Ruth E; Lozupone, Catherine A; Ludwig, Wolfgang; Lyons, Donna; Maguire, Eamonn; Methé, Barbara A; Meyer, Folker; Muegge, Brian; Nakielny, Sara; Nelson, Karen E; Nemergut, Diana; Neufeld, Josh D; Newbold, Lindsay K; Oliver, Anna E; Pace, Norman R; Palanisamy, Giriprakash; Peplies, Jörg; Petrosino, Joseph; Proctor, Lita; Pruesse, Elmar; Quast, Christian; Raes, Jeroen; Ratnasingham, Sujeevan; Ravel, Jacques; Relman, David A; Assunta-Sansone, Susanna; Schloss, Patrick D; Schriml, Lynn; Sinha, Rohini; Smith, Michelle I; Sodergren, Erica; Spor, Aymé; Stombaugh, Jesse; Tiedje, James M; Ward, Doyle V; Weinstock, George M; Wendel, Doug; White, Owen; Whiteley, Andrew; Wilke, Andreas; Wortman, Jennifer R; Yatsunenko, Tanya; Glöckner, Frank Oliver

2012-01-01

Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences—the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The ‘environmental packages’ apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere. PMID:21552244
Influence of DNA sequence on the structure of minicircles under torsional stress

PubMed Central

Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn

2017-01-01

Abstract The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782

Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.

DTIC Science & Technology

1992-05-01

DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Random Mutagenesis, Clonal Events, and Embryonic or Somatic Origin Determine the mtDNA Variant Type and Load in Human Pluripotent Stem Cells.

PubMed

Zambelli, Filippo; Mertens, Joke; Dziedzicka, Dominika; Sterckx, Johan; Markouli, Christina; Keller, Alexander; Tropel, Philippe; Jung, Laura; Viville, Stephane; Van de Velde, Hilde; Geens, Mieke; Seneca, Sara; Sermon, Karen; Spits, Claudia

2018-06-07

In this study, we deep-sequenced the mtDNA of human embryonic and induced pluripotent stem cells (hESCs and hiPSCs) and their source cells and found that the majority of variants pre-existed in the cells used to establish the lines. Early-passage hESCs carried few and low-load heteroplasmic variants, similar to those identified in oocytes and inner cell masses. The number and heteroplasmic loads of these variants increased with prolonged cell culture. The study of 120 individual cells of early- and late-passage hESCs revealed a significant diversity in mtDNA heteroplasmic variants at the single-cell level and that the variants that increase during time in culture are always passenger to the appearance of chromosomal abnormalities. We found that early-passage hiPSCs carry much higher loads of mtDNA variants than hESCs, which single-fibroblast sequencing proved pre-existed in the source cells. Finally, we show that these variants are stably transmitted during short-term differentiation. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
NEIL3 Repairs Telomere Damage during S Phase to Secure Chromosome Segregation at Mitosis.

PubMed

Zhou, Jia; Chan, Jany; Lambelé, Marie; Yusufzai, Timur; Stumpff, Jason; Opresko, Patricia L; Thali, Markus; Wallace, Susan S

2017-08-29

Oxidative damage to telomere DNA compromises telomere integrity. We recently reported that the DNA glycosylase NEIL3 preferentially repairs oxidative lesions in telomere sequences in vitro. Here, we show that loss of NEIL3 causes anaphase DNA bridging because of telomere dysfunction. NEIL3 expression increases during S phase and reaches maximal levels in late S/G2. NEIL3 co-localizes with TRF2 and associates with telomeres during S phase, and this association increases upon oxidative stress. Mechanistic studies reveal that NEIL3 binds to single-stranded DNA via its intrinsically disordered C terminus in a telomere-sequence-independent manner. Moreover, NEIL3 is recruited to telomeres through its interaction with TRF1, and this interaction enhances the enzymatic activity of purified NEIL3. Finally, we show that NEIL3 interacts with AP Endonuclease 1 (APE1) and the long-patch base excision repair proteins PCNA and FEN1. Taken together, we propose that NEIL3 protects genome stability through targeted repair of oxidative damage in telomeres during S/G2 phase. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

1997-05-01

Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possible be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic disease can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Effect of sustained elevated temperature prior to amplification on template copy number estimation using digital polymerase chain reaction.

PubMed

Bhat, Somanath; McLaughlin, Jacob L H; Emslie, Kerry R

2011-02-21

Digital polymerase chain reaction (dPCR) has the potential to enable accurate quantification of target DNA copy number provided that all target DNA molecules are successfully amplified. Following duplex dPCR analysis from a linear DNA target sequence that contains single copies of two independent template sequences, we have observed that amplification of both templates in a single partition does not always occur. To investigate this finding, we heated the target DNA solution to 95 °C for increasing time intervals and then immediately chilled on ice prior to preparing the dPCR mix. We observed an exponential decline in estimated copy number (R(2)≥ 0.98) of the two template sequences when amplified from either a linearized plasmid or a 388 base pair (bp) amplicon containing the same two template sequences. The distribution of amplifiable templates and the final concentration (copies per µL) were both affected by heat treatment of the samples at 95 °C from 0 s to 30 min. The proportion of target sequences from which only one of the two templates was amplified in a single partition (either 1507 or hmg only) increased over time, while the proportion of target sequences where both templates were amplified (1507 and hmg) in each individual partition declined rapidly from 94% to 52% (plasmid) and 88% to 31% (388 bp amplicon) suggesting an increase in number of targets from which both templates no longer amplify. A 10 min incubation at 95 °C reduced the initial amplifiable template concentration of the plasmid and the 388 bp amplicon by 59% and 91%, respectively. To determine if a similar decrease in amplifiable target occurs during the default pre-activation step of typical PCR amplification protocol, we used mastermixes with a 20 s or 10 min hot-start. The choice of mastermix and consequent pre-activation time did not affect the estimated plasmid concentration. Therefore, we conclude that prolonged exposure of this DNA template to elevated temperatures could lead to significant bias in dPCR measurements. However, care must be taken when designing PCR and non-PCR based experiments by reducing exposure of the DNA template to sustained elevated temperatures in order to improve accuracy in copy number estimation and concentration determination.
Micronuclear DNA of Oxytricha nova contains sequences with autonomously replicating activity in Saccharomyces cerevisiae.

PubMed Central

Colombo, M M; Swanton, M T; Donini, P; Prescott, D M

1984-01-01

Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. Images PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

PubMed Central

Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob

2014-01-01

As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
DNA Scrunching in the Packaging of Viral Genomes.

PubMed

Waters, James T; Kim, Harold D; Gumbart, James C; Lu, Xiang-Jun; Harvey, Stephen C

2016-07-07

The motors that drive double-stranded DNA (dsDNA) genomes into viral capsids are among the strongest of all biological motors for which forces have been measured, but it is not known how they generate force. We previously proposed that the DNA is not a passive substrate but that it plays an active role in force generation. This "scrunchworm hypothesis" holds that the motor proteins repeatedly dehydrate and rehydrate the DNA, which then undergoes cyclic shortening and lengthening motions. These are captured by a coupled protein-DNA grip-and-release cycle to rectify the motion and translocate the DNA into the capsid. In this study, we examined the interactions of dsDNA with the dodecameric connector protein of bacteriophage ϕ29, using molecular dynamics simulations on four different DNA sequences, starting from two different conformations (A-DNA and B-DNA). In all four simulations starting with the protein equilibrated with A-DNA in the channel, we observed transitions to a common, metastable, highly scrunched conformation, designated A*. This conformation is very similar to one recently reported by Kumar and Grubmüller in much longer MD simulations on B-DNA docked into the ϕ29 connector. These results are significant for four reasons. First, the scrunched conformations occur spontaneously, without requiring lever-like protein motions often believed to be necessary for DNA translocation. Second, the transition takes place within the connector, providing the location of the putative "dehydrator". Third, the protein has more contacts with one strand of the DNA than with the other; the former was identified in single-molecule laser tweezer experiments as the "load-bearing strand". Finally, the spontaneity of the DNA-protein interaction suggests that it may play a role in the initial docking of DNA in motors like that of T4 that can load and package any sequence.
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity

PubMed Central

Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.

2009-01-01

We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Development of a Stable Cell Line, Overexpressing Human T-cell Immunoglobulin Mucin 1

PubMed Central

Ebrahimi, Mina; Kazemi, Tohid; Ganjalikhani-hakemi, Mazdak; Majidi, Jafar; khanahmad, Hossein; Rahimmanesh, Ilnaz; Homayouni, Vida; Kohpayeh, Shirin

2015-01-01

Background Recent researches have demonstrated that human T-cell immunoglobulin mucin 1 (TIM-1) glycoprotein plays important roles in regulation of autoimmune and allergic diseases, as well as in tumor immunity and response to viral infections. Therefore, targeting TIM-1 could be a potential therapeutic approach against such diseases. Objectives In this study, we aimed to express TIM-1 protein on Human Embryonic kidney (HEK) 293T cell line in order to have an available source of the TIM-1 antigen. Materials and Methods The cDNA was synthesized after RNA extraction from peripheral blood mononuclear cells (PBMC) and TIM-1 cDNA was amplified by PCR with specific primers. The PCR product was cloned in pcDNA™3.1/Hygro (+) and transformed in Escherichia coli TOP 10 F’. After cloning, authenticity of DNA sequence was checked and expressed in HEK 293T cells. Finally, expression of TIM-1 was analyzed by flow cytometry and real-time PCR. Results The result of DNA sequencing demonstrated correctness of TIM-1 DNA sequence. The flow cytometry results indicated that TIM-1 was expressed in about 90% of transfected HEK 293T cells. The real-time PCR analysis showed TIM-1 mRNA expression increased 195-fold in transfected cells compared with un-transfected cells. Conclusions Findings of present study demonstrated the successful cloning and expression of TIM-1 on HEK 293T cells. These cells could be used as an immunogenic source for production of specific monoclonal antibodies, nanobodies and aptamers against human TIM-1. PMID:28959306
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity.

PubMed

Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L

2008-03-01

We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
Divergent nuclear 18S rDNA paralogs in a turkey coccidium, Eimeria meleagrimitis, complicate molecular systematics and identification.

PubMed

El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R

2013-07-01

Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
Affordable hands-on DNA sequencing and genotyping: an exercise for teaching DNA analysis to undergraduates.

PubMed

Shah, Kushani; Thomas, Shelby; Stein, Arnold

2013-01-01

In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs178229931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
The mitochondrial genome of the pathogenic yeast Candida subhashii: GC-rich linear DNA with a protein covalently attached to the 5′ termini

PubMed Central

Fricova, Dominika; Valach, Matus; Farkas, Zoltan; Pfeiffer, Ilona; Kucsera, Judit; Tomaska, Lubomir; Nosek, Jozef

2010-01-01

As a part of our initiative aimed at a large-scale comparative analysis of fungal mitochondrial genomes, we determined the complete DNA sequence of the mitochondrial genome of the yeast Candida subhashii and found that it exhibits a number of peculiar features. First, the mitochondrial genome is represented by linear dsDNA molecules of uniform length (29 795 bp), with an unusually high content of guanine and cytosine residues (52.7 %). Second, the coding sequences lack introns; thus, the genome has a relatively compact organization. Third, the termini of the linear molecules consist of long inverted repeats and seem to contain a protein covalently bound to terminal nucleotides at the 5′ ends. This architecture resembles the telomeres in a number of linear viral and plasmid DNA genomes classified as invertrons, in which the terminal proteins serve as specific primers for the initiation of DNA synthesis. Finally, although the mitochondrial genome of C. subhashii contains essentially the same set of genes as other closely related pathogenic Candida species, we identified additional ORFs encoding two homologues of the family B protein-priming DNA polymerases and an unknown protein. The terminal structures and the genes for DNA polymerases are reminiscent of linear mitochondrial plasmids, indicating that this genome architecture might have emerged from fortuitous recombination between an ancestral, presumably circular, mitochondrial genome and an invertron-like element. PMID:20395267
Virgibacillus carmonensis sp. nov., Virgibacillus necropolis sp. nov. and Virgibacillus picturae sp. nov., three novel species isolated from deteriorated mural paintings, transfer of the species of the genus salibacillus to Virgibacillus, as Virgibacillus marismortui comb. nov. and Virgibacillus salexigens comb. nov., and emended description of the genus Virgibacillus.

PubMed

Heyrman, Jeroen; Logan, Niall A; Busse, Hans-Jürgen; Balcaen, An; Lebbe, Liesbeth; Rodriguez-Diaz, Marina; Swings, Jean; De Vos, Paul

2003-03-01

A group of 13 strains was isolated from samples of biofilm formation on the mural paintings of the Servilia tomb (necropolis of Carmona, Spain) and the Saint-Catherine chapel (castle at Herberstein, Austria). The strains were subjected to a polyphasic taxonomic study, including (GTG)5-PCR, 16S rDNA sequence analysis, DNA-DNA hybridizations, DNA base ratio determination, analysis of fatty acids, polar lipids and menaquinones and morphological and biochemical characterization. In a phylogenetic tree based on neighbour-joining of 16S rDNA sequences, the strains are divided in two major groups, representing three novel species according to DNA-DNA relatedness, that are positioned at approximately equal distances from Virgibacillus and Salibacillus. After comparison of the novel results with existing data, the transfer of the species of Salibacillus to Virgibacillus is proposed, with the resulting new combinations Virgibacillus marismortui comb. nov. and Virgibacillus salexigens comb. nov. Additionally, three novel species are described, for which the names Virgibacillus carmonensis sp. nov., Virgibacillus necropolis sp. nov. and Virgibacillus picturae sp. nov. are proposed. The respective type strains are LMG 20964T (=DSM 14868T), LMG 19488T (=DSM 14866T) and LMG 19492T (= DSM 14867T). Finally, an emended description of the genus Virgibacillus is given.
DNA capture and next-generation sequencing can recover whole mitochondrial genomes from highly degraded samples for human identification

PubMed Central

2013-01-01

Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

PubMed

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
Direct Detection and Sequencing of Damaged DNA Bases

PubMed Central

2011-01-01

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.

PubMed

Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas

2011-12-20

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1987-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113

A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1990-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1988-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1989-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.

PubMed Central

Barnes, W M; Bevan, M

1983-01-01

A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. Images PMID:6298723
Silicene nanoribbon as a new DNA sequencing device

NASA Astrophysics Data System (ADS)

Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh

2018-02-01

The importance of applying DNA sequencing in different fields, results in looking for fast and cheap methods. Nanotechnology helps this development by introducing nanostructures used for DNA sequencing. In this work we study the interaction between zigzag silicene nanoribbon and DNA nucleobases using DFT and non equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as a biosensor for DNA sequencing.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

PubMed Central

Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

1993-01-01

The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated, were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. Images PMID:7909943
Sequence periodicity in nucleosomal DNA and intrinsic curvature.

PubMed

Nair, T Murlidharan

2010-05-17

Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes

PubMed Central

Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske

2006-01-01

To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since the publication of a high-throughput DNA sequencing technology based on PCR reaction was carried out in oil emulsions in 2005, high-throughput DNA sequencing platforms have been evolved to a robust technology in sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation of discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach of discovery and development of antibody drugs.
Micromonospora halotolerans sp. nov., isolated from the rhizosphere of a Pisum sativum plant.

PubMed

Carro, Lorena; Pukall, Rüdiger; Spröer, Cathrin; Kroppenstedt, Reiner M; Trujillo, Martha E

2013-06-01

A filamentous actinomycete strain designated CR18(T) was isolated on humic acid agar from the rhizosphere of a Pisum sativum plant collected in Spain. This isolate was observed to grow optimally at 28 °C, pH 7.0 and in the presence of 5 % NaCl. Phylogenetic analyses based on the 16S rRNA gene sequence indicated a close relationship with the type strains of Micromonospora chersina and Micromonospora endolithica. A further analysis based on a concatenated DNA sequence stretch of 4,523 bp that included partial sequences of the atpD, gyrB, recA, rpoB and 16S rRNA genes clearly differentiated the new strain from recognized Micromonospora species compared. DNA-DNA hybridization studies further supported the taxonomic position of strain CR18(T) as a novel genomic species. Chemotaxonomic analyses which included whole cell sugars, polar lipids, fatty acid profiles and menaquinone composition confirmed the affiliation of the new strain to the genus Micromonospora and also highlighted differences at the species level. These studies were finally complemented with an array of physiological tests to help differentiate between the new strain and its phylogenetic neighbours. Consequently, strain CR18(T) (= CECT 7890(T) = DSM 45598(T)) is proposed as the type strain of a novel species, Micromonospora halotolerans sp. nov.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

PubMed

Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

2012-01-01

DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.

PubMed

Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G

1984-11-15

Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
Synthesis of the human insulin gene. Part III. Chemical synthesis of 5'-phosphomonoester group containing deoxyribooligonucleotides by the modified phosphotriester method. Its application in the synthesis of seventeen fragments constituting human insulin C-chain DNA.

PubMed Central

Hsiung, H M; Sung, W L; Brousseau, R; Wu, R; Narang, S A

1980-01-01

A method for phosphorylating a protected deoxyribooligonucleotide containing phosphotriester linkages is described. The modified phosphotriester method of chemical synthesis is further refined in terms of (i) better final deblocking conditions and (ii) new chromatography solvent systems containing acetone-water-ethyl acetate to yield pure oligomers. The effectiveness of these improvements has been demonstrated in the rapid and efficient synthesis of seventeen fragments constituting the sequence of human insulin C-chain DNA. Images PMID:7008029
Molecular cloning of cDNAs for the nerve-cell specific phosphoprotein, synapsin I.

PubMed Central

Kilimann, M W; DeGennaro, L J

1985-01-01

To provide access to synapsin I-specific DNA sequences, we have constructed cDNA clones complementary to synapsin I mRNA isolated from rat brain. Synapsin I mRNA was specifically enriched by immunoadsorption of polysomes prepared from the brains of 10-14 day old rats. Employing this enriched mRNA, a cDNA library was constructed in pBR322 and screened by differential colony hybridization with single-stranded cDNA probes made from synapsin I mRNA and total polysomal poly(A)+ RNA. This screening procedure proved to be highly selective. Five independent recombinant plasmids which exhibited distinctly stronger hybridization with the synapsin I probe were characterized further by restriction mapping. All of the cDNA inserts gave restriction enzyme digestion patterns which could be aligned. In addition, some of the cDNA inserts were shown to contain poly(dA) sequences. Final identification of synapsin I cDNA clones relied on the ability of the cDNA inserts to hybridize specifically to synapsin I mRNA. Several plasmids were tested by positive hybridization selection. They specifically selected synapsin I mRNA which was identified by in vitro translation and immunoprecipitation of the translation products. The established cDNA clones were used for a blot-hybridization analysis of synapsin I mRNA. A fragment (1600 bases) from the longest cDNA clone hybridized with two discrete RNA species 5800 and 4500 bases long, in polyadenylated RNA from rat brain and PC12 cells. No hybridization was detected to RNA from rat liver, skeletal muscle or cardiac muscle. Images Fig. 1. Fig. 2. Fig. 4. Fig. 5. PMID:3933975
Using Environmental DNA to Census Marine Fishes in a Large Mesocosm

PubMed Central

Kelly, Ryan P.; Port, Jesse A.; Yamahara, Kevan M.; Crowder, Larry B.

2014-01-01

The ocean is a soup of its resident species' genetic material, cast off in the forms of metabolic waste, shed skin cells, or damaged tissue. Sampling this environmental DNA (eDNA) is a potentially powerful means of assessing whole biological communities, a significant advance over the manual methods of environmental sampling that have historically dominated marine ecology and related fields. Here, we estimate the vertebrate fauna in a 4.5-million-liter mesocosm aquarium tank at the Monterey Bay Aquarium of known species composition by sequencing the eDNA from its constituent seawater. We find that it is generally possible to detect mitochondrial DNA of bony fishes sufficient to identify organisms to taxonomic family- or genus-level using a 106 bp fragment of the 12S ribosomal gene. Within bony fishes, we observe a low false-negative detection rate, although we did not detect the cartilaginous fishes or sea turtles present with this fragment. We find that the rank abundance of recovered eDNA sequences correlates with the abundance of corresponding species' biomass in the mesocosm, but the data in hand do not allow us to develop a quantitative relationship between biomass and eDNA abundance. Finally, we find a low false-positive rate for detection of exogenous eDNA, and we were able to diagnose non-native species' tissue in the food used to maintain the mesocosm, underscoring the sensitivity of eDNA as a technique for community-level ecological surveys. We conclude that eDNA has substantial potential to become a core tool for environmental monitoring, but that a variety of challenges remain before reliable quantitative assessments of ecological communities in the field become possible. PMID:24454960
DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio

The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

DOE PAGES

Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

2016-03-09

The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
Assessing Understanding of Complex Learning Outcomes and Real-World Skills Using an Authentic Software Tool: A Study from Biomedical Sciences

ERIC Educational Resources Information Center

Dermo, John; Boyne, James

2014-01-01

We describe a study conducted during 2009-12 into innovative assessment practice, evaluating an assessed coursework task on a final year Medical Genetics module for Biomedical Science undergraduates. An authentic e-assessment coursework task was developed, integrating objectively marked online questions with an online DNA sequence analysis tool…
Specific minor groove solvation is a crucial determinant of DNA binding site recognition

PubMed Central

Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.

2014-01-01

The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate

PubMed Central

Yang, Yu; Hebron, Haroun R.; Hang, Jun

2009-01-01

A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455

Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA

DTIC Science & Technology

2005-09-01

tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxy- nucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
An exonuclease III and graphene oxide-aided assay for DNA detection.

PubMed

Peng, Lu; Zhu, Zhi; Chen, Yan; Han, Da; Tan, Weihong

2012-05-15

We have developed a novel DNA assay based on exonuclease III (ExoIII)-induced target recycling and the fluorescence quenching ability of graphene oxide (GO). This assay consists of a linear DNA probe labeled with a fluorophore in the middle. Introduction of target sequence induces the exonuclease III catalyzed probe digestion and generation of single nucleotides. After each cycle of digestion, the target is recycled to realize the amplification. Finally, graphene oxide is added to quench the remaining probes and the signal from the resulting fluorophore labeled single nucleotides is detected. With this approach, a sub-picomolar detection limit can be achieved within 40 min at 37°C. The method was successfully applied to multicolor DNA detection and the analysis of telomerase activity in extracts from cancer cells. Copyright © 2012 Elsevier B.V. All rights reserved.
A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology.

PubMed

Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide

2011-09-01

Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Biological sequence compression algorithms.

PubMed

Matsumoto, T; Sadakane, K; Imai, H

2000-01-01

Today, more and more DNA sequences are becoming available. The information about DNA sequences are stored in molecular biology databases. The size and importance of these databases will be bigger and bigger in the future, therefore this information must be stored or communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. The standard compression algorithms such as gzip or compress cannot compress DNA sequences, but only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known. One is called palindromes or reverse complements and the other structure is approximate repeats. Several specific algorithms for DNA sequences that use these structures can compress them less than two bits per symbol. In this paper, we improve the CTW so that characteristic structures of DNA sequences are available. Before encoding the next symbol, the algorithm searches an approximate repeat and palindrome using hash and dynamic programming. If there is a palindrome or an approximate repeat with enough length then our algorithm represents it with length and distance. By using this preprocessing, a new program achieves a little higher compression ratio than that of existing DNA-oriented compression algorithms. We also describe new compression algorithm for protein sequences.
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.

PubMed

Li, Qing; Hermanson, Peter J; Springer, Nathan M

2018-01-01

DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we described a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
Single-Molecule Electrical Random Resequencing of DNA and RNA

NASA Astrophysics Data System (ADS)

Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji

2012-07-01

Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
DNA as a powerful tool for morphology control, spatial positioning, and dynamic assembly of nanoparticles.

PubMed

Tan, Li Huey; Xing, Hang; Lu, Yi

2014-06-17

CONSPECTUS: Several properties of nanomaterials, such as morphologies (e.g., shapes and surface structures) and distance dependent properties (e.g., plasmonic and quantum confinement effects), make nanomaterials uniquely qualified as potential choices for future applications from catalysis to biomedicine. To realize the full potential of these nanomaterials, it is important to demonstrate fine control of the morphology of individual nanoparticles, as well as precise spatial control of the position, orientation, and distances between multiple nanoparticles. In addition, dynamic control of nanomaterial assembly in response to multiple stimuli, with minimal or no error, and the reversibility of the assemblies are also required. In this Account, we summarize recent progress of using DNA as a powerful programmable tool to realize the above goals. First, inspired by the discovery of genetic codes in biology, we have discovered DNA sequence combinations to control different morphologies of nanoparticles during their growth process and have shown that these effects are synergistic or competitive, depending on the sequence combination. The DNA, which guides the growth of the nanomaterial, is stable and retains its biorecognition ability. Second, by taking advantage of different reactivities of phosphorothioate and phosphodiester backbone, we have placed phosphorothioate at selective positions on different DNA nanostructures including DNA tetrahedrons. Bifunctional linkers have been used to conjugate phosphorothioate on one end and bind nanoparticles or proteins on the other end. In doing so, precise control of distances between two or more nanoparticles or proteins with nanometer resolution can be achieved. Furthermore, by developing facile methods to functionalize two hemispheres of Janus nanoparticles with two different DNA sequences regioselectively, we have demonstrated directional control of nanomaterial assembly, where DNA strands with specific hybridization serve as orthogonal linkers. Third, by using functional DNA that includes DNAzyme, aptamer, and aptazyme, dynamic control of assemblies of gold nanoparticles, quantum dots, carbon nanotubes, and iron oxide nanoparticles in response to one or more stimuli cooperatively have been achieved, resulting in colorimetric, fluorescent, electrochemical, and magnetic resonance signals for a wide range of targets, such as metal ions, small molecules, proteins, and intact cells. Fourth, by mimicking biology, we have employed DNAzymes as proofreading units to remove errors in nanoparticle assembly and further used DNAzyme cascade reactions to modify or repair DNA sequences involved in the assembly. Finally, by taking advantage of different affinities of biotin and desthiobiotin toward streptavidin, we have demonstrated reversible assembly of proteins on DNA origami.
An end-to-end workflow for engineering of biological networks from high-level specifications.

PubMed

Beal, Jacob; Weiss, Ron; Densmore, Douglas; Adler, Aaron; Appleton, Evan; Babb, Jonathan; Bhatia, Swapnil; Davidsohn, Noah; Haddock, Traci; Loyall, Joseph; Schantz, Richard; Vasilev, Viktor; Yaman, Fusun

2012-08-17

We present a workflow for the design and production of biological networks from high-level program specifications. The workflow is based on a sequence of intermediate models that incrementally translate high-level specifications into DNA samples that implement them. We identify algorithms for translating between adjacent models and implement them as a set of software tools, organized into a four-stage toolchain: Specification, Compilation, Part Assignment, and Assembly. The specification stage begins with a Boolean logic computation specified in the Proto programming language. The compilation stage uses a library of network motifs and cellular platforms, also specified in Proto, to transform the program into an optimized Abstract Genetic Regulatory Network (AGRN) that implements the programmed behavior. The part assignment stage assigns DNA parts to the AGRN, drawing the parts from a database for the target cellular platform, to create a DNA sequence implementing the AGRN. Finally, the assembly stage computes an optimized assembly plan to create the DNA sequence from available part samples, yielding a protocol for producing a sample of engineered plasmids with robotics assistance. Our workflow is the first to automate the production of biological networks from a high-level program specification. Furthermore, the workflow's modular design allows the same program to be realized on different cellular platforms simply by swapping workflow configurations. We validated our workflow by specifying a small-molecule sensor-reporter program and verifying the resulting plasmids in both HEK 293 mammalian cells and in E. coli bacterial cells.
Lactobacillus hayakitensis sp. nov., isolated from intestines of healthy thoroughbreds

PubMed Central

Morita, Hidetoshi; Shiratori, Chiharu; Murakami, Masaru; Takami, Hideto; Kato, Yukio; Endo, Akihito; Nakajima, Fumihiko; Takagi, Misako; Akita, Hiroaki; Okada, Sanae; Masaoka, Toshio

2007-01-01

Two strains, KBL13T and GBL13, were isolated as one of intestinal lactobacilli from the faecal specimens from different thoroughbreds of the same farm where they were born in Hokkaido, Japan. They were Gram-positive, facultatively anaerobic, catalase-negative, non-spore-forming and non-motile rods. KBL13T and GBL13 homofermentatively metabolize glucose, and produce lactate as the sole final product from glucose. The 16S rRNA gene sequence, DNA–DNA hybridization, DNA G+C content and biochemical characterization indicated that these two strains, KBL13T and GBL13, belong to the same species. In the representative strain, KBL13T, the DNA G+C content was 34.3 mol%. Lactobacillus salivarius JCM 1231T (=ATCC 11741T; AF089108) is the type strain most closely related to the strain KBL13T as shown in the phylogenetic tree, and the 16S rRNA gene sequence identity showed 96.0 % (1425/1484 bp). Comparative 16S rRNA gene sequence analysis of this strain indicated that the two isolated strains belong to the genus Lactobacillus and that they formed a branch distinct from their closest relatives, L. salivarius, Lactobacillus aviarius, Lactobacillus saerimneri and Lactobacillus acidipiscis. DNA–DNA reassociation experiments with L. salivarius and L. aviarius confirmed that KBL13T represents a novel species, for which the name Lactobacillus hayakitensis sp. nov. is proposed. The type strain is KBL13T (=JCM 14209T=DSM 18933T). PMID:18048734
Diversity arrays technology: a generic genome profiling technology on open platforms.

PubMed

Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz

2012-01-01

In the last 20 years, we have observed an exponential growth of the DNA sequence data and simular increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, concentrated on human genome and genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests for polymorphism tens of thousands of genomic loci with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.
Sequence-independent construction of ordered combinatorial libraries with predefined crossover points.

PubMed

Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis

2008-11-01

Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.
Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

PubMed Central

Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

2016-01-01

Because RNA-protein interactions play a central role in a wide-array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240
Genetics of bacteria that utilize one carbon compounds: Final report, March 1, 1982-February 29, 1988

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hanson, R.S.

Broad host range plasmid vectors useful for cloning genes from bacteria that grow on methane and methanol were constructed. We have cloned and mapped nineteen genes required for the growth of Methylobacterium organophilum strain XX on methanol. Nineteen genes were found in seven linkage groups on the M. organophilum genome and were separated by 40 kb or more. Eleven genes were required for the synthesis of methanol dehydrogenase (MDH) and were located in three unlinked gene clusters. The MDH structural gene was localized on a 2.5 kb DNA fragment. The gene was sequenced and contains a 175 bp untranslated leadermore » sequence, a signal sequence and the structural gene. MDH messenger RNA (mRNA) has a half life of approximately 20 min. and is present at approximately 2% of the cellular mRNA. The structural gene for the ..gamma.. subunit of methane monoxygenases has been cloned from Methylosporovibrio. Methane monooxygenase subunits have been purified by Prof. J. Lipscomb's laboratory and are being sequenced to construct DNA probes to identify cloned subunit genes. New facultative methylotrophic bacteria were isolated and characterized. Several amino acid auxotrophs have been isolated. 11 refs.« less
DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.

PubMed

Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani

2018-01-01

Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.

PubMed

Schnitzler, P; Darai, G

1989-09-01

The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Energetics of protein-DNA interactions.

PubMed

Donald, Jason E; Chen, William W; Shakhnovich, Eugene I

2007-01-01

Protein-DNA interactions are vital for many processes in living cells, especially transcriptional regulation and DNA modification. To further our understanding of these important processes on the microscopic level, it is necessary that theoretical models describe the macromolecular interaction energetics accurately. While several methods have been proposed, there has not been a careful comparison of how well the different methods are able to predict biologically important quantities such as the correct DNA binding sequence, total binding free energy and free energy changes caused by DNA mutation. In addition to carrying out the comparison, we present two important theoretical models developed initially in protein folding that have not yet been tried on protein-DNA interactions. In the process, we find that the results of these knowledge-based potentials show a strong dependence on the interaction distance and the derivation method. Finally, we present a knowledge-based potential that gives comparable or superior results to the best of the other methods, including the molecular mechanics force field AMBER99.
Analysis of JC virus DNA replication using a quantitative and high-throughput assay

PubMed Central

Shin, Jong; Phelan, Paul J.; Chhum, Panharith; Bashkenova, Nazym; Yim, Sung; Parker, Robert; Gagnon, David; Gjoerup, Ole; Archambault, Jacques; Bullock, Peter A.

2015-01-01

Progressive Multifocal Leukoencephalopathy (PML) is caused by lytic replication of JC virus (JCV) in specific cells of the central nervous system. Like other polyomaviruses, JCV encodes a large T-antigen helicase needed for replication of the viral DNA. Here, we report the development of a luciferase-based, quantitative and high-throughput assay of JCV DNA replication in C33A cells, which, unlike the glial cell lines Hs 683 and U87, accumulate high levels of nuclear T-ag needed for robust replication. Using this assay, we investigated the requirement for different domains of T-ag, and for specific sequences within and flanking the viral origin, in JCV DNA replication. Beyond providing validation of the assay, these studies revealed an important stimulatory role of the transcription factor NF1 in JCV DNA replication. Finally, we show that the assay can be used for inhibitor testing, highlighting its value for the identification of antiviral drugs targeting JCV DNA replication. PMID:25155200
Long-range correlations and charge transport properties of DNA sequences

NASA Astrophysics Data System (ADS)

Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

2010-04-01

By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
Cohort analysis of a single nucleotide polymorphism on DNA chips.

PubMed

Schwonbeck, Susanne; Krause-Griep, Andrea; Gajovic-Eichelmann, Nenad; Ehrentreich-Förster, Eva; Meinl, Walter; Glatt, Hansrüdi; Bier, Frank F

2004-11-15

A method has been developed to determine SNPs on DNA chips by applying a flow-through bioscanner. As a practical application we demonstrated the fast and simple SNP analysis of 24 genotypes in an array of 96 spots with a single hybridisation and dissociation experiment. The main advantage of this methodical concept is the parallel and fast analysis without any need of enzymatic digestion. Additionally, the DNA chip format used is appropriate for parallel analysis up to 400 spots. The polymorphism in the gene of the human phenol sulfotransferase SULT1A1 was studied as a model SNP. Biotinylated PCR products containing the SNP (The SNP summary web site: ) (mutant) and those containing no mutation (wild-type) were brought onto the chips coated with NeutrAvidin using non-contact spotting. This was followed by an analysis which was carried out in a flow-through biochip scanner while constantly rinsing with buffer. After removing the non-biotinylated strand a fluorescent probe was hybridised, which is complementary to the wild-type sequence. If this probe binds to a mutant sequence, then one single base is not fully matching. Thereby, the mismatched hybrid (mutant) is less stable than the full-matched hybrid (wild-type). The final step after hybridisation on the chip involves rinsing with a buffer to start dissociation of the fluorescent probe from the immobilised DNA strand. The online measurement of the fluorescence intensity by the biochip scanner provides the possibility to follow the kinetics of the hybridisation and dissociation processes. According to the different stability of the full-match and the mismatch, either visual discrimination or kinetic analysis is possible to distinguish SNP-containing sequence from the wild-type sequence.
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

PubMed

Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

2017-08-01

To analyze and detect the whole genome sequence of human mitochondrial DNA （mtDNA） by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine

The ATRX cDNA is prone to bacterial IS10 element insertions that alter its structure.

PubMed

Valle-García, David; Griffiths, Lyra M; Dyer, Michael A; Bernstein, Emily; Recillas-Targa, Félix

2014-01-01

The SWI/SNF-like chromatin-remodeling protein ATRX has emerged as a key factor in the regulation of α-globin gene expression, incorporation of histone variants into the chromatin template and, more recently, as a frequently mutated gene across a wide spectrum of cancers. Therefore, the availability of a functional ATRX cDNA for expression studies is a valuable tool for the scientific community. We have identified two independent transposon insertions of a bacterial IS10 element into exon 8 of ATRX isoform 2 coding sequence in two different plasmids derived from a single source. We demonstrate that these insertion events are common and there is an insertion hotspot within the ATRX cDNA. Such IS10 insertions produce a truncated form of ATRX, which significantly compromises its nuclear localization. In turn, we describe ways to prevent IS10 insertion during propagation and cloning of ATRX-containing vectors, including optimal growth conditions, bacterial strains, and suggested sequencing strategies. Finally, we have generated an insertion-free plasmid that is available to the community for expression studies of ATRX.
Programmable RNA Cleavage and Recognition by a Natural CRISPR-Cas9 System from Neisseria meningitidis.

PubMed

Rousseau, Beth A; Hou, Zhonggang; Gramelspacher, Max J; Zhang, Yan

2018-03-01

The microbial CRISPR systems enable adaptive defense against mobile elements and also provide formidable tools for genome engineering. The Cas9 proteins are type II CRISPR-associated, RNA-guided DNA endonucleases that identify double-stranded DNA targets by sequence complementarity and protospacer adjacent motif (PAM) recognition. Here we report that the type II-C CRISPR-Cas9 from Neisseria meningitidis (Nme) is capable of programmable, RNA-guided, site-specific cleavage and recognition of single-stranded RNA targets and that this ribonuclease activity is independent of the PAM sequence. We define the mechanistic feature and specificity constraint for RNA cleavage by NmeCas9 and also show that nuclease null dNmeCas9 binds to RNA target complementary to CRISPR RNA. Finally, we demonstrate that NmeCas9-catalyzed RNA cleavage can be blocked by three families of type II-C anti-CRISPR proteins. These results fundamentally expand the targeting capacities of CRISPR-Cas9 and highlight the potential utility of NmeCas9 as a single platform to target both RNA and DNA. Copyright © 2018 Elsevier Inc. All rights reserved.
Sequence periodicity in nucleosomal DNA and intrinsic curvature

PubMed Central

2010-01-01

Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA. PMID:20487515
A survey of the sequence-specific interaction of damaging agents with DNA: emphasis on antitumor agents.

PubMed

Murray, V

1999-01-01

This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Resolving species delimitation within the genus Bunopus Blanford, 1874 (Squamata: Gekkonidae) in Iran using DNA barcoding approach.

PubMed

Khosravani, Azar; Rastegar-Pouyani, Eskandar; Rastegar-Pouyani, Nasrullah; Oraie, Hamzeh; Papenfuss, Theodore J

2017-12-19

Mitochondrial COI sequences were used to investigate species delimitation within the genus Bunopus in Iran. A dataset with a final sequence length of 633 nucleotides including 100 specimens from 31 geographically distant localities across Iran were generated. The result demonstrated that two major clades with strong support can be identified within the genus Bunopus in Iran. Clade A includes Bunopus crassicaudus and two new entities, eastern populations (subclade A2,1) and Shahdad populations (subclade A2,2). The second clade comprises western and southwestern populations (subclade B1,1), Arabian populations (subclade B1,2) and south and southeast populations in Iran, to which Bunopus tuberculatus (subclade B2) is assigned. In addition to Bunopus crassicaudus and B. tuberculatus, three new candidate species in Iran can easily be identified based on the DNA barcoding approach.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

PubMed Central

Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

2016-01-01

Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

PubMed

Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles

PubMed Central

Nandi, Tannistha; Holden, Matthew T.G.; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A.; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J.; Titball, Richard; Chen, Swaine L.; Parkhill, Julian

2015-01-01

Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617
Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

PubMed

Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

2017-11-06

Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.
An evolution based biosensor receptor DNA sequence generation algorithm.

PubMed

Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

2010-01-01

A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

PubMed Central

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tsodikov, Oleg V.; Biswas, Tapan

An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 {angstrom} resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 {angstrom}). Thesemore » structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.« less
Mobile small RNAs regulate genome-wide DNA methylation.

PubMed

Lewsey, Mathew G; Hardcastle, Thomas J; Melnyk, Charles W; Molnar, Attila; Valli, Adrián; Urich, Mark A; Nery, Joseph R; Baulcombe, David C; Ecker, Joseph R

2016-02-09

RNA silencing at the transcriptional and posttranscriptional levels regulates endogenous gene expression, controls invading transposable elements (TEs), and protects the cell against viruses. Key components of the mechanism are small RNAs (sRNAs) of 21-24 nt that guide the silencing machinery to their nucleic acid targets in a nucleotide sequence-specific manner. Transcriptional gene silencing is associated with 24-nt sRNAs and RNA-directed DNA methylation (RdDM) at cytosine residues in three DNA sequence contexts (CG, CHG, and CHH). We previously demonstrated that 24-nt sRNAs are mobile from shoot to root in Arabidopsis thaliana and confirmed that they mediate DNA methylation at three sites in recipient cells. In this study, we extend this finding by demonstrating that RdDM of thousands of loci in root tissues is dependent upon mobile sRNAs from the shoot and that mobile sRNA-dependent DNA methylation occurs predominantly in non-CG contexts. Mobile sRNA-dependent non-CG methylation is largely dependent on the DOMAINS REARRANGED METHYLTRANSFERASES 1/2 (DRM1/DRM2) RdDM pathway but is independent of the CHROMOMETHYLASE (CMT)2/3 DNA methyltransferases. Specific superfamilies of TEs, including those typically found in gene-rich euchromatic regions, lose DNA methylation in a mutant lacking 22- to 24-nt sRNAs (dicer-like 2, 3, 4 triple mutant). Transcriptome analyses identified a small number of genes whose expression in roots is associated with mobile sRNAs and connected to DNA methylation directly or indirectly. Finally, we demonstrate that sRNAs from shoots of one accession move across a graft union and target DNA methylation de novo at normally unmethylated sites in the genomes of root cells from a different accession.
DNA barcode goes two-dimensions: DNA QR code web server.

PubMed

Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

2012-01-01

The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
TaxI: a software tool for DNA barcoding using distance methods

PubMed Central

Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel

2005-01-01

DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
PipeOnline 2.0: automated EST processing and functional data sorting.

PubMed

Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A

2002-11-01

Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
Dna Sequencing

DOEpatents

Tabor, Stanley; Richardson, Charles C.

1995-04-25

A method for sequencing a strand of DNA, including the steps off: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions in favoring primer extension to form nucleic acid fragments complementory to the DNA to be sequenced; labelling the nucleic and fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.
Sequence-Dependent Diastereospecific and Diastereodivergent Crosslinking of DNA by Decarbamoylmitomycin C.

PubMed

Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise

2018-04-20

Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

PubMed

Sproul, John S; Maddison, David R

2017-11-01

Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.
i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

PubMed

Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

2011-11-30

Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/

Biosensors for DNA sequence detection

NASA Technical Reports Server (NTRS)

Vercoutere, Wenonah; Akeson, Mark

2002-01-01

DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.
DNA Sequences from Formalin-Fixed Nematodes: Integrating Molecular and Morphological Approaches to Taxonomy

PubMed Central

Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.

1997-01-01

To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Palindromic Sequence Artifacts Generated during Next Generation Sequencing Library Preparation from Historic and Ancient DNA

PubMed Central

Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel

2014-01-01

Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exists among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
A new family of satellite DNA sequences as a major component of centromeric heterochromatin in owls (Strigiformes).

PubMed

Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi

2004-03-01

We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl ( Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl

Highlights: {yields} We propose a simple stochastic model to construct primitive DNA sequences. {yields} The model provide an explanation for Chargaff's second parity rule in primitive DNA sequences. {yields} The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. {yields} We extend the results for bacterial DNA sequences and compare distributional properties intrinsic to the model to statistical estimates from 1049 bacterial genomes. {yields} We find out statistical evidences that the novel type of strand symmetry holds for bacterial DNA sequences. -- Abstract: Chargaff's second parity rule for short oligonucleotides states that themore » frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.« less
Chromatin Immunoprecipitation Sequencing (ChIP-Seq) for Transcription Factors and Chromatin Factors in Arabidopsis thaliana Roots: From Material Collection to Data Analysis.

PubMed

Cortijo, Sandra; Charoensawan, Varodom; Roudier, François; Wigge, Philip A

2018-01-01

Chromatin immunoprecipitation combined with next-generation sequencing (ChIP-seq) is a powerful technique to investigate in vivo transcription factor (TF) binding to DNA, as well as chromatin marks. Here we provide a detailed protocol for all the key steps to perform ChIP-seq in Arabidopsis thaliana roots, also working on other A. thaliana tissues and in most non-ligneous plants. We detail all steps from material collection, fixation, chromatin preparation, immunoprecipitation, library preparation, and finally computational analysis based on a combination of publicly available tools.
Heterologous mitochondrial DNA recombination in human cells.

PubMed

D'Aurelio, Marilena; Gajewski, Carl D; Lin, Michael T; Mauck, William M; Shao, Leon Z; Lenaz, Giorgio; Moraes, Carlos T; Manfredi, Giovanni

2004-12-15

Inter-molecular heterologous mitochondrial DNA (mtDNA) recombination is known to occur in yeast and plants. Nevertheless, its occurrence in human cells is still controversial. To address this issue we have fused two human cytoplasmic hybrid cell lines, each containing a distinct pathogenic mtDNA mutation and specific sets of genetic markers. In this hybrid model, we found direct evidence of recombination between these two mtDNA haplotypes. Recombinant mtDNA molecules in the hybrid cells were identified using three independent experimental approaches. First, recombinant molecules containing genetic markers from both parental alleles were demonstrated with restriction fragment length polymorphism of polymerase chain reaction products, by measuring the relative frequencies of each marker. Second, fragments of recombinant mtDNA were cloned and sequenced to identify the regions involved in the recombination events. Finally, recombinant molecules were demonstrated directly by Southern blot using appropriate combinations of polymorphic restriction sites and probes. This combined approach confirmed the existence of heterogeneous species of recombinant mtDNA molecules in the hybrid cells. These findings have important implications for mtDNA-related diseases, the interpretation of human evolution and population genetics and forensic analyses based on mtDNA genotyping.
EHMT2 directs DNA methylation for efficient gene silencing in mouse embryos

PubMed Central

Auclair, Ghislain; Borgel, Julie; Sanz, Lionel A.; Vallet, Judith; Guibert, Sylvain; Dumas, Michael; Cavelier, Patricia; Girardot, Michael; Forné, Thierry; Feil, Robert; Weber, Michael

2016-01-01

The extent to which histone modifying enzymes contribute to DNA methylation in mammals remains unclear. Previous studies suggested a link between the lysine methyltransferase EHMT2 (also known as G9A and KMT1C) and DNA methylation in the mouse. Here, we used a model of knockout mice to explore the role of EHMT2 in DNA methylation during mouse embryogenesis. The Ehmt2 gene is expressed in epiblast cells but is dispensable for global DNA methylation in embryogenesis. In contrast, EHMT2 regulates DNA methylation at specific sequences that include CpG-rich promoters of germline-specific genes. These loci are bound by EHMT2 in embryonic cells, are marked by H3K9 dimethylation, and have strongly reduced DNA methylation in Ehmt2−/− embryos. EHMT2 also plays a role in the maintenance of germline-derived DNA methylation at one imprinted locus, the Slc38a4 gene. Finally, we show that DNA methylation is instrumental for EHMT2-mediated gene silencing in embryogenesis. Our findings identify EHMT2 as a critical factor that facilitates repressive DNA methylation at specific genomic loci during mammalian development. PMID:26576615
Estimating Diversity of Florida Keys Zooplankton Using New Environmental DNA Methods

NASA Astrophysics Data System (ADS)

Djurhuus, A.; Goldsmith, D. B.; Sawaya, N. A.; Breitbart, M.

2016-02-01

Zooplankton are of great importance in marine food webs, where they serve to link the phytoplankton and bacteria with higher trophic levels. Zooplankton are a diverse group containing molluscs, crustaceans, fish larvae and many other taxa. The sheer number of species and often minor morphological distinctions between species makes it challenging and exceptionally time consuming to identify the species composition of marine zooplankton samples. As a part of the Marine Biodiversity Observation Network (MBON) project, we have developed and groundtruthed an alternative, relatively time-efficient method for zooplankton identification using environmental DNA (eDNA). Samples were collected from Molasses reef, Looe Key, and Western Sambo along the Florida Keys from five bi-monthly cruises on board the RV Walton Smith. Samples were collected for environmental DNA (eDNA) by filtering 1 L of water on to a 0.22 µm filter and zooplankton samples were collected using nets with three mesh sizes (64μm, 200μm, and 500μm) to catch different size fractions. Half of zooplankton samples were fixed in 70% ethanol and half in 10% formalin, for DNA extraction and morphological identification, respectively. Individuals representing visually abundant taxa were picked into individual wells for PCR with universal 18S rRNA gene primers and subsequent sequencing to build a reference barcode database for zooplankton species commonly found in the study region. PCR and Illumina MiSeq next generation sequencing was applied to the eDNA extracted from the 0.22 μm filters and sequences were be compared to our local custom database as well as publicly available databases to determine zooplankton community composition. Finally, composition and diversity analyses were performed to compare results obtained with the new eDNA approach to standard morphological classification of zooplankton communities. Results show that the eDNA approach can enable the determination of zooplankton diversity through collection of a single water sample, which, when combined with bacterial and archaeal diversity analyses, will help us understand the coupling between different trophic levels and the drivers of plankton dynamics in the sub-tropical Florida Keys.
Assessing information content and interactive relationships of subgenomic DNA sequences of the MHC using complexity theory approaches based on the non-extensive statistical mechanics

NASA Astrophysics Data System (ADS)

Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.

2018-09-01

This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.
A Three-Dimensional Model of the Yeast Genome

NASA Astrophysics Data System (ADS)

Noble, William; Duan, Zhi-Jun; Andronescu, Mirela; Schutz, Kevin; McIlwain, Sean; Kim, Yoo Jung; Lee, Choli; Shendure, Jay; Fields, Stanley; Blau, C. Anthony

Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or factories for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.
Pydna: a simulation and documentation tool for DNA assembly strategies using python.

PubMed

Pereira, Filipa; Azevedo, Flávio; Carvalho, Ângela; Ribeiro, Gabriela F; Budde, Mark W; Johansson, Björn

2015-05-02

Recent advances in synthetic biology have provided tools to efficiently construct complex DNA molecules which are an important part of many molecular biology and biotechnology projects. The planning of such constructs has traditionally been done manually using a DNA sequence editor which becomes error-prone as scale and complexity of the construction increase. A human-readable formal description of cloning and assembly strategies, which also allows for automatic computer simulation and verification, would therefore be a valuable tool. We have developed pydna, an extensible, free and open source Python library for simulating basic molecular biology DNA unit operations such as restriction digestion, ligation, PCR, primer design, Gibson assembly and homologous recombination. A cloning strategy expressed as a pydna script provides a description that is complete, unambiguous and stable. Execution of the script automatically yields the sequence of the final molecule(s) and that of any intermediate constructs. Pydna has been designed to be understandable for biologists with limited programming skills by providing interfaces that are semantically similar to the description of molecular biology unit operations found in literature. Pydna simplifies both the planning and sharing of cloning strategies and is especially useful for complex or combinatorial DNA molecule construction. An important difference compared to existing tools with similar goals is the use of Python instead of a specifically constructed language, providing a simulation environment that is more flexible and extensible by the user.
Genetic and epigenetic variation in 5S ribosomal RNA genes reveals genome dynamics in Arabidopsis thaliana

PubMed Central

Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe

2018-01-01

Abstract Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization. PMID:29518237
Genetic and epigenetic variation in 5S ribosomal RNA genes reveals genome dynamics in Arabidopsis thaliana.

PubMed

Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe; Probst, Aline V

2018-04-06

Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization.
DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

DOEpatents

Lee, James W.; Thundat, Thomas G.

2005-06-14

An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

PubMed

Khoe, Clairine V; Chung, Long H; Murray, Vincent

2018-06-01

The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Electrotransformation and expression of bacterial genes encoding hygromycin phosphotransferase and beta-galactosidase in the pathogenic fungus Histoplasma capsulatum.

PubMed

Woods, J P; Heinecke, E L; Goldman, W E

1998-04-01

We developed an efficient electrotransformation system for the pathogenic fungus Histoplasma capsulatum and used it to examine the effects of features of the transforming DNA on transformation efficiency and fate of the transforming DNA and to demonstrate fungal expression of two recombinant Escherichia coli genes, hph and lacZ. Linearized DNA and plasmids containing Histoplasma telomeric sequences showed the greatest transformation efficiencies, while the plasmid vector had no significant effect, nor did the derivation of the selectable URA5 marker (native Histoplasma gene or a heterologous Podospora anserina gene). Electrotransformation resulted in more frequent multimerization, other modification, or possibly chromosomal integration of transforming telomeric plasmids when saturating amounts of DNA were used, but this effect was not observed with smaller amounts of transforming DNA. We developed another selection system using a hygromycin B resistance marker from plasmid pAN7-1, consisting of the E. coli hph gene flanked by Aspergillus nidulans promoter and terminator sequences. Much of the heterologous fungal sequences could be removed without compromising function in H. capsulatum, allowing construction of a substantially smaller effective marker fragment. Transformation efficiency increased when nonselective conditions were maintained for a time after electrotransformation before selection with the protein synthesis inhibitor hygromycin B was imposed. Finally, we constructed a readily detectable and quantifiable reporter gene by fusing Histoplasma URA5 with E. coli lacZ, resulting in expression of functional beta-galactosidase in H. capsulatum. Demonstration of expression of bacterial genes as effective selectable markers and reporters, together with a highly efficient electrotransformation system, provide valuable approaches for molecular genetic analysis and manipulation of H. capsulatum, which have proven useful for examination of targeted gene disruption, regulated gene expression, and potential virulence determinants in this fungus.
Molecular identification and phylogenetic study of Demodex caprae.

PubMed

Zhao, Ya-E; Cheng, Juan; Hu, Li; Ma, Jun-Xian

2014-10-01

The DNA barcode has been widely used in species identification and phylogenetic analysis since 2003, but there have been no reports in Demodex. In this study, to obtain an appropriate DNA barcode for Demodex, molecular identification of Demodex caprae based on mitochondrial cox1 was conducted. Firstly, individual adults and eggs of D. caprae were obtained for genomic DNA (gDNA) extraction; Secondly, mitochondrial cox1 fragment was amplified, cloned, and sequenced; Thirdly, cox1 fragments of D. caprae were aligned with those of other Demodex retrieved from GenBank; Finally, the intra- and inter-specific divergences were computed and the phylogenetic trees were reconstructed to analyze phylogenetic relationship in Demodex. Results obtained from seven 429-bp fragments of D. caprae showed that sequence identities were above 99.1% among three adults and four eggs. The intraspecific divergences in D. caprae, Demodex folliculorum, Demodex brevis, and Demodex canis were 0.0-0.9, 0.5-0.9, 0.0-0.2, and 0.0-0.5%, respectively, while the interspecific divergences between D. caprae and D. folliculorum, D. canis, and D. brevis were 20.3-20.9, 21.8-23.0, and 25.0-25.3, respectively. The interspecific divergences were 10 times higher than intraspecific ones, indicating considerable barcoding gap. Furthermore, the phylogenetic trees showed that four Demodex species gathered separately, representing independent species; and Demodex folliculorum gathered with canine Demodex, D. caprae, and D. brevis in sequence. In conclusion, the selected 429-bp mitochondrial cox1 gene is an appropriate DNA barcode for molecular classification, identification, and phylogenetic analysis of Demodex. D. caprae is an independent species and D. folliculorum is closer to D. canis than to D. caprae or D. brevis.
Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA

PubMed Central

Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev

2012-01-01

B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350
Ancient dna from pleistocene fossils: Preservation, recovery, and utility of ancient genetic information for quaternary research

NASA Astrophysics Data System (ADS)

Yang, Hong

Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny. conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.

Enantiospecific recognition of DNA sequences by a proflavine Tröger base.

PubMed

Bailly, C; Laine, W; Demeunynck, M; Lhomme, J

2000-07-05

The DNA interaction of a chiral Tröger base derived from proflavine was investigated by DNA melting temperature measurements and complementary biochemical assays. DNase I footprinting experiments demonstrate that the binding of the proflavine-based Tröger base is both enantio- and sequence-specific. The (+)-isomer poorly interacts with DNA in a non-sequence-selective fashion. In sharp contrast, the corresponding (-)-isomer recognizes preferentially certain DNA sequences containing both A. T and G. C base pairs, such as the motifs 5'-GTT. AAC and 5'-ATGA. TCAT. This is the first experimental demonstration that acridine-type Tröger bases can be used for enantiospecific recognition of DNA sequences. Copyright 2000 Academic Press.
Sensitive detection of mercury and copper ions by fluorescent DNA/Ag nanoclusters in guanine-rich DNA hybridization

NASA Astrophysics Data System (ADS)

Peng, Jun; Ling, Jian; Zhang, Xiu-Qing; Bai, Hui-Ping; Zheng, Liyan; Cao, Qiu-E.; Ding, Zhong-Tao

2015-02-01

In this work, we designed a new fluorescent oligonucleotides-stabilized silver nanoclusters (DNA/AgNCs) probe for sensitive detection of mercury and copper ions. This probe contains two tailored DNA sequence. One is a signal probe contains a cytosine-rich sequence template for AgNCs synthesis and link sequence at both ends. The other is a guanine-rich sequence for signal enhancement and link sequence complementary to the link sequence of the signal probe. After hybridization, the fluorescence of hybridized double-strand DNA/AgNCs is 200-fold enhanced based on the fluorescence enhancement effect of DNA/AgNCs in proximity of guanine-rich DNA sequence. The double-strand DNA/AgNCs probe is brighter and stable than that of single-strand DNA/AgNCs, and more importantly, can be used as novel fluorescent probes for detecting mercury and copper ions. Mercury and copper ions in the range of 6.0-160.0 and 6-240 nM, can be linearly detected with the detection limits of 2.1 and 3.4 nM, respectively. Our results indicated that the analytical parameters of the method for mercury and copper ions detection are much better than which using a single-strand DNA/AgNCs.
Ab initio DNA synthesis by Bst polymerase in the presence of nicking endonucleases Nt.AlwI, Nb.BbvCI, and Nb.BsmI.

PubMed

Antipova, Valeriya N; Zheleznaya, Lyudmila A; Zyrina, Nadezhda V

2014-08-01

In the absence of added DNA, thermophilic DNA polymerases synthesize double-stranded DNA from free dNTPs, which consist of numerous repetitive units (ab initio DNA synthesis). The addition of thermophilic restriction endonuclease (REase), or nicking endonuclease (NEase), effectively stimulates ab initio DNA synthesis and determines the nucleotide sequence of reaction products. We have found that NEases Nt.AlwI, Nb.BbvCI, and Nb.BsmI with non-palindromic recognition sites stimulate the synthesis of sequences organized mainly as palindromes. Moreover, the nucleotide sequence of the palindromes appeared to be dependent on NEase recognition/cleavage modes. Thus, the heterodimeric Nb.BbvCI stimulated the synthesis of palindromes composed of two recognition sites of this NEase, which were separated by AT-reach sequences or (A)n (T)m spacers. Palindromic DNA sequences obtained in the ab initio DNA synthesis with the monomeric NEases Nb.BsmI and Nt.AlwI contained, along with the sites of these NEases, randomly synthesized sequences consisted of blocks of short repeats. These findings could help investigation of the potential abilities of highly productive ab initio DNA synthesis for the creation of DNA molecules with desirable sequence. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase.

PubMed

Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V

2006-10-15

The 16,937-nuceotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scypozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Recent patents of nanopore DNA sequencing technology: progress and challenges.

PubMed

Zhou, Jianfeng; Xu, Bingqian

2010-11-01

DNA sequencing techniques witnessed fast development in the last decades, primarily driven by the Human Genome Project. Among the proposed new techniques, Nanopore was considered as a suitable candidate for the single DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been done to apply nanopore to analyze the properties of DNA molecules. By comparing with traditional sequencing techniques, nanopore has demonstrated its distinctive superiorities in main practical issues, such as sample preparation, sequencing speed, cost-effective and read-length. Although challenges still remain, recent researches in improving the capabilities of nanopore have shed a light to achieve its ultimate goal: Sequence individual DNA strand at single nucleotide level. This patent review briefly highlights recent developments and technological achievements for DNA analysis and sequencing at single molecule level, focusing on nanopore based methods.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. Images PMID:3774553
Primer on Molecular Genetics; DOE Human Genome Program

DOE R&D Accomplishments Database

1992-04-01

This report is taken from the April 1992 draft of the DOE Human Genome 1991--1992 Program Report, which is expected to be published in May 1992. The primer is intended to be an introduction to basic principles of molecular genetics pertaining to the genome project. The material contained herein is not final and may be incomplete. Techniques of genetic mapping and DNA sequencing are described.
Next-Generation Sequencing Platforms

NASA Astrophysics Data System (ADS)

Mardis, Elaine R.

2013-06-01

Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.
Regulatory link between DNA methylation and active demethylation in Arabidopsis

PubMed Central

Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang

2015-01-01

De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.

PubMed

Ozsolak, Fatih

2016-01-01

With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
A cDNA from a mouse pancreatic beta cell encoding a putative transcription factor of the insulin gene.

PubMed Central

Walker, M D; Park, C W; Rosen, A; Aronheim, A

1990-01-01

Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. Images PMID:2181401
Highly multiplexed targeted DNA sequencing from single nuclei.

PubMed

Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

2016-02-01

Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Structurally Ordered Nanowire Formation from Co-Assembly of DNA Origami and Collagen-Mimetic Peptides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Tao; Meyer, Travis A.; Modlin, Charles

In this paper, we describe the co-assembly of two different building units: collagen-mimetic peptides and DNA origami. Two peptides CP ++ and sCP ++ are designed with a sequence comprising a central block (Pro-Hyp-Gly) and two positively charged domains (Pro-Arg-Gly) at both N- and C-termini. Co-assembly of peptides and DNA origami two-layer (TL) nanosheets affords the formation of one-dimensional nanowires with repeating periodicity of similar to 10 nm. Structural analyses suggest a face-to-face stacking of DNA nanosheets with peptides aligned perpendicularly to the sheet surfaces. We demonstrate the potential of selective peptide-DNA association between face-to-face and edge-to-edge packing by tailoringmore » the size of DNA nanostructures. Finally, this study presents an attractive strategy to create hybrid biomolecular assemblies from peptide and DNA-based building blocks that takes advantage of the intrinsic chemical and physical properties of the respective components to encode structural and, potentially, functional complexity within readily accessible biomimetic materials.« less
Molecular sled is an eleven-amino acid vehicle facilitating biochemical interactions via sliding components along DNA

DOE PAGES

Mangel, Walter F.; McGrath, William J.; Xiong, Kan; ...

2016-02-02

Recently, we showed the adenovirus proteinase interacts productively with its protein substrates in vitro and in vivo in nascent virus particles via one-dimensional diffusion along the viral DNA. The mechanism by which this occurs has heretofore been unknown. We show sliding of these proteins along DNA occurs on a new vehicle in molecular biology, a ‘molecular sled’ named pVIc. This 11-amino acid viral peptide binds to DNA independent of sequence. pVIc slides on DNA, exhibiting the fastest one-dimensional diffusion constant, 26±1.8 × 10 6 (bp) 2 s −1. pVIc is a ‘molecular sled,’ because it can slide heterologous cargos along DNA,more » for example, a streptavidin tetramer. Similar peptides, for example, from the C terminus of β-actin or NLSIII of the p53 protein, slide along DNA. Finally, characteristics of the ‘molecular sled’ in its milieu (virion, nucleus) have implications for how proteins in the nucleus of cells interact and imply a new form of biochemistry, one-dimensional biochemistry.« less
Structurally Ordered Nanowire Formation from Co-Assembly of DNA Origami and Collagen-Mimetic Peptides

DOE PAGES

Jiang, Tao; Meyer, Travis A.; Modlin, Charles; ...

2017-09-26

In this paper, we describe the co-assembly of two different building units: collagen-mimetic peptides and DNA origami. Two peptides CP ++ and sCP ++ are designed with a sequence comprising a central block (Pro-Hyp-Gly) and two positively charged domains (Pro-Arg-Gly) at both N- and C-termini. Co-assembly of peptides and DNA origami two-layer (TL) nanosheets affords the formation of one-dimensional nanowires with repeating periodicity of similar to 10 nm. Structural analyses suggest a face-to-face stacking of DNA nanosheets with peptides aligned perpendicularly to the sheet surfaces. We demonstrate the potential of selective peptide-DNA association between face-to-face and edge-to-edge packing by tailoringmore » the size of DNA nanostructures. Finally, this study presents an attractive strategy to create hybrid biomolecular assemblies from peptide and DNA-based building blocks that takes advantage of the intrinsic chemical and physical properties of the respective components to encode structural and, potentially, functional complexity within readily accessible biomimetic materials.« less
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.

This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted.more » PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.« less
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...

2017-07-18

This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted.more » PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.« less
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

PubMed Central

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
Mapping the binding site of aflatoxin B/sub 1/ in DNA: systematic analysis of the reactivity of aflatoxin B/sub 1/ with guanines in different DNA sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Benasutti, M.; Ejadi, S.; Whitlow, M.D.

The mutagenic and carcinogenic chemical aflatoxin B/sub 1/ (AFB/sub 1/) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB/sub 1/ oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB/sub 1/ oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB/sub 1/ oxide prefers to react with guanines inmore » some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB/sub 1/ oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determine by high-pressure liquid chromatography. These results reveal that for AFB/sub 1/ oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB/sub 1/ oxide in its reaction with DNA.« less
Chromosome specific repetitive DNA sequences

DOEpatents

Moyzis, Robert K.; Meyne, Julianne

1991-01-01

A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family me This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).

Regulation of nucleic acid and protein synthesis: a background study related to the biological effects of radiation. Final report on research activities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

The existence of an intricate interplay of nucleic acids and nucleotides in the chain of events leading from free amino acid to completed polypeptide chain has been determined. To this was added another participant to the nucleotides in protein synthesis - diadenosine-5', 5''', p'p/sup 4/-tetraphosphate (Ap4A). Ap/sub 4/A serves as an initiation primer for DNA synthesis in a eukaryotic system catalyzed by DNA polymerase ..cap alpha... Thus the initial step in protein synthesis is linked to the first step in DNA synthesis by a small molecular weight, unique dinucleotide signal. Advances in the methodology of nucleic acid sequencing have mademore » it possible to examine the relationship between specific short segments of DNA and RNA and their function in the metabolism of the living cell. The triester method of synthesizing deoxynucleotide polymers has made it feasible to synthesize and use specific oligomeric deoxynucleotide sequences as probes of genetic function and potential viral inhibitors. The synthesis of ribonucleotide polymers has been more difficult, due almost entirely to the presence of the 2' ribosyl hydroxyl group. The possibility is now emerging, however, of employing ribonucleotide polymers as specific RNA-virus inhibitors.« less
Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

ERIC Educational Resources Information Center

Shah, Kushani; Thomas, Shelby; Stein, Arnold

2013-01-01

In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server

PubMed Central

Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

2012-01-01

The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
Genotyping by Sequencing Using Specific Allelic Capture to Build a High-Density Genetic Map of Durum Wheat

PubMed Central

Holtz, Yan; Ardisson, Morgane; Ranwez, Vincent; Besnard, Alban; Leroy, Philippe; Poux, Gérard; Roumet, Pierre; Viader, Véronique; Santoni, Sylvain; David, Jacques

2016-01-01

Targeted sequence capture is a promising technology which helps reduce costs for sequencing and genotyping numerous genomic regions in large sets of individuals. Bait sequences are designed to capture specific alleles previously discovered in parents or reference populations. We studied a set of 135 RILs originating from a cross between an emmer cultivar (Dic2) and a recent durum elite cultivar (Silur). Six thousand sequence baits were designed to target Dic2 vs. Silur polymorphisms discovered in a previous RNAseq study. These baits were exposed to genomic DNA of the RIL population. Eighty percent of the targeted SNPs were recovered, 65% of which were of high quality and coverage. The final high density genetic map consisted of more than 3,000 markers, whose genetic and physical mapping were consistent with those obtained with large arrays. PMID:27171472
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.

PubMed

Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie

2015-06-17

High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements display unique size signatures. We show how cell-free fragments occur in clusters along the genome, localizing to nucleosomal arrays and are preferentially cleaved at linker regions by correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site show a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.

DTIC Science & Technology

1992-05-01

TABLES xv LIST OF ABBREVIATIONS xvii 1.0 INTRODUCTION 1 2.0 DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0...Zehnder architecture. 3 Figure 3: Short representations of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5... DNA bases where each base is represented by 7-bits long pseudorandom sequences. 4 Table 2: Long representations of the DNA bases with 255-bits maximum
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

USDA-ARS?s Scientific Manuscript database

The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.

PubMed

Razvi, F; Gargiulo, G; Worcel, A

1983-08-01

Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestions of the circular DNA with the original enzyme and a second restriction enzyme cleavage near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)

PubMed Central

Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto

2017-01-01

Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of a same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At genomic level, the analysis of 1000s of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. In consistency with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Short, interspersed, and repetitive DNA sequences in Spiroplasma species.

PubMed

Nur, I; LeBlanc, D J; Tully, J G

1987-03-01

Small fragments of DNA from an 8-kbp plasmid, pRA1, from a plant pathogenic strain of Spiroplasma citri were shown previously to be present in the chromosomal DNA of at least two species of Spiroplasma. We describe here the shot-gun cloning of chromosomal DNA from S. citri Maroc and the identification of two distinct sequences exhibiting homology to pRA1. Further subcloning experiments provided specific molecular probes for the identification of these two sequences in chromosomal DNA from three distinct plant pathogenic species of Spiroplasma. The results of Southern blot hybridization indicated that each of the pRA1-associated sequences is present as multiple copies in short, dispersed, and repetitive sequences in the chromosomes of these three strains. None of the sequences was detectable in chromosomal DNA from an additional nine Spiroplasma strains examined.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

1998-03-01

Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-extensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting. abstract.
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

PubMed Central

2011-01-01

Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934
Direct non transcriptional role of NF-Y in DNA replication.

PubMed

Benatti, Paolo; Belluti, Silvia; Miotto, Benoit; Neusiedler, Julia; Dolfini, Diletta; Drac, Marjorie; Basile, Valentina; Schwob, Etienne; Mantovani, Roberto; Blow, J Julian; Imbriano, Carol

2016-04-01

NF-Y is a heterotrimeric transcription factor, which plays a pioneer role in the transcriptional control of promoters containing the CCAAT-box, among which genes involved in cell cycle regulation, apoptosis and DNA damage response. The knock-down of the sequence-specific subunit NF-YA triggers defects in S-phase progression, which lead to apoptotic cell death. Here, we report that NF-Y has a critical function in DNA replication progression, independent from its transcriptional activity. NF-YA colocalizes with early DNA replication factories, its depletion affects the loading of replisome proteins to DNA, among which Cdc45, and delays the passage from early to middle-late S phase. Molecular combing experiments are consistent with a role for NF-Y in the control of fork progression. Finally, we unambiguously demonstrate a direct non-transcriptional role of NF-Y in the overall efficiency of DNA replication, specifically in the DNA elongation process, using a Xenopus cell-free system. Our findings broaden the activity of NF-Y on a DNA metabolism other than transcription, supporting the existence of specific TFs required for proper and efficient DNA replication. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Detection of a new bat gammaherpesvirus in the Philippines.

PubMed

Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi

2009-08-01

A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.
The application of the high throughput sequencing technology in the transposable elements.

PubMed

Liu, Zhen; Xu, Jian-hong

2015-09-01

High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.

PubMed

Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi

2018-01-01

Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

PubMed

Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

2012-01-01

Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

PubMed Central

Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.

2013-01-01

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
Interactions of DNA binding proteins with G-Quadruplex structures at the single molecule level

NASA Astrophysics Data System (ADS)

Ray, Sujay

Guanine-rich nucleic acid (DNA/RNA) sequences can form non-canonical secondary structures, known as G-quadruplex (GQ). Numerous in vivo and in vitro studies have demonstrated formation of these structures in telomeric and non-telomeric regions of the genome. Telomeric GQs protect the chromosome ends whereas non-telomeric GQs either act as road blocks or recognition sites for DNA metabolic machinery. These observations suggest the significance of these structures in regulation of different metabolic processes, such as replication and repair. GQs are typically thermodynamically more stable than the corresponding Watson-Crick base pairing formed by G-rich and C-rich strands, making protein activity a crucial factor for their destabilization. Inside the cell, GQs interact with different proteins and their enzymatic activity is the determining factor for their stability. We studied interactions of several proteins with GQs to understand the underlying principles of protein-GQ interactions using single-molecule FRET and other biophysical techniques. Replication Protein-A (RPA), a single stranded DNA (ssDNA) binding protein, is known to posses GQ unfolding activity. First, we compared the thermal stability of three potentially GQ-forming DNA sequences (PQS) to their stability against RPA-mediated unfolding. One of these sequences is the human telomeric repeat and the other two, located in the promoter region of tyrosine hydroxylase gene, are highly heterogeneous sequences that better represent PQS in the genome. The thermal stability of these structures do not necessarily correlate with their stability against protein-mediated unfolding. We conclude that thermal stability is not necessarily an adequate criterion for predicting the physiological viability of GQ structures. To determine the critical structural factors that influence protein-GQ interactions we studied two groups of GQ structures that have systematically varying loop lengths and number of G-tetrad layers. We observed a linear increase in the steady-state stability of the GQ against RPA-mediated unfolding with increasing number of layers or decreasing loop length. The stability demonstrated by different GQ structures varied by at least three orders of magnitude. Finally, we studied another protein-GQ system where a protein complex works synergistically with a GQ to suppress DNA damage signals by preventing RPA to bind to telomeric DNA. Human telomeres that terminate with a single-stranded 3' G-overhang can be recognized as a DNA damage site by RPA. The protection of telomere-1 (POT1) and POT1-interacting protein (TPP1) heterodimer, binds specifically to telomeric DNA and protects it against RPA binding. Using model telomeric DNA, we studied the competition between POT1/TPP1 and RPA to access telomeric GQs in vitro. Under physiological salt and pH conditions, POT1/TPP1 stably load to a minimal DNA sequence adjacent to a folded GQ and unfolds the anti-parallel GQ as the parallel conformation remains folded. We showed that GQ formation of telomeres enhances the ability of POT1/TPP1 to block RPA's access to telomeres by two orders of magnitude and contributes to suppress DNA damage signals.
Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

PubMed Central

Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141

Capturing the biofuel wellhead and powerhouse: the chloroplast and mitochondrial genomes of the leguminous feedstock tree Pongamia pinnata.

PubMed

Kazakoff, Stephen H; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T; Gresshoff, Peter M

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites.
Dimensions and Global Twist of Single-Layer DNA Origami Measured by Small-Angle X-ray Scattering.

PubMed

Baker, Matthew A B; Tuckwell, Andrew J; Berengut, Jonathan F; Bath, Jonathan; Benn, Florence; Duff, Anthony P; Whitten, Andrew E; Dunn, Katherine E; Hynson, Robert M; Turberfield, Andrew J; Lee, Lawrence K

2018-06-04

The rational design of complementary DNA sequences can be used to create nanostructures that self-assemble with nanometer precision. DNA nanostructures have been imaged by atomic force microscopy and electron microscopy. Small-angle X-ray scattering (SAXS) provides complementary structural information on the ensemble-averaged state of DNA nanostructures in solution. Here we demonstrate that SAXS can distinguish between different single-layer DNA origami tiles that look identical when immobilized on a mica surface and imaged with atomic force microscopy. We use SAXS to quantify the magnitude of global twist of DNA origami tiles with different crossover periodicities: these measurements highlight the extreme structural sensitivity of single-layer origami to the location of strand crossovers. We also use SAXS to quantify the distance between pairs of gold nanoparticles tethered to specific locations on a DNA origami tile and use this method to measure the overall dimensions and geometry of the DNA nanostructure in solution. Finally, we use indirect Fourier methods, which have long been used for the interpretation of SAXS data from biomolecules, to measure the distance between DNA helix pairs in a DNA origami nanotube. Together, these results provide important methodological advances in the use of SAXS to analyze DNA nanostructures in solution and insights into the structures of single-layer DNA origami.
Biological nanopore MspA for DNA sequencing

NASA Astrophysics Data System (ADS)

Manrao, Elizabeth A.

Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
Characterization of constitutive and putative differentially expressed mRNAs by means of expressed sequence tags, differential display reverse transcriptase-PCR and randomly amplified polymorphic DNA-PCR from the sand fly vector Lutzomyia longipalpis.

PubMed

Ramalho-Ortigão, J M; Temporal, P; de Oliveira , S M; Barbosa, A F; Vilela, M L; Rangel, E F; Brazil, R P; Traub-Cseko, Y M

2001-01-01

Molecular studies of insect disease vectors are of paramount importance for understanding parasite-vector relationship. Advances in this area have led to important findings regarding changes in vectors' physiology upon blood feeding and parasite infection. Mechanisms for interfering with the vectorial capacity of insects responsible for the transmission of diseases such as malaria, Chagas disease and dengue fever are being devised with the ultimate goal of developing transgenic insects. A primary necessity for this goal is information on gene expression and control in the target insect. Our group is investigating molecular aspects of the interaction between Leishmania parasites and Lutzomyia sand flies. As an initial step in our studies we have used random sequencing of cDNA clones from two expression libraries made from head/thorax and abdomen of sugar fed L. longipalpis for the identification of expressed sequence tags (EST). We applied differential display reverse transcriptase-PCR and randomly amplified polymorphic DNA-PCR to characterize differentially expressed mRNA from sugar and blood fed insects, and, in one case, from a L. (V.) braziliensis-infected L. longipalpis. We identified 37 cDNAs that have shown homology to known sequences from GeneBank. Of these, 32 cDNAs code for constitutive proteins such as zinc finger protein, glutamine synthetase, G binding protein, ubiquitin conjugating enzyme. Three are putative differentially expressed cDNAs from blood fed and Leishmania-infected midgut, a chitinase, a V-ATPase and a MAP kinase. Finally, two sequences are homologous to Drosophila melanogaster gene products recently discovered through the Drosophila genome initiative.
Identification of antigenic regions on VP2 of African horsesickness virus serotype 3 by using phage-displayed epitope libraries.

PubMed

Bentley, L; Fehrsen, J; Jordaan, F; Huismans, H; du Plessis, D H

2000-04-01

VP2 is an outer capsid protein of African horsesickness virus (AHSV) and is recognized by serotype-discriminatory neutralizing antibodies. With the objective of locating its antigenic regions, a filamentous phage library was constructed that displayed peptides derived from the fragmentation of a cDNA copy of the gene encoding VP2. Peptides ranging in size from approximately 30 to 100 amino acids were fused with pIII, the attachment protein of the display vector, fUSE2. To ensure maximum diversity, the final library consisted of three sub-libraries. The first utilized enzymatically fragmented DNA encoding only the VP2 gene, the second included plasmid sequences, while the third included a PCR step designed to allow different peptide-encoding sequences to recombine before ligation into the vector. The resulting composite library was subjected to immunoaffinity selection with AHSV-specific polyclonal chicken IgY, polyclonal horse immunoglobulins and a monoclonal antibody (MAb) known to neutralize AHSV. Antigenic peptides were located by sequencing the DNA of phages bound by the antibodies. Most antigenic determinants capable of being mapped by this method were located in the N-terminal half of VP2. Important binding areas were mapped with high resolution by identifying the minimum overlapping areas of the selected peptides. The MAb was also used to screen a random 17-mer epitope library. Sequences that may be part of a discontinuous neutralization epitope were identified. The amino acid sequences of the antigenic regions on VP2 of serotype 3 were compared with corresponding regions on three other serotypes, revealing regions with the potential to discriminate AHSV serotypes serologically.
Effects of sequence on DNA wrapping around histones

NASA Astrophysics Data System (ADS)

Ortiz, Vanessa

2011-03-01

A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
A high-throughput and quantitative method to assess the mutagenic potential of translesion DNA synthesis

PubMed Central

Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai

2013-01-01

Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
An extended sequence specificity for UV-induced DNA damage.

PubMed

Chung, Long H; Murray, Vincent

2018-01-01

The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantages of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Multiplex analysis of DNA

DOEpatents

Church, George M.; Kieffer-Higgins, Stephen

1992-01-01

This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing

PubMed Central

Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph

2011-01-01

Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Measurement of pyrimidine (6-4) photoproducts in DNA by a mild acidic hydrolysis-HPLC fluorescence detection assay.

PubMed

Douki, T; Voituriez, L; Cadet, J

1995-03-01

Pyrimidine (6-4) pyrimidone photoproducts constitute one of the major classes of DNA lesions induced by far-UV irradiation. However, their biological role remains difficult to assess partly because of the lack of a specific and sensitive assay for monitoring their formation in DNA. Here is presented a measurement method based on the release of the (6-4) base adducts from DNA followed by an HPLC separation associated with a sensitive and specific fluorescence detection. The quantitative and mechanistic aspects of the chemical hydrolysis, based on the use of hydrogen fluoride stabilized in pyridine, were investigated, using dinucleoside monophosphate (6-4) photoproducts as model compounds. The final hydrolysis products were isolated and characterized by UV, fluorescence, mass, and 1H NMR spectroscopies. Application of the assay to far-UV irradiated calf thymus DNA provided information on the sequence effect on the rate of formation of three of the four possible bipyrimidine (6-4) photoproducts.
A DNA sequence analysis package for the IBM personal computer.

PubMed Central

Lagrimini, L M; Brentano, S T; Donelson, J E

1984-01-01

We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Genomic sequencing of Pleistocene cave bears

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noonan, James P.; Hofreiter, Michael; Smith, Doug

2005-04-01

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of {approx}1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome,more » the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.« less
Characterization of North American Armillaria species: Genetic relationships determined by ribosomal DNA sequences and AFLP markers

Treesearch

M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald

2006-01-01

Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...
Fractal landscape analysis of DNA walks

NASA Technical Reports Server (NTRS)

Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

1992-01-01

By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidences to support our findings.
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].

PubMed

Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong

2013-06-01

A new approach recently developed for detecting cytosine DNA methylation (mC) and analyzing the genome-scale DNA methylation profiling, is called BS-Seq which is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide an insight into the difference of genome-scale DNA methylation among different organisms, but also reveal the conservation of DNA methylation in all contexts and nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful to under-stand the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and maintaining silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps of DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are transferred to thymine (T) and adenine (A), and cytosine in reads is transferred to thymine, respectively. We also comprehensively review the main content of the DNA methylation analysis on the genomic scale: (1) the cytosine methylation under the context of different sequences; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for the nucleotides; (4) DNA- protein interaction sites of DNA methylation; (5) degree of methylation of cytosine in the different structural elements of genes. DNA methylation analysis technique provides a powerful tool for the epigenome study in human and other species, and genes and environment interaction, and founds the theoretical basis for further development of disease diagnostics and therapeutics in human.
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.

PubMed

Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo

2016-01-25

DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
DNA-based watermarks using the DNA-Crypt algorithm.

PubMed

Heider, Dominik; Barnekow, Angelika

2007-05-29

The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms.
DNA-based watermarks using the DNA-Crypt algorithm

PubMed Central

Heider, Dominik; Barnekow, Angelika

2007-01-01

Background The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms. PMID:17535434
Conserved Sequences at the Origin of Adenovirus DNA Replication

PubMed Central

Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

1982-01-01

The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. Images PMID:7143575

Hardware Acceleration Of Multi-Deme Genetic Algorithm for DNA Codeword Searching

DTIC Science & Technology

2008-01-01

C and G are complementary to each other. A Watson - Crick complement of a DNA sequence is another DNA sequence which replaces all the A with T or vise...versa and replaces all the T with A or vise versa, and also switches the 5’ and 3’ ends. A DNA sequence binds most stably with its Watson - Crick ...bind with 5 Watson - Crick pairs. The length of the longest complementary sequence between two flexible DNA strands, A and B, is the same as the
Molecular genetic and morphological analyses of the African wild dog (Lycaon pictus).

PubMed

Girman, D J; Kat, P W; Mills, M G; Ginsberg, J R; Borner, M; Wilson, V; Fanshawe, J H; Fitzgibbon, C; Lau, L M; Wayne, R K

1993-01-01

African wild dog populations have declined precipitously during the last 100 years in eastern Africa. The possible causes of this decline include a reduction in prey abundance and habitat; disease; and loss of genetic variability accompanied by inbreeding depression. We examined the levels of genetic variability and distinctiveness among populations of African wild dogs using mitochondrial DNA (mtDNA) restriction site and sequence analyses and multivariate analysis of cranial and dental measurements. Our results indicate that the genetic variability of eastern African wild dog populations is comparable to that of southern Africa and similar to levels of variability found in other large canids. Southern and eastern populations of wild dogs show about 1% divergence in mtDNA sequence and form two monophyletic assemblages containing three mtDNA genotypes each. No genotypes are shared between the two regions. With one exception, all wild dogs examined from zoos had southern African genotypes. Morphological analysis supports the distinction of eastern and southern African wild dog populations, and we suggest they should be considered separate subspecies. An eastern African wild dog breeding program should be initiated to ensure preservation of the eastern African form and to slow the loss of genetic variability that, while not yet apparent, will inevitably occur if wild populations continue to decline. Finally, we examined the phylogenetic relationships of wild dogs to other wolf-like canids through analysis of 736 base pairs (bp) of cytochrome b sequence and showed wild dogs to belong to a phylogenetically distinct lineage of the wolf-like canids.
Deep sequencing reveals distinct patterns of DNA methylation in prostate cancer.

PubMed

Kim, Jung H; Dhanasekaran, Saravana M; Prensner, John R; Cao, Xuhong; Robinson, Daniel; Kalyana-Sundaram, Shanker; Huang, Christina; Shankar, Sunita; Jing, Xiaojun; Iyer, Matthew; Hu, Ming; Sam, Lee; Grasso, Catherine; Maher, Christopher A; Palanisamy, Nallasivam; Mehra, Rohit; Kominsky, Hal D; Siddiqui, Javed; Yu, Jindan; Qin, Zhaohui S; Chinnaiyan, Arul M

2011-07-01

Beginning with precursor lesions, aberrant DNA methylation marks the entire spectrum of prostate cancer progression. We mapped the global DNA methylation patterns in select prostate tissues and cell lines using MethylPlex-next-generation sequencing (M-NGS). Hidden Markov model-based next-generation sequence analysis identified ∼68,000 methylated regions per sample. While global CpG island (CGI) methylation was not differential between benign adjacent and cancer samples, overall promoter CGI methylation significantly increased from ~12.6% in benign samples to 19.3% and 21.8% in localized and metastatic cancer tissues, respectively (P-value < 2 × 10(-16)). We found distinct patterns of promoter methylation around transcription start sites, where methylation occurred not only on the CGIs, but also on flanking regions and CGI sparse promoters. Among the 6691 methylated promoters in prostate tissues, 2481 differentially methylated regions (DMRs) are cancer-specific, including numerous novel DMRs. A novel cancer-specific DMR in the WFDC2 promoter showed frequent methylation in cancer (17/22 tissues, 6/6 cell lines), but not in the benign tissues (0/10) and normal PrEC cells. Integration of LNCaP DNA methylation and H3K4me3 data suggested an epigenetic mechanism for alternate transcription start site utilization, and these modifications segregated into distinct regions when present on the same promoter. Finally, we observed differences in repeat element methylation, particularly LINE-1, between ERG gene fusion-positive and -negative cancers, and we confirmed this observation using pyrosequencing on a tissue panel. This comprehensive methylome map will further our understanding of epigenetic regulation in prostate cancer progression.
Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes

NASA Astrophysics Data System (ADS)

Roxbury, Daniel

It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation solution, which suggested an energy-dependent pathway. Additionally, by means of pharmacological inhibition and vector-induced gene knockout studies, the DNA-SWCNTs were shown to enter the cells via Rac1-mediated macropinocytosis.
Flow cytometry for enrichment and titration in massively parallel DNA sequencing

PubMed Central

Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim

2009-01-01

Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

PubMed

Álvarez-Martos, Isabel; Ferapontova, Elena E

2017-08-05

A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57 nucleotides long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases for their DNA analogues is not able of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus, is not an aptamer and cannot be used neither for in vivo nor in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
Method for sequencing DNA base pairs

DOEpatents

Sessler, Andrew M.; Dawson, John

1993-01-01

The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.
Nuclear counterparts of the cytoplasmic mitochondrial 12S rRNA gene: a problem of ancient DNA and molecular phylogenies.

PubMed

van der Kuyl, A C; Kuiken, C L; Dekker, J T; Perizonius, W R; Goudsmit, J

1995-06-01

Monkey mummy bones and teeth originating from the North Saqqara Baboon Galleries (Egypt), soft tissue from a mummified baboon in a museum collection, and nineteenth/twentieth-century skin fragments from mangabeys were used for DNA extraction and PCR amplification of part of the mitochondrial 12S rRNA gene. Sequences aligning with the 12S rRNA gene were recovered but were only distantly related to contemporary monkey mitochondrial 12S rRNA sequences. However, many of these sequences were identical or closely related to human nuclear DNA sequences resembling mitochondrial 12S rRNA (isolated from a cell line depleted in mitochondria) and therefore have to be considered contamination. Subsequently in a separate study we were able to recover genuine mitochondrial 12S rRNA sequences from many extant species of nonhuman Old World primates and sequences closely resembling the human nuclear integrations. Analysis of all sequences by the neighbor-joining (NJ) method indicated that mitochondrial DNA sequences and their nuclear counterparts can be divided into two distinct clusters. One cluster contained all temporary cytoplasmic mitochondrial DNA sequences and approximately half of the monkey nuclear mitochondriallike sequences. A second cluster contained most human nuclear sequences and the other half of monkey nuclear sequences with a separate branch leading to human and gorilla mitochondrial and nuclear sequences. Sequences recovered from ancient materials were equally divided between the two clusters. These results constitute a warning for when working with ancient DNA or performing phylogenetic analysis using mitochondrial DNA as a target sequence: Nuclear counterparts of mitochondrial genes may lead to faulty interpretation of results.
Impact of alemtuzumab on HIV persistence in an HIV-infected individual on antiretroviral therapy with Sezary syndrome.

PubMed

Rasmussen, Thomas A; McMahon, James; Chang, J Judy; Symons, Jori; Roche, Michael; Dantanarayana, Ashanti; Okoye, Afam; Hiener, Bonnie; Palmer, Sarah; Lee, Wen Shi; Kent, Stephen J; Van Der Weyden, Carrie; Prince, H Miles; Cameron, Paul U; Lewin, Sharon R

2017-08-24

To study the effects of alemtuzumab on HIV persistence in an HIV-infected individual on antiretroviral therapy (ART) with Sezary syndrome, a rare malignancy of CD4 T cells. Case report. Blood was collected 30 and 18 months prior to presentation with Sezary syndrome, at the time of presentation and during alemtuzumab. T-cell subsets in malignant (CD7-CD26-TCR-VBeta2+) and nonmalignant cells were quantified by flow cytometry. HIV-DNA in total CD4 T cells, in sorted malignant and nonmalignant CD4 T cells, was quantified by PCR and clonal expansion of HIV-DNA assessed by full-length next-generation sequencing. HIV-hepatitis B virus coinfection was diagnosed and antiretroviral therapy initiated 4 years prior to presentation with Sezary syndrome and primary cutaneous anaplastic large cell lymphoma. The patient received alemtuzumab 10 mg three times per week for 4 weeks but died 6 weeks post alemtuzumab. HIV-DNA was detected in nonmalignant but not in malignant CD4 T cells, consistent with expansion of a noninfected CD4 T-cell clone. Full-length HIV-DNA sequencing demonstrated multiple defective viruses but no identical or expanded sequences. Alemtuzumab extensively depleted T cells, including more than 1 log reduction in total T cells and more than 3 log reduction in CD4 T cells. Finally, alemtuzumab decreased HIV-DNA in CD4 T cells by 57% but HIV-DNA remained detectable at low levels even after depletion of nearly all CD4 T cells. Alemtuzumab extensively depleted multiple T-cell subsets and decreased the frequency of but did not eliminate HIV-infected CD4 T cells. Studying the effects on HIV persistence following immune recovery in HIV-infected individuals who require alemtuzumab for malignancy or in animal studies may provide further insights into novel cure strategies.
Sequence independent amplification of DNA

DOEpatents

Bohlander, S.K.

1998-03-24

The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.
Sequence independent amplification of DNA

DOEpatents

Bohlander, Stefan K.

1998-01-01

The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.

PubMed

Baldock, Brandi L; Hutchison, James E

2016-12-20

DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.
Ancient DNA sequence revealed by error-correcting codes.

PubMed

Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

2015-07-10

A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Ancient DNA sequence revealed by error-correcting codes

PubMed Central

Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

2015-01-01

A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor

PubMed Central

Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.

2000-01-01

In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C.fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C.fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C.fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
Multiplex picoliter-droplet digital PCR for quantitative assessment of DNA integrity in clinical samples.

PubMed

Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie

2013-05-01

Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data

PubMed Central

Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min

2012-01-01

DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
Integrated sequencing of exome and mRNA of large-sized single cells.

PubMed

Wang, Lily Yan; Guo, Jiajie; Cao, Wei; Zhang, Meng; He, Jiankui; Li, Zhoufang

2018-01-10

Current approaches of single cell DNA-RNA integrated sequencing are difficult to call SNPs, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell. We then conducted exome-sequencing on the isolated nuclei and mRNA-sequencing on the enucleated cytoplasm. For single oocytes, exome-seq can cover up to 92% of exome region with an average sequencing depth of 10+, while mRNA-sequencing reveals more than 10,000 expressed genes in enucleated cytoplasm, with similar performance for intact oocytes. This approach provides unprecedented opportunities to study DNA-RNA regulation, such as RNA editing at single nucleotide level in oocytes. In future, this method can also be applied to other large cells, including neurons, large dendritic cells and large tumour cells for integrated exome and transcriptome sequencing.
A simple method for semi-random DNA amplicon fragmentation using the methylation-dependent restriction enzyme MspJI.

PubMed

Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W

2015-04-11

Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
Genomics dataset of unidentified disclosed isolates.

PubMed

Rekadwad, Bhagwan N

2016-09-01

Analysis of DNA sequences is necessary for higher hierarchical classification of the organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset is chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. The quick response codes were generated. AT/GC content of the DNA sequences analysis was carried out. The QR is helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage code and enzyme code studied under the restriction digestion study, which helpful for performing studies using short DNA sequences was reported. The dataset disclosed here is the new revelatory data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis.

Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment

PubMed Central

Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

2012-01-01

Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Low-Energy Electron-Induced Strand Breaks in Telomere-Derived DNA Sequences-Influence of DNA Sequence and Topology.

PubMed

Rackwitz, Jenny; Bald, Ilko

2018-03-26

During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy (<20 eV) electrons, which are able to damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG) 2 is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A) 2 confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT) 2 to 5'-(GGG ATT) 4 ), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K + ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Direct Nanoscale Conversion of Biomolecular Signals into Electronic Information

DTIC Science & Technology

2008-09-22

the electrode surface. In this experiment, the single free cysteine group featured in the GOx structure was exploited to demonstrate that orientation...first with GOx-ssDNA conjugates featuring a sequence complementary to the address strand, then with a non-complementary conjugate and finally with...fully-functional for an enzyme that features a free thiol group, or that can be engineered to incorporate a thiol onto its outer shell
Rapid, Multiplexed Microfluidic Phage Display

DTIC Science & Technology

2012-01-01

affinity phage- displayed peptides for multiple targets in just a single round and without the need for bacterial infection. The chip is shown to be able...by bacterial titer and amplification, and at least two additional rounds of selection. After the final round of biopan- ning, eluted phage are grown on...agar plates, and individual plaques are selected for DNA characterization to determine the amino acid sequence of the phage-displayed peptides. While
Improved multiple displacement amplification (iMDA) and ultraclean reagents.

PubMed

Motley, S Timothy; Picuri, John M; Crowder, Chris D; Minich, Jeremiah J; Hofstadler, Steven A; Eshoo, Mark W

2014-06-06

Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA). A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols were developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA was compared to that of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. The iMDA DNA and genomic DNA sequencing had comparable coverage 99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage while maintaining both balance and representation of the genome. The iMDA protocol in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

PubMed

Schneider, T D

2001-12-01

The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation ( approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
Molecular design of sequence specific DNA alkylating agents.

PubMed

Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi

2009-01-01

Sequence-specific DNA alkylating agents have great interest for novel approach to cancer chemotherapy. We designed the conjugates between pyrrole (Py)-imidazole (Im) polyamides and DNA alkylating chlorambucil moiety possessing at different positions. The sequence-specific DNA alkylation by conjugates was investigated by using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide chlorambucil conjugates alkylate DNA at flanking adenines in recognition sequences of Py-Im polyamides, however, the reactivities and alkylation sites were influenced by the positions of conjugation. In addition, we synthesized conjugate between Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). DNA alkylation reactivies by both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies would promote development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.
Phylogenetic analysis of higher-level relationships within Hydroidolina (Cnidaria: Hydrozoa) using mitochondrial genome data and insight into their mitochondrial transcription.

PubMed

Kayal, Ehsan; Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A; Lindsay, Dhugal J; Hopcroft, Russell R; Collins, Allen G

2015-01-01

Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III-IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I-II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms.
Phylogenetic analysis of higher-level relationships within Hydroidolina (Cnidaria: Hydrozoa) using mitochondrial genome data and insight into their mitochondrial transcription

PubMed Central

Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A.; Lindsay, Dhugal J.; Hopcroft, Russell R.; Collins, Allen G.

2015-01-01

Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III–IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I–II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms. PMID:26618080
Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.

PubMed

Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly

2016-11-01

Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacteria isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) and ~15% (28/187) of the isolates showed different genus identity and could not be identified using the phenotypic identification systems, respectively. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. The 16S rDNA sequencing is a useful method over the phenotypic identification systems for the identification of rare and difficult to identify bacteria species. The 16S rDNA sequencing method, however, does have limitation for species-level identification of some bacteria highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.
Enhancing the GABI-Kat Arabidopsis thaliana T-DNA Insertion Mutant Database by Incorporating Araport11 Annotation.

PubMed

Kleinboelting, Nils; Huep, Gunnar; Weisshaar, Bernd

2017-01-01

SimpleSearch provides access to a database containing information about T-DNA insertion lines of the GABI-Kat collection of Arabidopsis thaliana mutants. These mutants are an important tool for reverse genetics, and GABI-Kat is the second largest collection of such T-DNA insertion mutants. Insertion sites were deduced from flanking sequence tags (FSTs), and the database contains information about mutant plant lines as well as insertion alleles. Here, we describe improvements within the interface (available at http://www.gabi-kat.de/db/genehits.php) and with regard to the database content that have been realized in the last five years. These improvements include the integration of the Araport11 genome sequence annotation data containing the recently updated A. thaliana structural gene descriptions, an updated visualization component that displays groups of insertions with very similar insertion positions, mapped confirmation sequences, and primers. The visualization component provides a quick way to identify insertions of interest, and access to improved data about the exact structure of confirmed insertion alleles. In addition, the database content has been extended by incorporating additional insertion alleles that were detected during the confirmation process, as well as by adding new FSTs that have been produced during continued efforts to complement gaps in FST availability. Finally, the current database content regarding predicted and confirmed insertion alleles as well as primer sequences has been made available as downloadable flat files. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
DNA origami metallized site specifically to form electrically conductive nanowires.

PubMed

Pearson, Anthony C; Liu, Jianfei; Pound, Elisabeth; Uprety, Bibek; Woolley, Adam T; Davis, Robert C; Harb, John N

2012-09-06

DNA origami is a promising tool for use as a template in the design and fabrication of nanoscale structures. The ability to engineer selected staple strands on a DNA origami structure provides a high density of addressable locations across the structure. Here we report a method using site-specific attachment of gold nanoparticles to modified staple strands and subsequent metallization to fabricate conductive wires from DNA origami templates. We have modified DNA origami structures by lengthening each staple strand in select regions with a 10-base nucleotide sequence and have attached DNA-modified gold nanoparticles to the lengthened staple strands via complementary base-pairing. The high density of extended staple strands allowed the gold nanoparticles to pack tightly in the modified regions of the DNA origami, where the measured median gap size between neighboring particles was 4.1 nm. Gold metallization processes were optimized so that the attached gold nanoparticles grew until gaps between particles were filled and uniform continuous nanowires were formed. Finally, electron beam lithography was used to pattern electrodes in order to measure the electrical conductivity of metallized DNA origami, which showed an average resistance of 2.4 kΩ per metallized structure.
Studying Different Binding and Intracellular Delivery Efficiency of ssDNA Single-Walled Carbon Nanotubes and Their Effects on LC3-Related Autophagy in Renal Mesangial Cells via miRNA-382.

PubMed

Wang, Guobao; Zhao, Tingting; Wang, Leyu; Hu, Bianxiang; Darabi, Ali; Lin, Jiansheng; Xing, Malcolm M Q; Qiu, Xiaozhong

2015-11-25

Single-walled carbon nanotubes (SWCNTs) have been used to deliver single-stranded (ssDNA). ssDNA in oligonucleotide can act as an inhibitor of microRNA to regulate cellular functions. However, these ssDNA are difficult to bind carbon nanotubes with low transferring efficiency to cells. To this end, we designed ssDNA with regulatory and functional units to form ssDNA-SWCNT hybrids to study their binding effects and transferring efficiency. The functional unit on ssDNA mimics the inhibitor (MI) of miRNA-382, which plays a crucial role in the progress of many diseases such as renal interstitial fibrosis. After verification of overexpression of miRNA-382 in a coculture system, we designed oligonucleotide sequences (GCG)5-MI, (TAT)5-MI, and N23-MI as regulatory units added to the 5'-terminal end of the functional DNA fragment, respectively. These regulatory units lead to different secondary structures and thus exhibit different affinity ability to SWCNTs, and finally decide their deliver efficacy to cells. Autophagy, apoptosis and necrosis were observed in renal mesangial cells.
Mapping the Space of Genomic Signatures

PubMed Central

Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.

2015-01-01

We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
The number of reduced alignments between two DNA sequences

PubMed Central

2014-01-01

Background In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification Primary 92B05, 33C20, secondary 39A14, 65Q30 PMID:24684679
Novel numerical and graphical representation of DNA sequences and proteins.

PubMed

Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D

2006-12-01

We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.
Capillary electrophoresis of Big-Dye terminator sequencing reactions for human mtDNA Control Region haplotyping in the identification of human remains.

PubMed

Montesino, Marta; Prieto, Lourdes

2012-01-01

Cycle sequencing reaction with Big-Dye terminators provides the methodology to analyze mtDNA Control Region amplicons by means of capillary electrophoresis. DNA sequencing with ddNTPs or terminators was developed by (1). The progressive automation of the method by combining the use of fluorescent-dye terminators with cycle sequencing has made it possible to increase the sensibility and efficiency of the method and hence has allowed its introduction into the forensic field. PCR-generated mitochondrial DNA products are the templates for sequencing reactions. Different set of primers can be used to generate amplicons with different sizes according to the quality and quantity of the DNA extract providing sequence data for different ranges inside the Control Region.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity

NASA Astrophysics Data System (ADS)

Mukherjee, Shashi Bajaj; Sen, Pradip Kumar

2010-10-01

Studying periodic pattern is expected as a standard line of attack for recognizing DNA sequence in identification of gene and similar problems. But peculiarly very little significant work is done in this direction. This paper studies statistical properties of DNA sequences of complete genome using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings and standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
Effectiveness of a cloning and sequencing exercise on student learning with subsequent publication in the National Center for Biotechnology Information GenBank.

PubMed

Lau, Joann M; Robinson, David L

2009-01-01

With rapid advances in biotechnology and molecular biology, instructors are challenged to not only provide undergraduate students with hands-on experiences in these disciplines but also to engage them in the "real-world" scientific process. Two common topics covered in biotechnology or molecular biology courses are gene-cloning and bioinformatics, but to provide students with a continuous laboratory-based research experience in these techniques is difficult. To meet these challenges, we have partnered with Bio-Rad Laboratories in the development of the "Cloning and Sequencing Explorer Series," which combines wet-lab experiences (e.g., DNA extraction, polymerase chain reaction, ligation, transformation, and restriction digestion) with bioinformatics analysis (e.g., evaluation of DNA sequence quality, sequence editing, Basic Local Alignment Search Tool searches, contig construction, intron identification, and six-frame translation) to produce a sequence publishable in the National Center for Biotechnology Information GenBank. This 6- to 8-wk project-based exercise focuses on a pivotal gene of glycolysis (glyceraldehyde-3-phosphate dehydrogenase), in which students isolate, sequence, and characterize the gene from a plant species or cultivar not yet published in GenBank. Student achievement was evaluated using pre-, mid-, and final-test assessments, as well as with a survey to assess student perceptions. Student confidence with basic laboratory techniques and knowledge of bioinformatics tools were significantly increased upon completion of this hands-on exercise.
Signal sequence and keyword trap in silico for selection of full-length human cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries.

PubMed

Otsuki, Tetsuji; Ota, Toshio; Nishikawa, Tetsuo; Hayashi, Koji; Suzuki, Yutaka; Yamamoto, Jun-ichi; Wakamatsu, Ai; Kimura, Kouichi; Sakamoto, Katsuhiko; Hatano, Naoto; Kawai, Yuri; Ishii, Shizuko; Saito, Kaoru; Kojima, Shin-ichi; Sugiyama, Tomoyasu; Ono, Tetsuyoshi; Okano, Kazunori; Yoshikawa, Yoko; Aotsuka, Satoshi; Sasaki, Naokazu; Hattori, Atsushi; Okumura, Koji; Nagai, Keiichi; Sugano, Sumio; Isogai, Takao

2005-01-01

We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165(80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.

Sequencing of adenine in DNA by scanning tunneling microscopy

NASA Astrophysics Data System (ADS)

Tanaka, Hiroyuki; Taniguchi, Masateru

2017-08-01

The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

PubMed

Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

2013-09-24

Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.
Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments

PubMed Central

Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

2013-01-01

Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490
Crystal structure of MboIIA methyltransferase.

PubMed

Osipiuk, Jerzy; Walsh, Martin A; Joachimiak, Andrzej

2003-09-15

DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 A resolution the crystal structure of a beta-class DNA MTase MboIIA (M.MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M.MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules in the asymmetric unit which we propose to resemble the dimer when M.MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M.RsrI. However, the cofactor-binding pocket in M.MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.
Genomics dataset on unclassified published organism (patent US 7547531).

PubMed

Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier

2016-12-01

Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism. With the intention that, DNA sequence analysis is very crucial to learn about hierarchical classification of that particular organism. This dataset (patent US 7547531) is chosen to simplify all the complex raw data buried in undisclosed DNA sequences which help to open doors for new collaborations. In this data, a total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from NCBI BioSample database. Quick response (QR) code of those DNA sequences was constructed by DNA BarID tool. QR code is useful for the identification and comparison of isolates with other organisms. AT/GC content of the DNA sequences was determined using ENDMEMO GC Content Calculator, which indicates their stability at different temperature. The highest GC content was observed in GP445188 (62.5%) which was followed by GP445198 (61.8%) and GP445189 (59.44%), while lowest was in GP445178 (24.39%). In addition, New England BioLabs (NEB) database was used to identify cleavage code indicating the 5, 3 and blunt end and enzyme code indicating the methylation site of the DNA sequences was also shown. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
Fluorescent DNA-templated silver nanoclusters

NASA Astrophysics Data System (ADS)

Lin, Ruoqian

Because of the ultra-small size and biocompatibility of silver nanoclusters, they have attracted much research interest for their applications in biolabeling. Among the many ways of synthesizing silver nanoclusters, DNA templated method is particularly attractive---the high tunability of DNA sequences provides another degree of freedom for controlling the chemical and photophysical properties. However, systematic studies about how DNA sequences and concentrations are controlling the photophysical properties are still lacking. The aim of this thesis is to investigate the binding mechanisms of silver clusters binding and single stranded DNAs. Here in this thesis, we report synthesis and characterization of DNA-templated silver nanoclusters and provide a systematic interrogation of the effects of DNA concentrations and sequences, including lengths and secondary structures. We performed a series of syntheses utilizing five different sequences to explore the optimal synthesis condition. By characterizing samples with UV-vis and fluorescence spectroscopy, we achieved the most proper reactants ratio and synthesis conditions. Two of them were chosen for further concentration dependence studies and sequence dependence studies. We found that cytosine-rich sequences are more likely to produce silver nanoclusters with stronger fluorescence signals; however, sequences with hairpin secondary structures are more capable in stabilizing silver nanoclusters. In addition, the fluorescence peak emission intensities and wavelengths of the DNA templated silver clusters have sequence dependent fingerprints. This potentially can be applied to sequence sensing in the future. However all the current conclusions are not warranted; there is still difficulty in formulating general rules in DNA strand design and silver nanocluster production. Further investigation of more sequences could solve these questions in the future.
Cloning and sequence analysis of complementary DNA encoding an aberrantly rearranged human T-cell gamma chain.

PubMed Central

Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L

1986-01-01

Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. Images PMID:3458221
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

PubMed

Anderson, Olin D; Huo, Naxin; Gu, Yong Q

2013-06-01

The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers.

PubMed Central

Marck, C

1988-01-01

DNA Strider is a new integrated DNA and Protein sequence analysis program written with the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot- codon adaptation index according to Sharp and Li. The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can be therefore computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
Sequence-dependent DNA deformability studied using molecular dynamics simulations.

PubMed

Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

2007-01-01

Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: Characteristics of alternating current hopping conductivity in DNA sequences

NASA Astrophysics Data System (ADS)

Ma, Song-Shan; Xu, Hui; Wang, Huan-You; Guo, Rui

2009-08-01

This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of øac(ω) ~ ω2 ln2(1/ω). Also AC conductivity of DNA sequences increases with the increase of temperature, this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p >= 0.5, the conductivity increases with the increase of p.
Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

Treesearch

Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

2010-01-01

Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...
Species classifier choice is a key consideration when analysing low-complexity food microbiome data.

PubMed

Walsh, Aaron M; Crispie, Fiona; O'Sullivan, Orla; Finnegan, Laura; Claesson, Marcus J; Cotter, Paul D

2018-03-20

The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there was no significant differences between the compositional results generated by the different sequencers (p = 0.693, R 2 = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R 2 = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using SUPER-FOCUS were generally accordant between the platforms at different sequencing depths. Finally, and expectedly, metagenome assembly completeness was significantly lower on the MiSeq than either on the NextSeq (p = 0.03) or the Proton (p = 0.011), and it improved with increased sequencing depth. Our results demonstrate a remarkable similarity in the results generated by the three sequencing platforms at different sequencing depths, and, in fact, the choice of bioinformatics methodology had a more evident impact on results than the choice of sequencer did.
A high-throughput multiplex method adapted for GMO detection.

PubMed

Chaouachi, Maher; Chupeau, Gaëlle; Berard, Aurélie; McKhann, Heather; Romaniuk, Marcel; Giancola, Sandra; Laval, Valérie; Bertheau, Yves; Brunel, Dominique

2008-12-24

A high-throughput multiplex assay for the detection of genetically modified organisms (GMO) was developed on the basis of the existing SNPlex method designed for SNP genotyping. This SNPlex assay allows the simultaneous detection of up to 48 short DNA sequences (approximately 70 bp; "signature sequences") from taxa endogenous reference genes, from GMO constructions, screening targets, construct-specific, and event-specific targets, and finally from donor organisms. This assay avoids certain shortcomings of multiplex PCR-based methods already in widespread use for GMO detection. The assay demonstrated high specificity and sensitivity. The results suggest that this assay is reliable, flexible, and cost- and time-effective for high-throughput GMO detection.
Method for performing site-specific affinity fractionation for use in DNA sequencing

DOEpatents

Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.

1999-01-01

A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
Miniaturized reaction vessel system, method for performing site-specific biochemical reactions and affinity fractionation for use in DNA sequencing

DOEpatents

Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.

2000-01-01

A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
DNABIT Compress - Genome compression algorithm.

PubMed

Rajarajeswari, Pothuraju; Apparao, Allam

2011-01-22

Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress" for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non repetitive DNA sequence. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genome. Significantly better compression results show that "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (Genomes),our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve the best compression ratio as much as 1.58 bits/bases where the existing best methods could not achieve a ratio less than 1.72 bits/bases.
Method for performing site-specific affinity fractionation for use in DNA sequencing

DOEpatents

Mirzabekov, A.D.; Lysov, Y.P.; Dubley, S.A.

1999-05-18

A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between the cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting the extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to the extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from the array. 14 figs.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping

Treesearch

K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale

1998-01-01

DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Entire plastid phylogeny of the carrot genus (Daucus, Apiaceae):Concordance with nuclear data and mitochondrial and nuclear DNA insertions to the plastid

USDA-ARS?s Scientific Manuscript database

We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results to prior phylogenetic results using plastid, nuclear, and mitochondrial DNA sequences. We obtained, using Illumina sequencing, full plastid sequences of 37 accessions of 20 Daucus taxa and outgrou...

Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

USDA-ARS?s Scientific Manuscript database

A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively.

PubMed

Clifford, Jacob; Adami, Christoph

2015-09-02

Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer

PubMed Central

Johnson, Sarah S.; Zaikova, Elena; Goerlitz, David S.; Bai, Yu; Tighe, Scott W.

2017-01-01

The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions. PMID:28337073
Multiplexed Sequence Encoding: A Framework for DNA Communication.

PubMed

Zakeri, Bijan; Carr, Peter A; Lu, Timothy K

2016-01-01

Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication-data encoding, data transfer & data extraction-and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system-Multiplexed Sequence Encoding (MuSE)-that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed

Horn, T; Chang, C A; Urdea, M S

1997-12-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed Central

Horn, T; Chang, C A; Urdea, M S

1997-01-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

PubMed Central

Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

2008-01-01

Background DNA sequences from ancient speciments may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960
High-Throughput Block Optical DNA Sequence Identification.

PubMed

Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

2018-01-01

Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.

PubMed

Ares, Manuel

2014-10-01

This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
Spiroplasma species share common DNA sequences among their viruses, plasmids and genomes.

PubMed

Ranhand, J M; Nur, I; Rose, D L; Tully, J G

1987-01-01

Alkaline-Southern-blot analyses showed that a spiroplasma plasmid, pRA1, obtained from Spiroplasma citri (Maroc-R8A2), contained DNA sequences that were homologous to spiroplasma type 3 viruses (SV3) obtained from S. citri (Maroc-R8A2), S. citri (608) and S. mirum (SMCA). In addition, pRA1 and SV3(608) DNA shared common, but not necessarily related, sequences with extrachromosomal DNA derived from 11 Spiroplasma species or strains. Furthermore, SV3(608) had DNA homology with the chromosome from 6 distinct spiroplasmas but not with chromosomal DNA from eight other Spiroplasma species or strains. The biological function of these common sequences is unknown.
Flow cytometric detection method for DNA samples

DOEpatents

Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX

2011-07-05

Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM.TM. on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA.TM., on the 5' end.
Flow cytometric detection method for DNA samples

DOEpatents

Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA

2006-08-01

Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
Method for sequencing DNA base pairs

DOEpatents

Sessler, A.M.; Dawson, J.

1993-12-14

The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.
G-quadruplex and G-rich sequence stimulate Pif1p-catalyzed downstream duplex DNA unwinding through reducing waiting time at ss/dsDNA junction

PubMed Central

Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang

2016-01-01

Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032
A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences

NASA Technical Reports Server (NTRS)

Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.

1986-01-01

The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Recognition of platinum-DNA adducts by HMGB1a.

PubMed

Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V

2012-09-25

Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
Water-Soluble Conjugated Polymers: Self-Assembly and Biosensor Applications

NASA Astrophysics Data System (ADS)

Bazan, Guillermo

2005-03-01

Homogeneous assays can be designed which take advantage of the optical amplification of conjugated polymers and the self-assembly characteristic of aqueous polyelectrolytes. For example, a ssDNA sequence sensor comprises an aqueous solution containing a cationic water soluble conjugated polymer such as poly(9,9-bis(trimethylammonium)-hexyl)-fluorene phenylene) with a peptide nucleic acid (PNA) labeled with a dye (PNA-C*). Signal transduction is controlled by hybridization of the neutral PNA-C* probe and the negative ssDNA target, resulting in favorable electrostatic interactions between the hybrid complex and the cationic polymer. Distance requirements for Förster energy transfer are thus met only when ssDNA of complementary sequence to the PNA-C* probe is present. Signal amplification by the conjugated polymer provides fluorescein emission >25 times higher than that of the directly excited dye. Transduction by electrostatic interactions followed by energy transfer is a general strategy. Examples involving other biomolecular recognition events, such as DNA/DNA, RNA/protein and RNA/RNA, will also be provided. The mechanism of biosensing will be discussed, with special attention to the varying contributions of hydrophobic and electrostatic forces, polymer conformation, charge density, local concentration of C*s and tailored defect sites for aggregation-induced optical changes. Finally, the water solubility of these conjugated polymers opens possibilities for spin casting onto organic materials, without dissolving the underlying layers. This property is useful for fabricating multilayer organic optoelectronic devices by simple solution techniques.
A simplified protocol for molecular identification of Eimeria species in field samples.

PubMed

Haug, Anita; Thebo, Per; Mattsson, Jens G

2007-05-15

This study aimed to find a fast, sensitive and efficient protocol for molecular identification of chicken Eimeria spp. in field samples. Various methods for each of the three steps of the protocol were evaluated: oocyst wall rupturing methods, DNA extraction methods, and identification of species-specific DNA sequences by PCR. We then compared and evaluated five complete protocols. Three series of oocyst suspensions of known number of oocysts from Eimeria mitis, Eimeria praecox, Eimeria maxima and Eimeria tenella were prepared and ground using glass beads or mini-pestle. DNA was extracted from ruptured oocysts using commercial systems (GeneReleaser, Qiagen Stoolkit and Prepman) or phenol-chloroform DNA extraction, followed by identification of species-specific ITS-1 sequences by optimised single species PCR assays. The Stoolkit and Prepman protocols showed insufficient repeatability, and the former was also expensive and relatively time-consuming. In contrast, both the GeneReleaser protocol and phenol-chloroform protocols were robust and sensitive, detecting less than 0.4 oocysts of each species per PCR. Finally, we evaluated our new protocol on 68 coccidia positive field samples. Our data suggests that rupturing the oocysts by mini-pestle grinding, preparing the DNA with GeneReleaser, followed by optimised single species PCR assays, makes a robust and sensitive procedure for identifying chicken Eimeria species in field samples. Importantly, it also provides minimal hands-on-time in the pre-PCR process, lower contamination risk and no handling of toxic chemicals.
Phosphorylation and cellular function of the human Rpa2 N-terminus in the budding yeast Saccharomyces cerevisiae.

PubMed

Ghospurkar, Padmaja L; Wilson, Timothy M; Liu, Shengqin; Herauf, Anna; Steffes, Jenna; Mueller, Erica N; Oakley, Gregory G; Haring, Stuart J

2015-02-01

Maintenance of genome integrity is critical for proper cell growth. This occurs through accurate DNA replication and repair of DNA lesions. A key factor involved in both DNA replication and the DNA damage response is the heterotrimeric single-stranded DNA (ssDNA) binding complex Replication Protein A (RPA). Although the RPA complex appears to be structurally conserved throughout eukaryotes, the primary amino acid sequence of each subunit can vary considerably. Examination of sequence differences along with the functional interchangeability of orthologous RPA subunits or regions could provide insight into important regions and their functions. This might also allow for study in simpler systems. We determined that substitution of yeast Replication Factor A (RFA) with human RPA does not support yeast cell viability. Exchange of a single yeast RFA subunit with the corresponding human RPA subunit does not function due to lack of inter-species subunit interactions. Substitution of yeast Rfa2 with domains/regions of human Rpa2 important for Rpa2 function (i.e., the N-terminus and the loop 3-4 region) supports viability in yeast cells, and hybrid proteins containing human Rpa2 N-terminal phospho-mutations result in similar DNA damage phenotypes to analogous yeast Rfa2 N-terminal phospho-mutants. Finally, the human Rpa2 N-terminus (NT) fused to yeast Rfa2 is phosphorylated in a manner similar to human Rpa2 in human cells, indicating that conserved kinases recognize the human domain in yeast. The implication is that budding yeast represents a potential model system for studying not only human Rpa2 N-terminal phosphorylation, but also phosphorylation of Rpa2 N-termini from other eukaryotic organisms. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.

PubMed Central

Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A

1982-01-01

Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. Images PMID:6281459

Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.

PubMed

Sharma, S; Raina, S N

2005-01-01

A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes is reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.
First Molecular Identification and Phylogeny of Moroccan Anopheles sergentii (Diptera: Culicidae) Based on Second Internal Transcribed Spencer (ITS2) and Cytochrome c Oxidase I (COI) Sequences.

PubMed

Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed

2018-06-05

Anopheles sergentii known as the "oasis vector" or the "desert malaria vector" is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii of other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed closer evolutionary relationship between the Moroccan and Egyptian strains than the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Does TATA matter? A structural exploration of the selectivity determinants in its complexes with TATA box-binding protein.

PubMed Central

Pastor, N; Pardo, L; Weinstein, H

1997-01-01

The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. Images FIGURE 2 PMID:9251783
Competition between B-Z and B-L transitions in a single DNA molecule: Computational studies

NASA Astrophysics Data System (ADS)

Kwon, Ah-Young; Nam, Gi-Moon; Johner, Albert; Kim, Seyong; Hong, Seok-Cheol; Lee, Nam-Kyung

2016-02-01

Under negative torsion, DNA adopts left-handed helical forms, such as Z-DNA and L-DNA. Using the random copolymer model developed for a wormlike chain, we represent a single DNA molecule with structural heterogeneity as a helical chain consisting of monomers which can be characterized by different helical senses and pitches. By Monte Carlo simulation, where we take into account bending and twist fluctuations explicitly, we study sequence dependence of B-Z transitions under torsional stress and tension focusing on the interaction with B-L transitions. We consider core sequences, (GC) n repeats or (TG) n repeats, which can interconvert between the right-handed B form and the left-handed Z form, imbedded in a random sequence, which can convert to left-handed L form with different (tension dependent) helical pitch. We show that Z-DNA formation from the (GC) n sequence is always supported by unwinding torsional stress but Z-DNA formation from the (TG) n sequence, which are more costly to convert but numerous, can be strongly influenced by the quenched disorder in the surrounding random sequence.
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth

PubMed Central

Glocke, Isabelle; Meyer, Matthias

2017-01-01

The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Mitochondrial DNA copy number is regulated in a tissue specific manner by DNA methylation of the nuclear-encoded DNA polymerase gamma A

PubMed Central

Kelly, Richard D. W.; Mahmud, Arsalan; McKenzie, Matthew; Trounce, Ian A.; St John, Justin C.

2012-01-01

DNA methylation is an essential mechanism controlling gene expression during differentiation and development. We investigated the epigenetic regulation of the nuclear-encoded, mitochondrial DNA (mtDNA) polymerase γ catalytic subunit (PolgA) by examining the methylation status of a CpG island within exon 2 of PolgA. Bisulphite sequencing identified low methylation levels (<10%) within exon 2 of mouse oocytes, blastocysts and embryonic stem cells (ESCs), while somatic tissues contained significantly higher levels (>40%). In contrast, induced pluripotent stem (iPS) cells and somatic nuclear transfer ESCs were hypermethylated (>20%), indicating abnormal epigenetic reprogramming. Real time PCR analysis of 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) immunoprecipitated DNA suggests active DNA methylation and demethylation within exon 2 of PolgA. Moreover, neural differentiation of ESCs promoted de novo methylation and demethylation at the exon 2 locus. Regression analysis demonstrates that cell-specific PolgA expression levels were negatively correlated with DNA methylation within exon 2 and mtDNA copy number. Finally, using chromatin immunoprecipitation (ChIP) against RNA polymerase II (RNApII) phosphorylated on serine 2, we show increased DNA methylation levels are associated with reduced RNApII transcriptional elongation. This is the first study linking nuclear DNA epigenetic regulation with mtDNA regulation during differentiation and cell specialization. PMID:22941637
Complementary DNA cloning and molecular evolution of opine dehydrogenases in some marine invertebrates.

PubMed

Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K

2004-01-01

The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequence of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
Position specific variation in the rate of evolution in transcription factor binding sites

PubMed Central

Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

2003-01-01

Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
Chromatin accessibility prediction via a hybrid deep convolutional neural network.

PubMed

Liu, Qiao; Xia, Fei; Yin, Qijin; Jiang, Rui

2018-03-01

A majority of known genetic variants associated with human-inherited diseases lie in non-coding regions that lack adequate interpretation, making it indispensable to systematically discover functional sites at the whole genome level and precisely decipher their implications in a comprehensive manner. Although computational approaches have been complementing high-throughput biological experiments towards the annotation of the human genome, it still remains a big challenge to accurately annotate regulatory elements in the context of a specific cell type via automatic learning of the DNA sequence code from large-scale sequencing data. Indeed, the development of an accurate and interpretable model to learn the DNA sequence signature and further enable the identification of causative genetic variants has become essential in both genomic and genetic studies. We proposed Deopen, a hybrid framework mainly based on a deep convolutional neural network, to automatically learn the regulatory code of DNA sequences and predict chromatin accessibility. In a series of comparison with existing methods, we show the superior performance of our model in not only the classification of accessible regions against background sequences sampled at random, but also the regression of DNase-seq signals. Besides, we further visualize the convolutional kernels and show the match of identified sequence signatures and known motifs. We finally demonstrate the sensitivity of our model in finding causative noncoding variants in the analysis of a breast cancer dataset. We expect to see wide applications of Deopen with either public or in-house chromatin accessibility data in the annotation of the human genome and the identification of non-coding variants associated with diseases. Deopen is freely available at https://github.com/kimmo1019/Deopen. ruijiang@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Fusion of GFP to the M.EcoKI DNA methyltransferase produces a new probe of Type I DNA restriction and modification enzymes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Kai; Roberts, Gareth A.; Stephanou, Augoustinos S.

2010-07-23

Research highlights: {yields} Successful fusion of GFP to M.EcoKI DNA methyltransferase. {yields} GFP located at C-terminal of sequence specificity subunit does not later enzyme activity. {yields} FRET confirms structural model of M.EcoKI bound to DNA. -- Abstract: We describe the fusion of enhanced green fluorescent protein to the C-terminus of the HsdS DNA sequence-specificity subunit of the Type I DNA modification methyltransferase M.EcoKI. The fusion expresses well in vivo and assembles with the two HsdM modification subunits. The fusion protein functions as a sequence-specific DNA methyltransferase protecting DNA against digestion by the EcoKI restriction endonuclease. The purified enzyme shows Foerstermore » resonance energy transfer to fluorescently-labelled DNA duplexes containing the target sequence and to fluorescently-labelled ocr protein, a DNA mimic that binds to the M.EcoKI enzyme. Distances determined from the energy transfer experiments corroborate the structural model of M.EcoKI.« less
In silico evidence for sequence-dependent nucleosome sliding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lequieu, Joshua; Schwartz, David C.; de Pablo, Juan J.

Nucleosomes represent the basic building block of chromatin and provide an important mechanism by which cellular processes are controlled. The locations of nucleosomes across the genome are not random but instead depend on both the underlying DNA sequence and the dynamic action of other proteins within the nucleus. These processes are central to cellular function, and the molecular details of the interplay between DNA sequence and nudeosome dynamics remain poorly understood. In this work, we investigate this interplay in detail by relying on a molecular model, which permits development of a comprehensive picture of the underlying free energy surfaces andmore » the corresponding dynamics of nudeosome repositioning. The mechanism of nudeosome repositioning is shown to be strongly linked to DNA sequence and directly related to the binding energy of a given DNA sequence to the histone core. It is also demonstrated that chromatin remodelers can override DNA-sequence preferences by exerting torque, and the histone H4 tail is then identified as a key component by which DNA-sequence, histone modifications, and chromatin remodelers could in fact be coupled.« less
Dual signal amplification for highly sensitive electrochemical detection of uropathogens via enzyme-based catalytic target recycling.

PubMed

Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun

2011-11-15

We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA target. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in the use of each target DNA sequence for multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and the single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow level of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.
Relations between Shannon entropy and genome order index in segmenting DNA sequences.

PubMed

Zhang, Yi

2009-04-01

Shannon entropy H and genome order index S are used in segmenting DNA sequences. Zhang [Phys. Rev. E 72, 041917 (2005)] found that the two schemes are equivalent when a DNA sequence is converted to a binary sequence of S (strong H bond) and W (weak H bond). They left the mathematical proof to mathematicians who are interested in this issue. In this paper, a possible mathematical explanation is given. Moreover, we find that Chargaff parity rule 2 is the necessary condition of the equivalence, and the equivalence disappears when a DNA sequence is regarded as a four-symbol sequence. At last, we propose that S-2(-H) may be related to species evolution.
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization

DOE PAGES

Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.; ...

2017-08-29

The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.

The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
mtDNAmanager: a Web-based tool for the management and quality analysis of mitochondrial DNA control-region sequences

PubMed Central

Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin

2008-01-01

Background For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

PubMed

Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

2015-04-15

In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.

PubMed

Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru

2015-01-01

The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
cgDNAweb: a web interface to the cgDNA sequence-dependent coarse-grain model of double-stranded DNA.

PubMed

De Bruin, Lennart; Maddocks, John H

2018-06-14

The sequence-dependent statistical mechanical properties of fragments of double-stranded DNA is believed to be pertinent to its biological function at length scales from a few base pairs (or bp) to a few hundreds of bp, e.g. indirect read-out protein binding sites, nucleosome positioning sequences, phased A-tracts, etc. In turn, the equilibrium statistical mechanics behaviour of DNA depends upon its ground state configuration, or minimum free energy shape, as well as on its fluctuations as governed by its stiffness (in an appropriate sense). We here present cgDNAweb, which provides browser-based interactive visualization of the sequence-dependent ground states of double-stranded DNA molecules, as predicted by the underlying cgDNA coarse-grain rigid-base model of fragments with arbitrary sequence. The cgDNAweb interface is specifically designed to facilitate comparison between ground state shapes of different sequences. The server is freely available at cgDNAweb.epfl.ch with no login requirement.
First Complete Squash leaf curl China virus Genomic Segment DNA-A Sequence from East Timor

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2017-01-01

ABSTRACT We present here the first complete Squash leaf curl China virus (SLCCV) genomic segment DNA-A sequence from East Timor. It was isolated from a pumpkin plant. When compared with 15 complete SLCCV DNA-A genome sequences from other world regions, it most resembled the Malaysian isolate MC1 sequence. PMID:28619789

Some links on this page may take you to non-federal websites. Their policies may differ from this site.