QPSO-Based Adaptive DNA Computing Algorithm
Karakose, Mehmet; Cigdem, Ugur
2013-01-01
DNA (deoxyribonucleic acid) computing, a new computation model that uses DNA molecules for information storage, has been increasingly used for optimization and data analysis in recent years. However, the DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improving DNA computing is proposed. The approach runs the DNA computing algorithm with parameters adapted towards the desired goal using quantum-behaved particle swarm optimization (QPSO). The contributions of the proposed QPSO-based adaptive DNA computing algorithm are as follows: (1) the population size, crossover rate, maximum number of operations, enzyme and virus mutation rates, and fitness function of the DNA computing algorithm are tuned simultaneously in the adaptive process; (2) the adaptive algorithm is driven by QPSO for goal-driven progress, faster operation, and flexibility in data; and (3) a numerical realization of the DNA computing algorithm with the proposed approach is implemented for system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate effective optimization, considerable convergence speed, and high accuracy compared with the DNA computing algorithm. PMID:23935409
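The abstract does not give the QPSO update it uses, so the following is only a minimal sketch of the commonly cited QPSO position update, applied to a toy objective rather than to DNA computing parameters. The `sphere` objective, the function name `qpso`, and all default values (`beta`, `bound`, swarm size) are assumptions made for illustration, not the authors' settings.

```python
import math
import random

def sphere(x):
    """Toy objective (an assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def qpso(objective, dim=2, n_particles=20, iters=100, beta=0.75, bound=5.0, seed=0):
    """Minimise `objective` with a basic QPSO loop; returns (best_x, best_f, history)."""
    rng = random.Random(seed)
    # Initialise positions uniformly in [-bound, bound]^dim.
    xs = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [objective(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(iters):
        # Mean of all personal best positions ("mbest").
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = rng.random()
                u = 1.0 - rng.random()  # lies in (0, 1], so log(1/u) is finite
                # Local attractor between the personal and global bests.
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                step = beta * abs(mbest[d] - xs[i][d]) * math.log(1.0 / u)
                xs[i][d] = attractor + step if rng.random() < 0.5 else attractor - step
            f = objective(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
        history.append(gbest_f)
    return gbest, gbest_f, history
```

In an adaptive scheme like the one described, each particle would presumably hold a candidate parameter set (population size, crossover rate, and so on) and `objective` would score a run of the DNA computing algorithm with those parameters; that coupling is not shown here.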
Fan, Daoqing; Zhu, Xiaoqing; Dong, Shaojun; Wang, Erkang
2017-07-05
DNA is believed to be a promising candidate for molecular logic computation, and the fluorogenic/colorimetric substrates of G-quadruplex DNAzyme (G4zyme) are broadly used as label-free output reporters of DNA logic circuits. Herein, for the first time, tyramine-HCl (a fluorogenic substrate of G4zyme) is applied to DNA logic computation and a series of label-free DNA-input logic gates, including elementary AND, OR, and INHIBIT logic gates, as well as a two to one encoder, are constructed. Furthermore, a DNA caliper that can measure the base number of target DNA as low as three bases is also fabricated. This DNA caliper can also perform concatenated AND-AND logic computation to fulfil the requirements of sophisticated logic computing. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Reversible Data Hiding Based on DNA Computing
Xie, Yingjie
2017-01-01
Biocomputing, especially DNA computing, has developed rapidly and is widely used in information security. In this paper, a novel algorithm for reversible data hiding based on DNA computing is proposed. Inspired by histogram modification, a classical algorithm for reversible data hiding, we combine it with DNA computing to realize this algorithm with biological technology. Compared with previous results, our experimental results significantly improve the ER (embedding rate). Furthermore, the PSNR (peak signal-to-noise ratio) of some test images is also improved. Experimental results show that the method is suitable for protecting the copyright of the cover image in DNA-based information security. PMID:28280504
High performance transcription factor-DNA docking with GPU computing
2012-01-01
Background: Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is computationally very demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods: In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results: The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions: We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. 
The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
Computational Approaches to Nucleic Acid Origami.
Jabbari, Hosna; Aminpour, Maral; Montemagno, Carlo
2015-10-12
Recent advances in experimental DNA origami have dramatically expanded the horizon of DNA nanotechnology. Complex 3D suprastructures have been designed and developed using DNA origami with applications in biomaterial science, nanomedicine, nanorobotics, and molecular computation. Ribonucleic acid (RNA) origami has recently been realized as a new approach. Similar to DNA, RNA molecules can be designed to form complex 3D structures through complementary base pairings. RNA origami structures are, however, more compact and more thermodynamically stable due to RNA's non-canonical base pairing and tertiary interactions. Despite all these advantages, the development of RNA origami lags far behind that of DNA origami. Furthermore, although computational methods have proven to be effective in designing DNA and RNA origami structures and in their evaluation, advances in computational nucleic acid origami are even more limited. In this paper, we review major milestones in experimental and computational DNA and RNA origami and present current challenges in these fields. We believe collaboration between experimental nanotechnologists and computer scientists is critical for advancing these new research paradigms.
Solving satisfiability problems using a novel microarray-based DNA computer.
Lin, Che-Hsin; Cheng, Hsiao-Ping; Yang, Chang-Biau; Yang, Chia-Ning
2007-01-01
An algorithm based on a modified sticker model, combined with an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Conventional DNA computing algorithms need an initial data pool that covers both correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones. In contrast, we built solutions in parts, satisfying one clause per step, and eventually solved the entire Boolean formula step by step. No time-consuming sample preparation procedures or delicate sample-application equipment were required for the computing process. Moreover, experimental results show that the bound DNA sequences can withstand the chemical solutions during the computing processes, so the proposed method should be useful in dealing with large-scale problems.
Investigation of a Sybr-Green-Based Method to Validate DNA Sequences for DNA Computing
2005-05-01
Author(s): Wendy Pogozelski, Salvatore Priore, Matthew Bernard ...simulated annealing. Biochemistry, 35, 14077-14089. 15 Pogozelski, W.K., Bernard, M.P. and Macula, A. (2004) DNA code validation using...and Clark, B.F.C. (eds) In RNA Biochemistry and Biotechnology, NATO ASI Series, Kluwer Academic Publishers. Zucker, M. and Stiegler, P. (1981
Research on Image Encryption Based on DNA Sequence and Chaos Theory
NASA Astrophysics Data System (ADS)
Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin
2018-04-01
Encryption is now a common technique for protecting image data from unauthorized access. In recent years, many researchers have proposed encryption algorithms based on DNA sequences, providing new ideas for the design of image encryption algorithms. In this paper, a new method of image encryption based on DNA computing technology is proposed, in which the original image is encrypted by DNA coding and 1-D logistic chaotic mapping. First, the algorithm uses two modules as the encryption key: the first module uses a real DNA sequence, and the second is generated by one-dimensional logistic chaotic mapping. Second, the algorithm uses DNA complementary rules to encode the original image, and uses the key and DNA computing technology to compute each pixel value of the original image, so as to encrypt the whole image. Simulation results show that the algorithm has a good encryption effect and good security.
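To make the two ingredients named in this abstract concrete, here is a minimal sketch that combines a 2-bit DNA coding rule with a keystream drawn from the 1-D logistic map. It is a simplified illustration under stated assumptions, not the authors' scheme: the coding rule (00→A, 01→C, 10→G, 11→T), the XOR combination, the parameters `x0` and `r`, and every function name are hypothetical, and the real-DNA-sequence key module is omitted.

```python
BASES = "ACGT"  # assumed coding rule: 00->A, 01->C, 10->G, 11->T

def bytes_to_dna(data):
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))

def dna_to_bytes(seq):
    """Inverse of bytes_to_dna; len(seq) must be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        v = 0
        for ch in seq[i:i + 4]:
            v = (v << 2) | BASES.index(ch)
        out.append(v)
    return bytes(out)

def logistic_keystream(n, x0=0.3141, r=3.99, burn_in=100):
    """Generate n key bytes by iterating x <- r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1 - x)
    ks = bytearray()
    for _ in range(n):
        x = r * x * (1 - x)
        ks.append(int(x * 256) % 256)
    return bytes(ks)

def encrypt_to_dna(pixels, x0=0.3141, r=3.99):
    """XOR pixel bytes with the chaotic keystream and return a DNA string."""
    ks = logistic_keystream(len(pixels), x0, r)
    return bytes_to_dna(bytes(p ^ k for p, k in zip(pixels, ks)))

def decrypt_from_dna(seq, x0=0.3141, r=3.99):
    """Reverse encrypt_to_dna using the same (x0, r) key."""
    data = dna_to_bytes(seq)
    ks = logistic_keystream(len(data), x0, r)
    return bytes(c ^ k for c, k in zip(data, ks))
```

With this particular 2-bit rule, XOR on bytes is equivalent to a base-wise XOR on the encoded strands, which is why the sketch can apply the key before encoding.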
Simultaneous G-Quadruplex DNA Logic.
Bader, Antoine; Cockroft, Scott L
2018-04-03
A fundamental principle of digital computer operation is Boolean logic, where inputs and outputs are described by binary integer voltages. Similarly, inputs and outputs may be processed on the molecular level as exemplified by synthetic circuits that exploit the programmability of DNA base-pairing. Unlike modern computers, which execute large numbers of logic gates in parallel, most implementations of molecular logic have been limited to single computing tasks, or sensing applications. This work reports three G-quadruplex-based logic gates that operate simultaneously in a single reaction vessel. The gates respond to unique Boolean DNA inputs by undergoing topological conversion from duplex to G-quadruplex states that were resolved using a thioflavin T dye and gel electrophoresis. The modular, addressable, and label-free approach could be incorporated into DNA-based sensors, or used for resolving and debugging parallel processes in DNA computing applications. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Vandersall, Jennifer A.; Gardner, Shea N.; Clague, David S.
2010-05-04
A computational method and computer-based system of modeling DNA synthesis for the design and interpretation of PCR amplification, parallel DNA synthesis, and microarray chip analysis. The method and system include modules that address the bioinformatics, kinetics, and thermodynamics of DNA amplification and synthesis. Specifically, the steps of DNA selection, as well as the kinetics and thermodynamics of DNA hybridization and extensions, are addressed, which enable the optimization of the processing and the prediction of the products as a function of DNA sequence, mixing protocol, time, temperature and concentration of species.
DENA: A Configurable Microarchitecture and Design Flow for Biomedical DNA-Based Logic Design.
Beiki, Zohre; Jahanian, Ali
2017-10-01
DNA is known as the building block for storing the codes of life and transferring genetic features through the generations. However, DNA strands can also be used for a new type of computation that opens fascinating horizons in computational medicine. Significant contributions have been made to the design of DNA-based logic gates for medical and computational applications, but there are serious challenges in designing medium- and large-scale DNA circuits. In this paper, a new microarchitecture and corresponding design flow are proposed to facilitate the design of multistage large-scale DNA logic systems. The feasibility and efficiency of the proposed microarchitecture are evaluated by implementing a full adder, and its cascadability is then determined by implementing a multistage 8-bit adder. Simulation results show the advantages of the proposed design style and microarchitecture in terms of the scalability, implementation cost, and signal integrity of the DNA-based logic system compared to traditional approaches.
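The evaluation strategy described here (a full adder first, then a cascaded 8-bit adder) can be illustrated at the pure logic level. The sketch below assumes a ripple-carry arrangement, which the abstract does not state, and it models only Boolean behaviour, not DNA strands or the DENA microarchitecture; the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder built from XOR, AND, and OR; returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def add8(x, y):
    """Cascade eight full adders (ripple carry, assumed); returns (sum mod 256, carry)."""
    carry = 0
    total = 0
    for i in range(8):
        # Stage i consumes bit i of each operand plus the previous stage's carry.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry
```

Each stage depends on the carry of the stage before it, which is one reason signal integrity across stages matters when such a chain is cascaded.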
Weighted Watson-Crick automata
NASA Astrophysics Data System (ADS)
Tamrin, Mohd Izzuddin Mohd; Turaev, Sherzod; Sembok, Tengku Mohd Tengku
2014-07-01
There is a tremendous amount of work in biotechnology, especially in the area of DNA molecules. The computing community is attempting to develop smaller computing devices through computational models based on operations performed on DNA molecules. A Watson-Crick automaton, a theoretical model for DNA-based computation, has two reading heads and works on double-stranded input sequences related by a complementarity relation similar to the Watson-Crick complementarity of DNA nucleotides. Over time, several variants of Watson-Crick automata have been introduced and investigated. However, they cannot be used as suitable DNA-based computational models for molecular stochastic processes and fuzzy processes that are related to important practical problems such as molecular parsing, gene disease detection, and food authentication. In this paper we define new variants of Watson-Crick automata, called weighted Watson-Crick automata, developing theoretical models for molecular stochastic and fuzzy processes. We define weighted Watson-Crick automata by adapting weight restriction mechanisms associated with formal grammars and automata. We also study the generative capacities of weighted Watson-Crick automata, including probabilistic and fuzzy variants. We show that weighted variants of Watson-Crick automata increase their generative power.
Approaching mathematical model of the immune network based DNA Strand Displacement system.
Mardian, Rizki; Sekiyama, Kosuke; Fukuda, Toshio
2013-12-01
One of the biggest obstacles in molecular programming is that there is still no direct method to compile an existing mathematical model into biochemical reactions in order to solve a computational problem. In this paper, the implementation of a DNA Strand Displacement system based on nature-inspired computation is studied. Using Immune Network Theory and Chemical Reaction Networks, the compilation of DNA-based operations is defined and the formulation of its mathematical model is derived. Furthermore, the implementation of this system is compared with a conventional implementation using silicon-based programming. The results show a positive correlation between the two. One possible application of this DNA-based model is a decision-making scheme for an intelligent computer or molecular robot. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Wang, Zhaocai; Ji, Zuwen; Wang, Xiaoming; Wu, Tunhua; Huang, Wei
2017-12-01
As a promising approach to solving computationally intractable problems, DNA computing is an emerging research area spanning mathematics, computer science, and molecular biology. The task scheduling problem, a well-known NP-complete problem, assigns n jobs to m individuals and seeks the minimum execution time of the last individual to finish. In this paper, we use a biologically inspired computational model and describe a new parallel algorithm that solves the task scheduling problem with basic DNA molecular operations. We design flexible-length DNA strands to represent elements of the allocation matrix, apply appropriate biological experiment operations, and obtain solutions of the task scheduling problem in the proper length range with less than O(n^2) time complexity. Copyright © 2017. Published by Elsevier B.V.
Biomolecular computers with multiple restriction enzymes
Sakowski, Sebastian; Krasinski, Tadeusz; Waldmajer, Jacek; Sarnik, Joanna; Blasiak, Janusz; Poplawski, Tomasz
2017-01-01
The development of conventional, silicon-based computers has several limitations, including some related to the Heisenberg uncertainty principle and the von Neumann “bottleneck”. Biomolecular computers based on DNA and proteins are largely free of these disadvantages and, along with quantum computers, are reasonable alternatives to their conventional counterparts in some applications. The idea of a DNA computer proposed by Ehud Shapiro’s group at the Weizmann Institute of Science was developed using one restriction enzyme as hardware and DNA fragments (the transition molecules) as software and input/output signals. This computer represented a two-state two-symbol finite automaton that was subsequently extended by using two restriction enzymes. In this paper, we propose the idea of a multistate biomolecular computer with multiple commercially available restriction enzymes as hardware. Additionally, an algorithmic method for the construction of transition molecules in the DNA computer based on the use of multiple restriction enzymes is presented. We use this method to construct multistate, biomolecular, nondeterministic finite automata with four commercially available restriction enzymes as hardware. We also describe an experimental application of this theoretical model to a biomolecular finite automaton made of four endonucleases. PMID:29064510
Analog Computation by DNA Strand Displacement Circuits.
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2016-08-19
DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands, respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. On the basis of these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
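The last point, that Taylor series and Newton iteration extend add/subtract/multiply gates beyond polynomials, can be checked numerically. The sketch below is ordinary floating-point arithmetic, not a model of strand-displacement kinetics; the function names, the starting guess, and the iteration counts are assumptions chosen for illustration.

```python
def reciprocal_newton(a, x0, steps=8):
    """Approximate 1/a with Newton's iteration x <- x * (2 - a * x).

    Each step uses only multiplication and subtraction. It converges when
    0 < x0 < 2/a, and the relative error is squared at every step.
    """
    x = x0
    for _ in range(steps):
        x = x * (2.0 - a * x)
    return x

def exp_taylor(x, terms=12):
    """Approximate exp(x) with a truncated Taylor polynomial in Horner form."""
    # The coefficients 1/k! are constants that would be fixed at design time.
    coeffs = []
    fact = 1.0
    for k in range(terms):
        if k > 0:
            fact *= k
        coeffs.append(1.0 / fact)
    # Horner evaluation: only multiplications and additions at run time.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

Division never appears at run time: the reciprocal is reached through repeated multiply and subtract steps, and the exponential through a fixed polynomial.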
Solving probability reasoning based on DNA strand displacement and probability modules.
Zhang, Qiang; Wang, Xiaobiao; Wang, Xiaojun; Zhou, Changjun
2017-12-01
In computational biology, DNA strand displacement technology is used to simulate computation processes and has shown strong computing ability. Most researchers use it to solve logic problems, and it is only rarely used in probabilistic reasoning. To perform probabilistic reasoning, a conditional probability derivation model and a total probability model based on DNA strand displacement were established in this paper. The models were assessed through the game "read your mind." The approach has been shown to enable the application of probabilistic reasoning in genetic diagnosis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Manipulation of oligonucleotides immobilized on solid supports - DNA computations on surfaces
NASA Astrophysics Data System (ADS)
Liu, Qinghua
The manipulation of DNA oligonucleotides immobilized on various solid supports has been studied intensively, especially in the area of surface hybridization. Recently, surface-based biotechnology has been applied to the area of molecular computing. These surface-based methods have advantages with regard to ease of handling, facile purification, and less interference when compared to solution methodologies. This dissertation describes the investigation of molecular approaches to DNA computing. The feasibility of encoding a bit (0 or 1) of information for DNA-based computations at the single nucleotide level was studied, particularly with regard to the efficiency and specificity of hybridization discrimination. Both gold and glass surfaces, with addressed arrays of 32 oligonucleotides, were employed with similar hybridization results. Although single-base discrimination may be achieved in the system, it is at the cost of a severe decrease in the efficiency of hybridization to perfectly matched sequences. This compromises the utility of single nucleotide encoding for DNA computing applications in the absence of some additional mechanism for increasing specificity. Several methods are suggested including a multiple-base encoding strategy. The multiple-base encoding strategy was employed to develop a prototype DNA computer. The approach was demonstrated by solving a small example of the Satisfiability (SAT) problem, an NP-complete problem in Boolean logic. 16 distinct DNA oligonucleotides, encoding all candidate solutions to the 4-variable-4-clause-3-SAT problem, were immobilized on a gold surface in the non-addressed format. Four cycles of MARK (hybridization), DESTROY (enzymatic destruction) and UNMARK (denaturation) were performed, which identified and eliminated members of the set which were not solutions to the problem. 
Determination of the answer was accomplished in the READOUT (sequence identification) operation by PCR amplification of the remaining molecules and hybridization to an addressed array. Four answers were determined and the S/N ratio between correct and incorrect solutions ranged from 10 to 777, making discrimination between correct and incorrect solutions to the problem straightforward. Additionally, studies of enzymatic manipulations of DNA molecules on surfaces suggested the use of E. coli Exonuclease I (Exo I) and perhaps EarI in the DESTROY operation.
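The MARK / DESTROY / UNMARK cycle can be mimicked in software as one filtering pass per clause. In the sketch below, the 4-variable, 4-clause formula is a hypothetical example (the abstract does not give the instance that was solved), and it assumes that MARK protects the strands satisfying the current clause while DESTROY removes the unmarked ones.

```python
from itertools import product

# Hypothetical 3-SAT instance; each literal is (variable index, required value).
CLAUSES = [
    [(0, 1), (1, 1), (2, 1)],  # x0 or x1 or x2
    [(0, 0), (1, 1), (3, 1)],  # not x0 or x1 or x3
    [(0, 1), (2, 0), (3, 0)],  # x0 or not x2 or not x3
    [(1, 0), (2, 1), (3, 0)],  # not x1 or x2 or not x3
]

def satisfies(assignment, clause):
    """True if at least one literal of the clause holds under the assignment."""
    return any(assignment[i] == want for i, want in clause)

def surface_sat(clauses, n_vars=4):
    """Run one MARK/DESTROY/UNMARK cycle per clause; return the surviving strands."""
    # All 2**n_vars candidate assignments start out "immobilized on the surface".
    surface = set(product((0, 1), repeat=n_vars))
    for clause in clauses:
        marked = {s for s in surface if satisfies(s, clause)}  # MARK
        surface = marked  # DESTROY removes unmarked strands; UNMARK resets marks
    return sorted(surface)  # READOUT
```

The number of cycles equals the number of clauses, while the number of candidate strands grows as 2^n with the number of variables.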
Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing
2008-01-01
complements of one another and the DNA duplex formed is a Watson-Crick (WC) duplex. However, there are many instances when the formation of non-WC...that the user’s requirements for probe selection are met based on the Watson-Crick probe locality within a target. The second type, called...AFRL-RI-RS-TR-2007-288 Final Technical Report January 2008 SUPERIMPOSED CODE THEORETIC ANALYSIS OF DNA CODES AND DNA COMPUTING
Computer-Aided Drug Discovery: Molecular Docking of Diminazene Ligands to DNA Minor Groove
ERIC Educational Resources Information Center
Kholod, Yana; Hoag, Erin; Muratore, Katlynn; Kosenkov, Dmytro
2018-01-01
The reported project-based laboratory unit introduces upper-division undergraduate students to the basics of computer-aided drug discovery as a part of a computational chemistry laboratory course. The students learn to perform model binding of organic molecules (ligands) to the DNA minor groove with computer-aided drug discovery (CADD) tools. The…
Molecular Sticker Model Stimulation on Silicon for a Maximum Clique Problem
Ning, Jianguo; Li, Yanmei; Yu, Wen
2015-01-01
Molecular computers (also called DNA computers), as an alternative to traditional electronic computers, are smaller in size but more energy efficient, and have massive parallel processing capacity. However, DNA computers may not outperform electronic computers owing to their higher error rates and some limitations of the biological laboratory. The stickers model, as a typical DNA-based computer, is computationally complete and universal, and can be viewed as a bit-vertically operating machine. This makes it attractive for silicon implementation. Inspired by the information processing method of the stickers computer, we propose a novel parallel computing model called DEM (DNA Electronic Computing Model) on System-on-a-Programmable-Chip (SOPC) architecture. Except for the significant difference in the computing medium (transistor chips rather than bio-molecules), the DEM works similarly to DNA computers in massively parallel information processing. Additionally, a plasma display panel (PDP) is used to show the change of solutions, and helps us directly see the distribution of assignments. The feasibility of the DEM is tested by applying it to compute a maximum clique problem (MCP) with eight vertices. Owing to the limited computing resources on SOPC architecture, the DEM could solve moderate-size problems in polynomial time. PMID:26075867
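A software sketch can show how sticker-style operations on bit strings apply to a maximum clique search. It is a sequential illustration under stated assumptions, not the DEM hardware design: the `separate` helper imitates the sticker model's separate operation, list concatenation stands in for combine, and the strategy of discarding every subset that contains a non-adjacent vertex pair is an assumed, commonly described approach that the abstract does not spell out.

```python
def separate(tube, bit):
    """Split a tube of bit strings (stored as ints) by whether `bit` is set."""
    on = [s for s in tube if (s >> bit) & 1]
    off = [s for s in tube if not (s >> bit) & 1]
    return on, off

def max_clique(n, edges):
    """Return the vertices of one maximum clique of an n-vertex graph."""
    edge_set = {frozenset(e) for e in edges}
    # Initial tube: every subset of vertices, one bit per vertex.
    tube = list(range(1 << n))
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i, j)) not in edge_set:
                # Discard strands that contain both endpoints of a non-edge.
                on_i, off_i = separate(tube, i)
                _, off_j = separate(on_i, j)  # contain i but not j
                tube = off_i + off_j          # combine the two surviving tubes
    # Read out a surviving strand with the most bits set.
    best = max(tube, key=lambda s: bin(s).count("1"))
    return [v for v in range(n) if (best >> v) & 1]
```

For example, `max_clique(4, [(0, 1), (1, 2), (0, 2), (2, 3)])` keeps only subsets that avoid the non-adjacent pairs (0, 3) and (1, 3), and the largest survivor is the triangle on vertices 0, 1, and 2.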
Constructing Smart Protocells with Built-In DNA Computational Core to Eliminate Exogenous Challenge.
Lyu, Yifan; Wu, Cuichen; Heinke, Charles; Han, Da; Cai, Ren; Teng, I-Ting; Liu, Yuan; Liu, Hui; Zhang, Xiaobing; Liu, Qiaoling; Tan, Weihong
2018-06-06
A DNA reaction network is like a biological algorithm that can respond to "molecular input signals", such as biological molecules, while the artificial cell is like a microrobot whose function is powered by the encapsulated DNA reaction network. In this work, we describe the feasibility of using a DNA reaction network as the computational core of a protocell, which will perform an artificial immune response in a concise way to eliminate a mimicked pathogenic challenge. Such a DNA reaction network (RN)-powered protocell can realize the connection of logical computation and biological recognition due to the natural programmability and biological properties of DNA. Thus, the biological input molecules can be easily involved in the molecular computation and the computation process can be spatially isolated and protected by artificial bilayer membrane. We believe the strategy proposed in the current paper, i.e., using DNA RN to power artificial cells, will lay the groundwork for understanding the basic design principles of DNA algorithm-based nanodevices which will, in turn, inspire the construction of artificial cells, or protocells, that will find a place in future biomedical research.
Zhang, Li; Wang, Zhong-Xia; Liang, Ru-Ping; Qiu, Jian-Ding
2013-07-16
Utilizing the principles of metal-ion-mediated base pairs (C-Ag-C and T-Hg-T), the pH-sensitive conformational transition of C-rich DNA strands, and the ligand-exchange process triggered by DL-dithiothreitol (DTT), a system of colorimetric logic gates (YES, AND, INHIBIT, and XOR) can be rationally constructed based on the aggregation of DNA-modified Au NPs. The proposed logic operation system is simple, consisting only of T-/C-rich DNA-modified Au NPs, and it is unnecessary to exquisitely design and alter the DNA sequence for different molecular logic operations. The nonnatural base pairing combined with the unique optical properties of Au NPs promises great potential in multiplexed ion sensing, molecular-scale computers, and other computational logic devices.
Rutty, Guy N; Barber, Jade; Amoroso, Jasmin; Morgan, Bruno; Graham, Eleanor A M
2013-12-01
Post-mortem computed tomography angiography (PMCTA) involves the injection of contrast agents. This could have both a dilution effect on biological fluid samples and could affect subsequent post-contrast analytical laboratory processes. We undertook a small sample study of 10 targeted and 10 whole body PMCTA cases to consider whether or not these two methods of PMCTA could affect post-PMCTA cadaver blood based DNA identification. We used standard methodology to examine DNA from blood samples obtained before and after the PMCTA procedure. We illustrate that neither of these PMCTA methods had an effect on the alleles called following short tandem repeat based DNA profiling, and therefore the ability to undertake post-PMCTA blood based DNA identification.
Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences
Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro
2014-01-01
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
Logical NAND and NOR Operations Using Algorithmic Self-assembly of DNA Molecules
NASA Astrophysics Data System (ADS)
Wang, Yanfeng; Cui, Guangzhao; Zhang, Xuncai; Zheng, Yan
DNA self-assembly is the most advanced and versatile system that has been experimentally demonstrated for programmable construction of patterned systems on the molecular scale. It has been demonstrated that simple binary arithmetic and logical operations can be computed by the self-assembly of DNA tiles. Here we report a one-dimensional algorithmic self-assembly of DNA triple-crossover molecules that can be used to execute five steps of logical NAND and NOR operations on a string of binary bits. To achieve this, abstract tiles were translated into DNA tiles based on triple-crossover motifs. Serving as input for the computation, long single-stranded DNA molecules were used to nucleate growth of tiles into algorithmic crystals. Our method shows that engineered DNA self-assembly can be treated as a bottom-up design technique and is capable of designing DNA computer organization and architecture.
Solving traveling salesman problems with DNA molecules encoding numerical values.
Lee, Ji Youn; Shin, Soo-Yong; Park, Tai Hyun; Zhang, Byoung-Tak
2004-12-01
We introduce a DNA encoding method to represent numerical values and a biased molecular algorithm based on the thermodynamic properties of DNA. DNA strands are designed to encode real values by variation of their melting temperatures. The thermodynamic properties of DNA are used for effective local search of optimal solutions using biochemical techniques, such as denaturation temperature gradient polymerase chain reaction and temperature gradient gel electrophoresis. The proposed method was successfully applied to the traveling salesman problem, an instance of optimization problems on weighted graphs. This work extends the capability of DNA computing to solving numerical optimization problems, which is contrasted with other DNA computing methods focusing on logical problem solving.
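To make the idea of encoding numbers through melting temperature concrete, the sketch below estimates the melting temperature of short strands with the Wallace rule (2 °C per A/T base, 4 °C per G/C base). This rule is a rough textbook approximation used here as a stand-in; it is an assumption that it is adequate for illustration, and it is not the thermodynamic model or the encoding scheme of the paper. The example strands are hypothetical.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) of a short oligonucleotide (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# Hypothetical strands of equal length: changing only the GC content shifts
# the estimated melting temperature, so the temperature can carry a value.
for strand in ("ATATATATAT", "ATGCATGCAT", "GCGCGCGCGC"):
    print(strand, wallace_tm(strand))
```

Strands of identical length can therefore be separated by a temperature gradient, which is the property the temperature-gradient techniques named in the abstract rely on.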
A strand graph semantics for DNA-based computation
Petersen, Rasmus L.; Lakin, Matthew R.; Phillips, Andrew
2015-01-01
DNA nanotechnology is a promising approach for engineering computation at the nanoscale, with potential applications in biofabrication and intelligent nanomedicine. DNA strand displacement is a general strategy for implementing a broad range of nanoscale computations, including any computation that can be expressed as a chemical reaction network. Modelling and analysis of DNA strand displacement systems is an important part of the design process, prior to experimental realisation. As experimental techniques improve, it is important for modelling languages to keep pace with the complexity of structures that can be realised experimentally. In this paper we present a process calculus for modelling DNA strand displacement computations involving rich secondary structures, including DNA branches and loops. We prove that our calculus is also sufficiently expressive to model previous work on non-branching structures, and propose a mapping from our calculus to a canonical strand graph representation, in which vertices represent DNA strands, ordered sites represent domains, and edges between sites represent bonds between domains. We define interactions between strands by means of strand graph rewriting, and prove the correspondence between the process calculus and strand graph behaviours. Finally, we propose a mapping from strand graphs to an efficient implementation, which we use to perform modelling and simulation of DNA strand displacement systems with rich secondary structure. PMID:27293306
A detailed experimental study of a DNA computer with two endonucleases.
Sakowski, Sebastian; Krasiński, Tadeusz; Sarnik, Joanna; Blasiak, Janusz; Waldmajer, Jacek; Poplawski, Tomasz
2017-07-14
Great advances in biotechnology have allowed the construction of a computer from DNA. One of the proposed solutions is a biomolecular finite automaton, a simple two-state DNA computer without memory, which was presented by Ehud Shapiro's group at the Weizmann Institute of Science. The main problem with this computer, in which biomolecules carry out logical operations, is its complexity - increasing the number of states of biomolecular automata. In this study, we constructed (in laboratory conditions) a six-state DNA computer that uses two endonucleases (e.g. AcuI and BbvI) and a ligase. We have presented a detailed experimental verification of its feasibility. We described the effect of the number of states, the length of input data, and the nondeterminism on the computing process. We also tested different automata (with three, four, and six states) running on various accepted input words of different lengths such as ab, aab, aaab, ababa, and of an unaccepted word ba. Moreover, this article presents the reaction optimization and the methods of eliminating certain biochemical problems occurring in the implementation of a biomolecular DNA automaton based on two endonucleases.
Programmable energy landscapes for kinetic control of DNA strand displacement.
Machinek, Robert R F; Ouldridge, Thomas E; Haley, Natalie E C; Bath, Jonathan; Turberfield, Andrew J
2014-11-10
DNA is used to construct synthetic systems that sense, actuate, move and compute. The operation of many dynamic DNA devices depends on toehold-mediated strand displacement, by which one DNA strand displaces another from a duplex. Kinetic control of strand displacement is particularly important in autonomous molecular machinery and molecular computation, in which non-equilibrium systems are controlled through rates of competing processes. Here, we introduce a new method based on the creation of mismatched base pairs as kinetic barriers to strand displacement. Reaction rate constants can be tuned across three orders of magnitude by altering the position of such a defect without significantly changing the stabilities of reactants or products. By modelling reaction free-energy landscapes, we explore the mechanistic basis of this control mechanism. We also demonstrate that oxDNA, a coarse-grained model of DNA, is capable of accurately predicting and explaining the impact of mismatches on displacement kinetics.
Modelling of DNA-protein recognition
NASA Technical Reports Server (NTRS)
Rein, R.; Garduno, R.; Colombano, S.; Nir, S.; Haydock, K.; Macelroy, R. D.
1980-01-01
Computer model-building procedures using stereochemical principles together with theoretical energy calculations appear to be, at this stage, the most promising route toward the elucidation of DNA-protein binding schemes and recognition principles. A review of models and bonding principles is conducted and approaches to modeling are considered, taking into account possible di-hydrogen-bonding schemes between a peptide and a base (or a base pair) of a double-stranded nucleic acid in the major groove, aspects of computer graphic modeling, and a search for isogeometric helices. The energetics of recognition complexes is discussed and several models for peptide DNA recognition are presented.
A novel image encryption algorithm based on the chaotic system and DNA computing
NASA Astrophysics Data System (ADS)
Chai, Xiuli; Gan, Zhihua; Lu, Yang; Chen, Yiran; Han, Daojun
A novel image encryption algorithm using the chaotic system and deoxyribonucleic acid (DNA) computing is presented. Unlike traditional encryption methods, the permutation and diffusion of our method operate on a 3D DNA matrix. Firstly, a 3D DNA matrix is obtained through bit plane splitting, bit plane recombination, and DNA encoding of the plain image. Secondly, 3D DNA level permutation based on position sequence group (3DDNALPBPSG) is introduced, and chaotic sequences generated from the chaotic system are employed to permute the positions of the elements of the 3D DNA matrix. Thirdly, 3D DNA level diffusion (3DDNALD) is given: the confused 3D DNA matrix is split into sub-blocks, and a block-wise XOR operation is applied to the sub-DNA matrix and the key DNA matrix from the chaotic system. Finally, by decoding the diffused DNA matrix, we get the cipher image. The SHA-256 hash of the plain image is employed to calculate the initial values of the chaotic system to resist chosen-plaintext attacks. Experimental results and security analyses show that our scheme is secure against several known attacks and can effectively protect the security of the images.
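The DNA encoding and XOR diffusion steps can be sketched on a single pixel. The 2-bit rule below (A=00, C=01, G=10, T=11) is one common choice picked for illustration, and the key byte is a fixed placeholder; in the scheme described above the key DNA matrix comes from a chaotic system, which this sketch does not model. All function names are hypothetical.

```python
# Assumed encoding rule for illustration; the paper may use a different rule.
RULE = {"00": "A", "01": "C", "10": "G", "11": "T"}
INV = {base: bits for bits, base in RULE.items()}

def encode_byte(value):
    """Encode one 8-bit value as four DNA bases."""
    bits = format(value, "08b")
    return "".join(RULE[bits[i:i + 2]] for i in range(0, 8, 2))

def decode_byte(dna):
    """Decode four DNA bases back to an 8-bit value."""
    return int("".join(INV[base] for base in dna), 2)

def dna_xor(x, y):
    """Base-wise XOR of two equal-length DNA strings."""
    return "".join(
        RULE[format(int(INV[a], 2) ^ int(INV[b], 2), "02b")] for a, b in zip(x, y)
    )

pixel, key = 200, 77                      # placeholder values
cipher = dna_xor(encode_byte(pixel), encode_byte(key))
recovered = decode_byte(dna_xor(cipher, encode_byte(key)))
print(encode_byte(pixel), cipher, recovered)
```

Because XOR is its own inverse, applying the same key strand a second time recovers the plain pixel, which is what makes the diffusion step reversible for a receiver who can regenerate the key.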
GUI to Facilitate Research on Biological Damage from Radiation
NASA Technical Reports Server (NTRS)
Cucinotta, Frances A.; Ponomarev, Artem Lvovich
2010-01-01
A graphical-user-interface (GUI) computer program has been developed to facilitate research on the damage caused by highly energetic particles and photons impinging on living organisms. The program brings together, into one computational workspace, computer codes that have been developed over the years, plus codes that will be developed during the foreseeable future, to address diverse aspects of radiation damage. These include codes that implement radiation-track models, codes for biophysical models of breakage of deoxyribonucleic acid (DNA) by radiation, pattern-recognition programs for extracting quantitative information from biological assays, and image-processing programs that aid visualization of DNA breaks. The radiation-track models are based on transport models of interactions of radiation with matter and solution of the Boltzmann transport equation by use of both theoretical and numerical models. The biophysical models of breakage of DNA by radiation include biopolymer coarse-grained and atomistic models of DNA, stochastic-process models of deposition of energy, and Markov-based probabilistic models of placement of double-strand breaks in DNA. The program is designed for use in the NT, 95, 98, 2000, ME, and XP variants of the Windows operating system.
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.
Baur, Brittany; Bozdag, Serdar
2016-01-01
DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
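Sequential forward selection wrapped around a nearest-neighbour classifier can be sketched in a few lines. This is a generic version of the technique under stated assumptions: it uses 1-nearest-neighbour with leave-one-out accuracy as the selection criterion, and the toy matrix of probe intensities and expression classes is invented for illustration. The paper's actual classifier settings and evaluation metrics are not given in the abstract.

```python
def loo_knn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on chosen features."""
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[i][f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def sequential_forward_selection(X, y, k):
    """Greedily add the feature that most improves accuracy until k are chosen."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: loo_knn_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: rows are samples, columns are three probes of one gene;
# y is a made-up low/high expression class. Only probe 1 tracks the class.
X = [[0.5, 0.10, 0.9], [0.4, 0.20, 0.1], [0.6, 0.15, 0.5],
     [0.5, 0.80, 0.2], [0.4, 0.90, 0.8], [0.6, 0.85, 0.4]]
y = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(X, y, k=1))
```

The greedy loop evaluates each candidate probe together with those already chosen, so a probe is kept for what it adds to the current subset rather than for its score in isolation.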
Cloud-based MOTIFSIM: Detecting Similarity in Large DNA Motif Data Sets.
Tran, Ngoc Tam L; Huang, Chun-Hsi
2017-05-01
We developed the cloud-based MOTIFSIM on the Amazon Web Services (AWS) cloud. The tool is an extended version of our web-based tool version 2.0, which was developed based on a novel algorithm for detecting similarity in multiple DNA motif data sets. This cloud-based version further allows researchers to exploit the computing resources available from AWS to detect similarity in multiple large-scale DNA motif data sets resulting from next-generation sequencing technology. The tool is highly scalable with expandable AWS.
The Ins and Outs of DNA Fingerprinting the Infectious Fungi
Soll, David R.
2000-01-01
DNA fingerprinting methods have evolved as major tools in fungal epidemiology. However, no single method has emerged as the method of choice, and some methods perform better than others at different levels of resolution. In this review, requirements for an effective DNA fingerprinting method are proposed and procedures are described for testing the efficacy of a method. In light of the proposed requirements, the most common methods now being used to DNA fingerprint the infectious fungi are described and assessed. These methods include restriction fragment length polymorphisms (RFLP), RFLP with hybridization probes, randomly amplified polymorphic DNA and other PCR-based methods, electrophoretic karyotyping, and sequencing-based methods. Procedures for computing similarity coefficients, generating phylogenetic trees, and testing the stability of clusters are then described. To facilitate the analysis of DNA fingerprinting data, computer-assisted methods are described. Finally, the problems inherent in the collection of test and control isolates are considered, and DNA fingerprinting studies of strain maintenance during persistent or recurrent infections, microevolution in infecting strains, and the origin of nosocomial infections are assessed in light of the preceding discussion of the ins and outs of DNA fingerprinting. The intent of this review is to generate an awareness of the need to verify the efficacy of each DNA fingerprinting method for the level of genetic relatedness necessary to answer the epidemiological question posed, to use quantitative methods to analyze DNA fingerprint data, to use computer-assisted DNA fingerprint analysis systems to analyze data, and to file data in a form that can be used in the future for retrospective and comparative studies. PMID:10756003
Iyer, Lakshminarayan M.; Zhang, Dapeng; Maxwell Burroughs, A.; Aravind, L.
2013-01-01
Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes, chlorophyte and stramenopile algae, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel ‘readers’ of modified bases in DNA. These results open opportunities for the investigation of the biology of these systems and their use in biotechnology. PMID:23814188
NASA Astrophysics Data System (ADS)
Kingsland, Addie
DNA is an amazing molecule which is the basic template for all genetics. It is the primary molecule for storing biological information, and has many applications in nanotechnology. Double-stranded DNA may contain mismatched base pairs beyond the Watson-Crick pairs guanine-cytosine and adenine-thymine. To date, no one has found a physical property of base pair mismatches which describes the behavior of naturally occurring mismatch repair enzymes. Many materials properties of DNA are also unknown, for instance, when pulling DNA in different configurations, different energy differences are observed with no obvious reason why. DNA mismatches also affect their local environment, for instance changing the quantum yield of nearby azobenzene moieties. We utilize molecular dynamics computer simulations to study the structure and dynamics for both matched and mismatched base pairs, within both biological and materials contexts, and in both equilibrium and biased dynamics. We show that mismatched pairs shift further in the plane normal to the DNA strand and are more likely to exhibit non-canonical structures, including the e-motif. Base pair mismatches alter their local environment, affecting the trans- to cis- photoisomerization quantum yield of azobenzene, as well as increasing the likelihood of observing the e-motif. We also show that by using simulated data, we can give new insights on theoretical models to calculate the energetics of pulling DNA strands apart. These results, all relatively inexpensive on modern computer hardware, can help guide the design of DNA-based nanotechnologies, as well as give new insights into the functioning of mismatch repair systems in cancer prevention.
Villagrán, Martha Y. Suárez; Miller, John H.
2015-01-01
We report on a new technique, computational DNA hole spectroscopy, which creates spectra of electron hole probabilities vs. nucleotide position. A hole is a site of positive charge created when an electron is removed. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies of mitochondrial DNA reveal a correlation between L-strand hole spectrum peaks and spikes in the human mutation spectrum. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with disease-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential disease ‘driver’ mutations. Such integration of DNA hole and variance spectra could ultimately prove invaluable for pinpointing critical regions of the vast non-protein-coding genome. An observed asymmetry in correlations, between the spectrum of human mtDNA variations and the L- and H-strand hole spectra, is attributed to asymmetric DNA replication processes that occur for the leading and lagging strands. PMID:26310834
Exploring the Feasibility of a DNA Computer: Design of an ALU Using Sticker-Based DNA Model.
Sarkar, Mayukh; Ghosal, Prasun; Mohanty, Saraju P
2017-09-01
Since its inception, DNA computing has advanced to offer an extremely powerful, energy-efficient emerging technology for solving hard computational problems with its inherent massive parallelism and extremely high data density. It would be much more powerful and general purpose if combined, through a suitable ALU, with the well-known algorithmic solutions that exist for conventional computing architectures. Thus, a specifically designed DNA Arithmetic and Logic Unit (ALU) that can address operations suitable for both domains can bridge the gap between the two. An ALU must be able to perform all possible logic operations, including NOT, OR, AND, XOR, NOR, NAND, and XNOR; compare and shift operations; and integer and floating point arithmetic operations (addition, subtraction, multiplication, and division). In this paper, the design of an ALU is proposed using the sticker-based DNA model, with an experimental feasibility analysis. The novelties of this paper are manifold. First, the integer arithmetic operations performed here are 2's complement arithmetic, and the floating point operations follow the IEEE 754 floating point format, closely resembling a conventional ALU. Also, the output of each operation can be reused for any subsequent operation, so any algorithm or program logic that users can think of can be implemented directly on the DNA computer without modification. Second, once the basic operations of the sticker model can be automated, the implementations proposed in this paper become highly suitable for designing a fully automated ALU. Third, the proposed approaches are easy to implement. Finally, these approaches can work on sufficiently large binary numbers.
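The sticker model builds every computation from four tube-level operations: combine, separate, set, and clear. The sketch below simulates them on tuples of bits and composes a bitwise AND from them. It is an in-silico illustration under the assumption that the output bit starts cleared; it does not reproduce the ALU design of the paper, and the function names are hypothetical.

```python
def separate(tube, i):
    """Split a tube into strands with bit i on and strands with bit i off."""
    on = [s for s in tube if s[i] == 1]
    off = [s for s in tube if s[i] == 0]
    return on, off

def set_bit(tube, i):
    """Anneal a sticker at position i on every strand in the tube."""
    return [s[:i] + (1,) + s[i + 1:] for s in tube]

def clear_bit(tube, i):
    """Remove the sticker at position i from every strand in the tube."""
    return [s[:i] + (0,) + s[i + 1:] for s in tube]

def combine(*tubes):
    """Pour several tubes together."""
    return [s for tube in tubes for s in tube]

def and_into(tube, a, b, out):
    """Set bit `out` on exactly those strands where bits a and b are both on.
    Assumes bit `out` is cleared on every strand beforehand."""
    on_a, off_a = separate(tube, a)
    on_ab, off_b = separate(on_a, b)
    return combine(set_bit(on_ab, out), off_a, off_b)

# Every strand in the tube is processed by the same sequence of operations.
tube = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
print(sorted(and_into(tube, a=0, b=1, out=2)))
```

The number of laboratory steps depends only on the operation being computed, not on how many strands are in the tube, which is the source of the massive parallelism mentioned in the abstract.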
Computing exponentially faster: implementing a non-deterministic universal Turing machine using DNA
Currin, Andrew; Korovin, Konstantin; Ababi, Maria; Roper, Katherine; Kell, Douglas B.; Day, Philip J.
2017-01-01
The theory of computer science is based around universal Turing machines (UTMs): abstract machines able to execute all possible algorithms. Modern digital computers are physical embodiments of classical UTMs. For the most important class of problem in computer science, non-deterministic polynomial complete problems, non-deterministic UTMs (NUTMs) are theoretically exponentially faster than both classical UTMs and quantum mechanical UTMs (QUTMs). However, no attempt has previously been made to build an NUTM, and their construction has been regarded as impossible. Here, we demonstrate the first physical design of an NUTM. This design is based on Thue string rewriting systems, and thereby avoids the limitations of most previous DNA computing schemes: all the computation is local (simple edits to strings) so there is no need for communication, and there is no need to order operations. The design exploits DNA's ability to replicate to execute an exponential number of computational paths in P time. Each Thue rewriting step is embodied in a DNA edit implemented using a novel combination of polymerase chain reactions and site-directed mutagenesis. We demonstrate that the design works using both computational modelling and in vitro molecular biology experimentation: the design is thermodynamically favourable, microprogramming can be used to encode arbitrary Thue rules, all classes of Thue rule can be implemented, and rules can be implemented non-deterministically. In an NUTM, the resource limitation is space, which contrasts with classical UTMs and QUTMs where it is time. This fundamental difference enables an NUTM to trade space for time, which is significant for both theoretical computer science and physics. It is also of practical importance, for, to quote Richard Feynman, ‘there's plenty of room at the bottom’. This means that a desktop DNA NUTM could potentially utilize more processors than all the electronic computers in the world combined, and thereby outperform the world's current fastest supercomputer, while consuming a tiny fraction of its energy. PMID:28250099
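Non-deterministic string rewriting can be simulated on a conventional machine by keeping the whole set of strings reachable at each step, which mirrors how replicated DNA would follow every computational path at once. The sketch below is a software illustration only: rules are written as one-way pairs, so a bidirectional Thue rule is expressed by listing both directions, and the example rules and strings are hypothetical.

```python
def successors(s, rules):
    """All strings obtained by applying one rule at one position of s."""
    out = set()
    for lhs, rhs in rules:
        start = s.find(lhs)
        while start != -1:
            out.add(s[:start] + rhs + s[start + len(lhs):])
            start = s.find(lhs, start + 1)
    return out

def steps_to_reach(start, target, rules, max_steps):
    """Breadth-first exploration of every rewriting path; returns the number
    of rewriting steps needed to reach target, or None if it is not found."""
    if start == target:
        return 0
    frontier, seen = {start}, {start}
    for step in range(1, max_steps + 1):
        frontier = {t for s in frontier for t in successors(s, rules)} - seen
        if target in frontier:
            return step
        if not frontier:
            return None
        seen |= frontier
    return None

# Hypothetical Thue rule ab <-> ba, written as two one-way rules.
rules = [("ab", "ba"), ("ba", "ab")]
print(steps_to_reach("aab", "baa", rules, max_steps=5))
```

On a sequential computer the frontier can grow exponentially and costs time to enumerate; in the design described above that growth is paid for in space (copies of DNA) instead, which is the space-for-time trade the abstract emphasizes.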
Petri-net-based 2D design of DNA walker circuits.
Gilbert, David; Heiner, Monika; Rohr, Christian
2018-01-01
We consider localised DNA computation, where a DNA strand walks along a binary decision graph to compute a binary function. One of the challenges for the design of reliable walker circuits consists in leakage transitions, which occur when a walker jumps into another branch of the decision graph. We automatically identify leakage transitions, which allows for a detailed qualitative and quantitative assessment of circuit designs, design comparison, and design optimisation. The ability to identify leakage transitions is an important step in the process of optimising DNA circuit layouts where the aim is to minimise the computational error inherent in a circuit while minimising the area of the circuit. Our 2D modelling approach of DNA walker circuits relies on coloured stochastic Petri nets which enable functionality, topology and dimensionality all to be integrated in one two-dimensional model. Our modelling and analysis approach can be easily extended to 3-dimensional walker systems.
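The effect of leakage transitions on a walker circuit can be illustrated with a small Monte Carlo sketch. A walker follows a binary decision graph and, with some probability, jumps into the wrong branch at a node. This is a plain simulation written for illustration, not the coloured stochastic Petri net model of the paper; the graph for x AND y, the node names, and the `leak_prob` parameter are assumptions.

```python
import random

# Hypothetical decision graph for x AND y:
# node -> (variable tested, next node if 0, next node if 1); "T"/"F" are outputs.
GRAPH = {"n0": ("x", "F", "n1"), "n1": ("y", "F", "T")}

def walk(graph, root, inputs, leak_prob=0.0, rng=None):
    """Follow the decision graph from root; at each node the walker takes the
    wrong branch with probability leak_prob (a leakage transition)."""
    rng = rng or random.Random(0)
    node = root
    while node in graph:
        var, if0, if1 = graph[node]
        take_one = bool(inputs[var])
        if rng.random() < leak_prob:
            take_one = not take_one
        node = if1 if take_one else if0
    return node

# Estimate how often leakage corrupts the result for one input assignment.
rng = random.Random(42)
runs = 1000
errors = sum(
    walk(GRAPH, "n0", {"x": 1, "y": 1}, leak_prob=0.05, rng=rng) != "T"
    for _ in range(runs)
)
print(errors / runs)
```

Running the same estimate for alternative layouts of one function gives a crude way to compare designs by error rate, which is the kind of quantitative assessment the abstract describes.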
DNA-Based Dynamic Reaction Networks.
Fu, Ting; Lyu, Yifan; Liu, Hui; Peng, Ruizi; Zhang, Xiaobing; Ye, Mao; Tan, Weihong
2018-05-21
Deriving from logical and mechanical interactions between DNA strands and complexes, DNA-based artificial reaction networks (RNs) are attractive for their high programmability, as well as cascading and fan-out ability, which are similar to the basic principles of electronic logic gates. Arising from the dream of creating novel computing mechanisms, researchers have placed high hopes on the development of DNA-based dynamic RNs and have strived to establish the basic theories and operative strategies of these networks. This review starts by looking back on the evolution of DNA dynamic RNs, in particular the most significant applications in biochemistry in recent years. Finally, we discuss the perspectives of DNA dynamic RNs and give a possible direction for the development of DNA circuits. Copyright © 2018. Published by Elsevier Ltd.
A new method for enhancer prediction based on deep belief network.
Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong
2017-10-16
Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers act independently of their orientation and distance relative to their target genes, accurately predicting distal enhancers is a challenging task. In recent years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.
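Sequence compositional features such as GC content are straightforward to compute. The sketch below shows GC content and normalized k-mer frequencies as two plausible examples; it is an assumption that features of this kind resemble those fed to EnhancerDBN, since the abstract does not list them, and the function names are hypothetical.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def kmer_frequencies(seq, k=2):
    """Relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

region = "GGCCAATT"   # placeholder sequence
print(gc_content(region), kmer_frequencies(region, k=2))
```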
Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.
Hua, Wei; Wang, Jiasong; Zhao, Jian
2014-01-01
Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper introduces the concepts of the discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence and deduces the discrete Ramanujan spectrum of that sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has the important feature of 3-base periodicity, which is widely used for DNA sequence analysis by means of the discrete Fourier transform. The analysis tests the signal-to-noise ratio at frequency N/3 as a criterion, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can only be identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for the discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested, and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have analyzed the algorithm theoretically and concluded that calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that using the discrete Ramanujan spectrum to classify different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
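The two ingredients named above, the Voss representation and the Ramanujan sum, can be combined in a short sketch that scores how strongly a sequence projects onto period 3. The Ramanujan sum follows its standard definition; the projection, its 1/N normalization, and the summing of squared coefficients over the four bases are assumptions made for illustration and are not the transform or signal-to-noise ratio defined in the paper. This simple projection is also sensitive to the phase of the periodic pattern.

```python
from math import cos, gcd, pi

def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1.
    The value is always an integer, so the floating-point sum is rounded."""
    return round(sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1))

def voss(seq):
    """Voss representation: one 0/1 indicator sequence per base."""
    seq = seq.upper()
    return {base: [1 if c == base else 0 for c in seq] for base in "ACGT"}

def ramanujan_power(seq, q):
    """Sum over bases of the squared projection of the indicator sequence
    onto c_q (normalization by sequence length is an assumption)."""
    total = 0.0
    for indicator in voss(seq).values():
        coeff = sum(x * ramanujan_sum(q, n) for n, x in enumerate(indicator, start=1))
        total += (coeff / len(seq)) ** 2
    return total

print([ramanujan_sum(3, n) for n in range(1, 7)])
print(ramanujan_power("ATG" * 4, 3), ramanujan_power("A" * 12, 3))
```

Because c_3(n) takes only the integer values 2 and -1, the projection needs no complex arithmetic, which is consistent with the low computational cost claimed in the abstract.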
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
Parker, Trent M; Hohenstein, Edward G; Parrish, Robert M; Hud, Nicholas V; Sherrill, C David
2013-01-30
Symmetry-adapted perturbation theory (SAPT) is applied to pairs of hydrogen-bonded nucleobases to obtain the energetic components of base stacking (electrostatic, exchange-repulsion, induction/polarization, and London dispersion interactions) and how they vary as a function of the helical parameters Rise, Twist, and Slide. Computed average values of Rise and Twist agree well with experimental data for B-form DNA from the Nucleic Acids Database, even though the model computations omitted the backbone atoms (suggesting that the backbone in B-form DNA is compatible with having the bases adopt their ideal stacking geometries). London dispersion forces are the most important attractive component in base stacking, followed by electrostatic interactions. At values of Rise typical of those in DNA (3.36 Å), the electrostatic contribution is nearly always attractive, providing further evidence for the importance of charge-penetration effects in π-π interactions (a term neglected in classical force fields). Comparison of the computed stacking energies with those from model complexes made of the "parent" nucleobases purine and 2-pyrimidone indicates that chemical substituents in DNA and RNA account for 20-40% of the base-stacking energy. A lack of correspondence between the SAPT results and experiment for Slide in RNA base-pair steps suggests that the backbone plays a larger role in determining stacking geometries in RNA than in B-form DNA. In comparisons of base-pair steps with thymine versus uracil, the thymine methyl group tends to enhance the strength of the stacking interaction through a combination of dispersion and electrostatic interactions.
DNA-programmed dynamic assembly of quantum dots for molecular computation.
He, Xuewen; Li, Zhi; Chen, Muzi; Ma, Nan
2014-12-22
Despite the widespread use of quantum dots (QDs) for biosensing and bioimaging, QD-based bio-interfaceable and reconfigurable molecular computing systems have not yet been realized. DNA-programmed dynamic assembly of multi-color QDs is presented for the construction of a new class of fluorescence resonance energy transfer (FRET)-based QD computing systems. A complete set of seven elementary logic gates (OR, AND, NOR, NAND, INH, XOR, XNOR) are realized using a series of binary and ternary QD complexes operated by strand displacement reactions. The integration of different logic gates into a half-adder circuit for molecular computation is also demonstrated. This strategy is quite versatile and straightforward for logical operations and would pave the way for QD-biocomputing-based intelligent molecular diagnostics. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Gener: a minimal programming module for chemical controllers based on DNA strand displacement
Kahramanoğulları, Ozan; Cardelli, Luca
2015-01-01
Summary: Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX. Availability and implementation: Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a windows executable that can be run on Mac OS X and Linux by using Mono. Contact: ozan@cosbi.eu PMID:25957353
Gener: a minimal programming module for chemical controllers based on DNA strand displacement.
Kahramanoğulları, Ozan; Cardelli, Luca
2015-09-01
Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research's DSD tool as well as to LaTeX. Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono. Contact: ozan@cosbi.eu. © The Author 2015. Published by Oxford University Press.
ERIC Educational Resources Information Center
King, Angela G.
2007-01-01
This article presents three reports of research advances. The first report describes a deoxyribonucleic acid (DNA)-based computer that could lead to faster, more accurate tests for diagnosing West Nile Virus and bird flu. Representing the first "medium-scale integrated molecular circuit," it is the most powerful computing device of its type to…
DNA Origami-Graphene Hybrid Nanopore for DNA Detection.
Barati Farimani, Amir; Dibaeinia, Payam; Aluru, Narayana R
2017-01-11
DNA origami nanostructures can be used to functionalize solid-state nanopores for single molecule studies. In this study, we characterized a nanopore in a DNA origami-graphene heterostructure for DNA detection. The DNA origami nanopore is functionalized with a specific nucleotide type at the edge of the pore. Using extensive molecular dynamics (MD) simulations, we computed and analyzed the ionic conductivity of nanopores in heterostructures carpeted with one or two layers of DNA origami on graphene. We demonstrate that a nanopore in DNA origami-graphene gives rise to distinguishable dwell times for the four DNA base types, whereas for a nanopore in bare graphene, the dwell time is almost the same for all types of bases. The specific interactions (hydrogen bonds) between DNA origami and the translocating DNA strand yield different residence times and ionic currents. We also conclude that the speed of DNA translocation decreases due to the friction between the dangling bases at the pore mouth and the sequencing DNA strands.
Wang, Zhaocai; Pu, Jun; Cao, Liling; Tan, Jian
2015-10-23
The unbalanced assignment problem (UAP) is the problem of optimally assigning n jobs to m individuals (m < n) such that minimum cost or maximum profit is obtained. It is a vitally important NP-complete (non-deterministic polynomial complete) problem in operations management and applied mathematics, with numerous real-life applications. In this paper, we present a new parallel DNA algorithm for solving the unbalanced assignment problem using DNA molecular operations. We design flexible-length DNA strands representing different jobs and individuals, take appropriate steps, and obtain the solutions of the UAP in the proper length range and in O(mn) time. We extend the application of DNA molecular operations and exploit their simultaneity to reduce the complexity of the computation.
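To make the problem statement concrete, here is a small brute-force sketch of the UAP on a conventional computer. It is not the DNA algorithm of the paper. It assumes one common formulation (every job goes to exactly one individual and every individual receives at least one job), and the cost matrix and function name are invented for illustration.

```python
from itertools import product

def solve_uap(cost):
    """Exhaustively solve a tiny unbalanced assignment problem.

    cost[i][j] is the cost of individual i doing job j, with m individuals
    and n jobs (m < n). Assumption: each job is done by exactly one
    individual and each individual gets at least one job.
    Returns (minimum cost, tuple mapping job index -> individual index).
    """
    m, n = len(cost), len(cost[0])
    best_cost, best_assign = None, None
    # Enumerate all m**n ways of giving each job to some individual.
    for assign in product(range(m), repeat=n):
        if len(set(assign)) < m:  # some individual would be idle
            continue
        total = sum(cost[assign[j]][j] for j in range(n))
        if best_cost is None or total < best_cost:
            best_cost, best_assign = total, assign
    return best_cost, best_assign

if __name__ == "__main__":
    toy_cost = [[4, 1, 3],   # individual 0
                [2, 5, 2]]   # individual 1
    print(solve_uap(toy_cost))
```

The enumeration grows as m**n, which is the exponential search space that the abstract's parallel DNA operations are meant to explore simultaneously.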
DNA Cryptography and Deep Learning using Genetic Algorithm with NW algorithm for Key Generation.
Kalsi, Shruti; Kaur, Harleen; Chang, Victor
2017-12-05
Cryptography is the science of applying complex mathematics and logic to design strong methods both to hide data, called encryption, and to retrieve the original data, called decryption. The purpose of cryptography is to transmit a message between a sender and receiver such that an eavesdropper is unable to comprehend it. To accomplish this, we need not only a strong algorithm but also a strong key and a strong concept for the encryption and decryption process. We introduce the concept of DNA Deep Learning Cryptography, defined as a technique of concealing data in terms of DNA sequences and deep learning. In this cryptographic technique, each letter of the alphabet is converted into a different combination of the four bases, namely Adenine (A), Cytosine (C), Guanine (G) and Thymine (T), which make up human deoxyribonucleic acid (DNA). Actual implementations with DNA do not go beyond the laboratory level and are expensive. To bring DNA computing to the digital level, easy and effective algorithms are proposed in this paper. In the proposed work we introduce, first, a method and its implementation for key generation based on the theory of natural selection, using a Genetic Algorithm with the Needleman-Wunsch (NW) algorithm, and second, a method for implementing encryption and decryption based on DNA computing, using the biological operations Transcription, Translation, DNA Sequencing and Deep Learning.
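The idea of turning each letter into a combination of the four bases can be sketched as a plain encoding step. The mapping below (two bits per base, 00→A, 01→C, 10→G, 11→T, four bases per byte) is an assumption chosen for illustration; the abstract does not specify the paper's table, and this sketch performs no encryption or key generation.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T

def text_to_dna(text):
    """Encode text as DNA bases, four bases per UTF-8 byte."""
    out = []
    for byte in text.encode("utf-8"):
        # Take the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_text(dna):
    """Invert text_to_dna; expects a length that is a multiple of 4."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA string length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")

if __name__ == "__main__":
    encoded = text_to_dna("Hi")
    print(encoded)               # CAGACGGC
    print(dna_to_text(encoded))  # Hi
```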
Programmable chemical controllers made from DNA.
Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg
2013-10-01
Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language' and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents.
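As an illustration of treating a chemical reaction network as a program, the sketch below integrates the mass-action equations of a three-reaction consensus network, X + Y → 2B, B + X → 2X, B + Y → 2Y. This approximate-majority scheme is one well-known consensus CRN and is assumed here as an example; the abstract does not name the specific network. The rate constant, time step, and initial concentrations are likewise invented.

```python
def approximate_majority(x, y, b=0.0, k=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of a consensus CRN under mass-action kinetics.

    Reactions (all with the same assumed rate constant k):
        X + Y -> 2B
        B + X -> 2X
        B + Y -> 2Y
    Returns the final concentrations (x, y, b).
    """
    for _ in range(steps):
        r_xy = k * x * y   # rate of X + Y -> 2B
        r_bx = k * b * x   # rate of B + X -> 2X
        r_by = k * b * y   # rate of B + Y -> 2Y
        x, y, b = (x + dt * (r_bx - r_xy),
                   y + dt * (r_by - r_xy),
                   b + dt * (2 * r_xy - r_bx - r_by))
    return x, y, b

if __name__ == "__main__":
    # X starts in the majority, so the network should settle on all-X.
    print(approximate_majority(0.6, 0.4))
```

The total concentration x + y + b is conserved by every reaction, and the initial majority species takes over the whole population.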
Programmable chemical controllers made from DNA
NASA Astrophysics Data System (ADS)
Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg
2013-10-01
Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language' and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents.
Programmable chemical controllers made from DNA
Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg
2014-01-01
Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language', and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents. PMID:24077029
Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan
2016-04-20
DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus it is necessary to develop computational methods that identify DNA-binding proteins based only on protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into a profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into the SVM to discriminate the DNA-binding proteins from the non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 in a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
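The step that turns a variable-length profile into a fixed-length vector can be sketched with a generic auto-cross covariance (ACC) transform. This is a textbook-style formulation under stated assumptions: the profile is a list of per-position score rows, covariances are averaged over L − lag positions, and the Kmer composition step and the SVM of iDNA-KACC are not reproduced. The function name and toy profile are hypothetical.

```python
def acc_features(profile, max_lag):
    """Auto-cross covariance transform of a sequence profile.

    profile: list of L rows, one per sequence position, each a list of D scores.
    Returns max_lag * D * D values, so the output length does not depend on L.
    Assumption: L > max_lag.
    """
    length, dims = len(profile), len(profile[0])
    if length <= max_lag:
        raise ValueError("profile must be longer than max_lag")
    means = [sum(row[d] for row in profile) / length for d in range(dims)]
    features = []
    for lag in range(1, max_lag + 1):
        for d1 in range(dims):
            for d2 in range(dims):
                # Covariance between column d1 at position i and column d2
                # at position i + lag (d1 == d2 gives the auto-covariance).
                total = sum(
                    (profile[i][d1] - means[d1]) * (profile[i + lag][d2] - means[d2])
                    for i in range(length - lag)
                )
                features.append(total / (length - lag))
    return features

if __name__ == "__main__":
    toy_profile = [[1, 0], [0, 1], [1, 0], [0, 1]]
    print(acc_features(toy_profile, max_lag=1))
```

Because the feature count depends only on the lag range and the number of profile columns, sequences of different lengths map to vectors of the same size, which is what a classifier such as an SVM requires.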
DNA strand displacement system running logic programs.
Rodríguez-Patón, Alfonso; Sainz de Murieta, Iñaki; Sosík, Petr
2014-01-01
The paper presents a DNA-based computing model that is enzyme-free and autonomous, requiring no human intervention during the computation. The model is able to perform iterated resolution steps with logical formulae in conjunctive normal form. The implementation is based on the technique of DNA strand displacement, with each clause encoded in a separate DNA molecule. Propositions are encoded by assigning a strand to each proposition p and its complementary strand to the proposition ¬p; clauses are encoded by combining different propositions in the same strand. The model can run logic programs composed of Horn clauses by cascading resolution steps. The potential of the model is also demonstrated by its theoretical capability of solving SAT. The resulting SAT algorithm has a linear time complexity in the number of resolution steps, whereas its spatial complexity is exponential in the number of variables of the formula. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
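The logical operation that the strands carry out, cascaded resolution over clauses in conjunctive normal form, can be sketched in software as follows. This is a conventional resolution-refutation loop, not a model of the strand displacement chemistry; the string encoding of literals ("~p" for ¬p) and the function names are assumptions for the example.

```python
def neg(literal):
    """Negate a literal written as 'p' or '~p'."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for literal in c1:
        if neg(literal) in c2:
            resolvents.append(frozenset((c1 - {literal}) | (c2 - {neg(literal)})))
    return resolvents

def entails(clauses, goal):
    """Resolution refutation: add the negated goal and search for the empty clause."""
    known = {frozenset(c) for c in clauses} | {frozenset([neg(goal)])}
    while True:
        new = set()
        for a in known:
            for b in known:
                for r in resolve(a, b):
                    if not r:          # empty clause: contradiction found
                        return True
                    if r not in known:
                        new.add(r)
        if not new:                    # saturated without contradiction
            return False
        known |= new

if __name__ == "__main__":
    # Horn program: p.   q :- p.   r :- q.
    program = [{"p"}, {"~p", "q"}, {"~q", "r"}]
    print(entails(program, "r"))  # True
    print(entails(program, "s"))  # False
```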
A DNA-based molecular motor that can navigate a network of tracks
NASA Astrophysics Data System (ADS)
Wickham, Shelley F. J.; Bath, Jonathan; Katsuda, Yousuke; Endo, Masayuki; Hidaka, Kumi; Sugiyama, Hiroshi; Turberfield, Andrew J.
2012-03-01
Synthetic molecular motors can be fuelled by the hydrolysis or hybridization of DNA. Such motors can move autonomously and programmably, and long-range transport has been observed on linear tracks. It has also been shown that DNA systems can compute. Here, we report a synthetic DNA-based system that integrates long-range transport and information processing. We show that the path of a motor through a network of tracks containing four possible routes can be programmed using instructions that are added externally or carried by the motor itself. When external control is used we find that 87% of the motors follow the correct path, and when internal control is used 71% of the motors follow the correct path. Programmable motion will allow the development of computing networks, molecular systems that can sort and process cargoes according to instructions that they carry, and assembly lines that can be reconfigured dynamically in response to changing demands.
Antibody-controlled actuation of DNA-based molecular circuits.
Engelen, Wouter; Meijer, Lenny H H; Somers, Bram; de Greef, Tom F A; Merkx, Maarten
2017-02-17
DNA-based molecular circuits allow autonomous signal processing, but their actuation has relied mostly on RNA/DNA-based inputs, limiting their application in synthetic biology, biomedicine and molecular diagnostics. Here we introduce a generic method to translate the presence of an antibody into a unique DNA strand, enabling the use of antibodies as specific inputs for DNA-based molecular computing. Our approach, antibody-templated strand exchange (ATSE), uses the characteristic bivalent architecture of antibodies to promote DNA-strand exchange reactions both thermodynamically and kinetically. Detailed characterization of the ATSE reaction allowed the establishment of a comprehensive model that describes the kinetics and thermodynamics of ATSE as a function of toehold length, antibody-epitope affinity and concentration. ATSE enables the introduction of complex signal processing in antibody-based diagnostics, as demonstrated here by constructing molecular circuits for multiplex antibody detection, integration of multiple antibody inputs using logic gates and actuation of enzymes and DNAzymes for signal amplification.
Antibody-controlled actuation of DNA-based molecular circuits
NASA Astrophysics Data System (ADS)
Engelen, Wouter; Meijer, Lenny H. H.; Somers, Bram; de Greef, Tom F. A.; Merkx, Maarten
2017-02-01
DNA-based molecular circuits allow autonomous signal processing, but their actuation has relied mostly on RNA/DNA-based inputs, limiting their application in synthetic biology, biomedicine and molecular diagnostics. Here we introduce a generic method to translate the presence of an antibody into a unique DNA strand, enabling the use of antibodies as specific inputs for DNA-based molecular computing. Our approach, antibody-templated strand exchange (ATSE), uses the characteristic bivalent architecture of antibodies to promote DNA-strand exchange reactions both thermodynamically and kinetically. Detailed characterization of the ATSE reaction allowed the establishment of a comprehensive model that describes the kinetics and thermodynamics of ATSE as a function of toehold length, antibody-epitope affinity and concentration. ATSE enables the introduction of complex signal processing in antibody-based diagnostics, as demonstrated here by constructing molecular circuits for multiplex antibody detection, integration of multiple antibody inputs using logic gates and actuation of enzymes and DNAzymes for signal amplification.
The 'Biologically-Inspired Computing' Column
NASA Technical Reports Server (NTRS)
Hinchey, Mike
2006-01-01
The field of Biology changed dramatically in 1953, with the determination by Francis Crick and James Dewey Watson of the double helix structure of DNA. This discovery changed Biology for ever, allowing the sequencing of the human genome, and the emergence of a "new Biology" focused on DNA, genes, proteins, data, and search. Computational Biology and Bioinformatics heavily rely on computing to facilitate research into life and development. Simultaneously, an understanding of the biology of living organisms indicates a parallel with computing systems: molecules in living cells interact, grow, and transform according to the "program" dictated by DNA. Moreover, paradigms of Computing are emerging based on modelling and developing computer-based systems exploiting ideas that are observed in nature. This includes building into computer systems self-management and self-governance mechanisms that are inspired by the human body's autonomic nervous system, modelling evolutionary systems analogous to colonies of ants or other insects, and developing highly-efficient and highly-complex distributed systems from large numbers of (often quite simple) largely homogeneous components to reflect the behaviour of flocks of birds, swarms of bees, herds of animals, or schools of fish. This new field of "Biologically-Inspired Computing", often known in other incarnations by other names, such as: Autonomic Computing, Pervasive Computing, Organic Computing, Biomimetics, and Artificial Life, amongst others, is poised at the intersection of Computer Science, Engineering, Mathematics, and the Life Sciences. Successes have been reported in the fields of drug discovery, data communications, computer animation, control and command, exploration systems for space, undersea, and harsh environments, to name but a few, and augur much promise for future progress.
Wang, Zhaocai; Pu, Jun; Cao, Liling; Tan, Jian
2015-01-01
The unbalanced assignment problem (UAP) is the problem of optimally assigning n jobs to m individuals (m < n) such that minimum cost or maximum profit is obtained. It is a vitally important NP-complete (non-deterministic polynomial complete) problem in operations management and applied mathematics, with numerous real-life applications. In this paper, we present a new parallel DNA algorithm for solving the unbalanced assignment problem using DNA molecular operations. We design flexible-length DNA strands representing different jobs and individuals, take appropriate steps, and obtain the solutions of the UAP in the proper length range and in O(mn) time. We extend the application of DNA molecular operations and exploit their simultaneity to reduce the complexity of the computation. PMID:26512650
A spatially localized architecture for fast and modular DNA computing
NASA Astrophysics Data System (ADS)
Chatterjee, Gourab; Dalchau, Neil; Muscat, Richard A.; Phillips, Andrew; Seelig, Georg
2017-09-01
Cells use spatial constraints to control and accelerate the flow of information in enzyme cascades and signalling networks. Synthetic silicon-based circuitry similarly relies on spatial constraints to process information. Here, we show that spatial organization can be a similarly powerful design principle for overcoming limitations of speed and modularity in engineered molecular circuits. We create logic gates and signal transmission lines by spatially arranging reactive DNA hairpins on a DNA origami. Signal propagation is demonstrated across transmission lines of different lengths and orientations and logic gates are modularly combined into circuits that establish the universality of our approach. Because reactions preferentially occur between neighbours, identical DNA hairpins can be reused across circuits. Co-localization of circuit elements decreases computation time from hours to minutes compared to circuits with diffusible components. Detailed computational models enable predictive circuit design. We anticipate our approach will motivate using spatial constraints for future molecular control circuit designs.
Structural DNA Nanotechnology: State of the Art and Future Perspective
2015-01-01
Over the past three decades DNA has emerged as an exceptional molecular building block for nanoconstruction due to its predictable conformation and programmable intra- and intermolecular Watson–Crick base-pairing interactions. A variety of convenient design rules and reliable assembly methods have been developed to engineer DNA nanostructures of increasing complexity. The ability to create designer DNA architectures with accurate spatial control has allowed researchers to explore novel applications in many directions, such as directed material assembly, structural biology, biocatalysis, DNA computing, nanorobotics, disease diagnosis, and drug delivery. This Perspective discusses the state of the art in the field of structural DNA nanotechnology and presents some of the challenges and opportunities that exist in DNA-based molecular design and programming. PMID:25029570
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
Alexandrov, Ludmil B.; Rasmussen, Kim Ø.; Bishop, Alan R.; ...
2017-08-29
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded “flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with our extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alexandrov, Ludmil B.; Rasmussen, Kim Ø.; Bishop, Alan R.
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded “flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with our extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.
Testing the Use of Implicit Solvent in the Molecular Dynamics Modelling of DNA Flexibility
NASA Astrophysics Data System (ADS)
Mitchell, J.; Harris, S.
DNA flexibility controls packaging, looping and in some cases sequence specific protein binding. Molecular dynamics simulations carried out with a computationally efficient implicit solvent model are potentially a powerful tool for studying larger DNA molecules than can be currently simulated when water and counterions are represented explicitly. In this work we compare DNA flexibility at the base pair step level modelled using an implicit solvent model to that previously determined from explicit solvent simulations and database analysis. Although much of the sequence dependent behaviour is preserved in implicit solvent, the DNA is considerably more flexible when the approximate model is used. In addition we test the ability of the implicit solvent to model stress induced DNA disruptions by simulating a series of DNA minicircle topoisomers which vary in size and superhelical density. When compared with previously run explicit solvent simulations, we find that while the levels of DNA denaturation are similar using both computational methodologies, the specific structural form of the disruptions is different.
Cowell, Robert G
2018-05-04
Current models for single source and mixture samples, and probabilistic genotyping software based on them used for analysing STR electropherogram data, assume simple probability distributions, such as the gamma distribution, to model the allelic peak height variability given the initial amount of DNA prior to PCR amplification. Here we illustrate how amplicon number distributions, for a model of the process of sample DNA collection and PCR amplification, may be efficiently computed by evaluating probability generating functions using discrete Fourier transforms. Copyright © 2018 Elsevier B.V. All rights reserved.
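The numerical technique named in this abstract, recovering a count distribution by evaluating a probability generating function (PGF) with a discrete Fourier transform, can be sketched on a toy model. The two-stage model below (binomial sampling of molecules followed by one duplication cycle) and all parameter values are assumptions for illustration and are not the model of the paper.

```python
import cmath

def pmf_from_pgf(pgf, size):
    """Recover P(N = 0), ..., P(N = size - 1) from a PGF.

    The PGF is sampled at the size-th roots of unity and the samples are
    inverted with a DFT. Assumption: N never exceeds size - 1, otherwise
    the probabilities alias onto each other.
    """
    samples = [pgf(cmath.exp(2j * cmath.pi * k / size)) for k in range(size)]
    pmf = []
    for n in range(size):
        coeff = sum(samples[k] * cmath.exp(-2j * cmath.pi * k * n / size)
                    for k in range(size)) / size
        pmf.append(coeff.real)
    return pmf

def toy_amplicon_pgf(s, molecules=2, p_sample=0.5, p_copy=0.5):
    """PGF of a toy model: each of `molecules` templates is sampled with
    probability p_sample, then duplicated once with probability p_copy."""
    one_cycle = (1 - p_copy) * s + p_copy * s * s   # PGF of one amplification cycle
    return (1 - p_sample + p_sample * one_cycle) ** molecules

if __name__ == "__main__":
    # At most 2 templates * 2 copies = 4 amplicons, so 5 points suffice.
    print(pmf_from_pgf(toy_amplicon_pgf, 5))
```

Composing PGFs (here the cycle PGF inside the sampling PGF) is what makes the approach convenient: the composed function is cheap to evaluate at each root of unity even when the distribution itself has no simple closed form.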
Computational and experimental analysis of DNA shuffling
Maheshri, Narendra; Schaffer, David V.
2003-01-01
We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, ATGC base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation, and it provides a user-friendly environment for sequence analysis. It is freely available. Availability: http://www.cemb.edu.pk/sw.html Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
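For readers unfamiliar with the Nussinov recurrence named above, a minimal sketch of the textbook algorithm follows. It is not RDNAnalyzer's code (the tool is written in C# and extends the algorithm in ways the abstract does not describe). The function names, the Watson-Crick-only pairing table and the `min_loop` parameter are assumptions made for this illustration.

```python
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]

def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least `min_loop`
    unpaired bases enclosed by any pair (hairpin loop constraint)."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(nussinov_max_pairs("GGGAAATCC"))
```

The recurrence only maximises the pair count; recovering the actual pairings requires a traceback over the same table, which is omitted here.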
Alkylpurine glycosylase D employs DNA sculpting as a strategy to extrude and excise damaged bases.
Kossmann, Bradley; Ivanov, Ivaylo
2014-07-01
Alkylpurine glycosylase D (AlkD) exhibits a unique base excision strategy. Instead of interacting directly with the lesion, the enzyme engages the non-lesion DNA strand. AlkD induces flipping of the alkylated and opposing base accompanied by DNA stack compression. Since this strategy leaves the alkylated base solvent exposed, the means to achieve enzymatic cleavage had remained unclear. We determined a minimum energy path for flipping out a 3-methyl adenine by AlkD and computed a potential of mean force along this path to delineate the energetics of base extrusion. We show that AlkD acts as a scaffold to stabilize three distinct DNA conformations, including the final extruded state. These states are almost equivalent in free energy and separated by low barriers. Thus, AlkD acts by sculpting the global DNA conformation to achieve lesion expulsion from DNA. N-glycosidic bond scission is then facilitated by a backbone phosphate group proximal to the alkylated base.
Jana, Kalyanashis; Ganguly, Bishwajit
2014-10-16
DNA nucleobases are reactive in nature and undergo modifications by deamination, oxidation, alkylation, or hydrolysis processes. Many such modified bases are susceptible to mutagenesis when formed in cellular DNA. The mutagenesis can occur by mispairing with DNA nucleobases by a DNA polymerase during replication. We have performed a computational study of the mispairing of DNA bases with unnatural bases. 5-Halo uracils have been studied as mispairs in mutagenesis; however, reports on their different forms are scarce in the literature. The stability of mispairs with the keto form, enol form, and ionized form of 5-halo-uracil has been computed at the M06-2X/6-31+G** level of theory. The enol form of 5-halo-uracil showed remarkable stability toward DNA mispairing compared to the corresponding keto and ionized forms. The (F)U-G mispair showed the highest stability in the series, and the (Halo)U(enol/ionized)-G mispair interaction energies are more favorable than that of the natural G-C base pair of DNA. To enhance the stability of DNA mispairs, we have introduced a hydroxyl group in place of the halogen atoms, which provides additional hydrogen-bonding interactions in the system while forming a 5-membered ring. The study has been further extended with lithiated 5-hydroxymethyl-uracil to stabilize the DNA mispair. The (CH2OLi)U(ionized)-G mispair has shown the highest stability (ΔG = -32.4 kcal/mol) with multiple O-Li interactions. AIM (atoms in molecules) and EDA (energy decomposition analysis) calculations have been performed to examine the nature of noncovalent interactions in such mispairs. EDA has shown that electrostatic energy is the main contributor to the interaction energy of the mispairs. The higher stability achieved in these studied mispairs can play a pivotal role in mutagenesis and can help to attain the mutation for many desired biological processes.
Fast parallel molecular algorithms for DNA-based computation: factoring integers.
Chang, Weng-Long; Guo, Minyi; Ho, Michael Shan-Hui
2005-06-01
The RSA public-key cryptosystem is an algorithm that converts input data into an unrecognizable encrypted form and converts the encrypted data back into its original decrypted form. The security of the RSA public-key cryptosystem is based on the difficulty of factoring the product of two large prime numbers. This paper demonstrates how to factor the product of two large prime numbers using a molecular computer, which is a breakthrough in basic biological operations. In order to achieve this, we propose three DNA-based algorithms (a parallel subtractor, a parallel comparator, and parallel modular arithmetic) that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that public-key cryptosystems are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations.
DNA nanotechnology: a future perspective
2013-01-01
In addition to its genetic function, DNA is one of the most distinct and smart self-assembling nanomaterials. DNA nanotechnology exploits the predictable self-assembly of DNA oligonucleotides to design and assemble innovative and highly discrete nanostructures. Highly ordered DNA motifs are capable of providing an ultra-fine framework for the next generation of nanofabrications. The majority of these applications are based upon the complementarity of DNA base pairing: adenine with thymine, and guanine with cytosine. DNA provides an intelligent route for the creation of nanoarchitectures with programmable and predictable patterns. DNA strands twist along one helix for a number of bases before switching to the other helix by passing through a crossover junction. The association of two crossovers keeps the helices parallel and holds them tightly together, allowing the assembly of bigger structures. Because of the DNA molecule's unique and novel characteristics, it can easily be applied in a vast variety of multidisciplinary research areas like biomedicine, computer science, nano/optoelectronics, and bionanotechnology. PMID:23497147
Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick
2014-02-01
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.
Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y. F.; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie
2014-01-01
Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here. PMID:24284329
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can simultaneously handle 6 sequences of any type, up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as a graphic restriction map, a hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction site search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds, and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
Computer-aided design of nano-filter construction using DNA self-assembly
NASA Astrophysics Data System (ADS)
Mohammadzadegan, Reza; Mohabatkar, Hassan
2007-01-01
Computer-aided design plays a fundamental role in both top-down and bottom-up nano-system fabrication. This paper presents a bottom-up nano-filter patterning process based on DNA self-assembly. In this study we designed a new method to construct fully designed nano-filters with the pores between 5 nm and 9 nm in diameter. Our calculations illustrated that by constructing such a nano-filter we would be able to separate many molecules.
2012-09-30
computational tools provide the ability to display, browse, select, filter and summarize spatio-temporal relationships of these individual-based...her research assistant at Esri, Shaun Walbridge, and members of the Marine Mammal Institute (MMI), including Tomas Follet and Debbie Steel. This...Genomics Laboratory, MMI, OSU. As part of the geneGIS initiative, these SPLASH photo-identification records and the geneSPLASH DNA profiles
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2018-01-19
A main goal in DNA computing is to build DNA circuits to compute designated functions using a minimal number of DNA strands. Here, we propose a novel architecture to build compact DNA strand displacement circuits to compute a broad scope of functions in an analog fashion. A circuit by this architecture is composed of three autocatalytic amplifiers, and the amplifiers interact to perform computation. We show DNA circuits to compute functions sqrt(x), ln(x) and exp(x) for x in tunable ranges with simulation results. A key innovation in our architecture, inspired by Napier's use of logarithm transforms to compute square roots on a slide rule, is to make use of autocatalytic amplifiers to do logarithmic and exponential transforms in concentration and time. In particular, we convert from the input that is encoded by the initial concentration of the input DNA strand, to time, and then back again to the output encoded by the concentration of the output DNA strand at equilibrium. This combined use of strand-concentration and time encoding of computational values may have impact on other forms of molecular computation.
Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele
2016-03-15
Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to closing this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.
Kilina, Svetlana; Yarotski, Dzmitry A.; Talin, A. Alec; ...
2011-01-01
We present a combined approach that relies on computational simulations and scanning tunneling microscopy (STM) measurements to reveal morphological properties and stability criteria of carbon nanotube-DNA (CNT-DNA) constructs. Application of STM allows direct observation of very stable CNT-DNA hybrid structures with a well-defined DNA wrapping angle of 63.4° and a coiling period of 3.3 nm. Using force field simulations, we determine how the DNA-CNT binding energy depends on the sequence and binding geometry of a single-stranded DNA. This dependence allows us to quantitatively characterize the stability of a hybrid structure with an optimal π-stacking between DNA nucleotides and the tube surface and better interpret STM data. Our simulations clearly demonstrate the existence of a very stable DNA binding geometry for (6,5) CNT as evidenced by the presence of a well-defined minimum in the binding energy as a function of an angle between DNA strand and the nanotube chiral vector. This novel approach demonstrates the feasibility of CNT-DNA geometry studies with subnanometer resolution and paves the way towards complete characterization of the structural and electronic properties of drug-delivering systems based on DNA-CNT hybrids as a function of DNA sequence and a nanotube chirality.
Teixeira, Erico S; Uppulury, Karthik; Privett, Austin J; Stopera, Christopher; McLaurin, Patrick M; Morales, Jorge A
2018-05-06
Proton cancer therapy (PCT) utilizes high-energy proton projectiles to obliterate cancerous tumors with low damage to healthy tissues and without the side effects of X-ray therapy. The healing action of the protons results from their damage on cancerous cell DNA. Despite established clinical use, the chemical mechanisms of PCT reactions at the molecular level remain elusive. This situation prevents a rational design of PCT that can maximize its therapeutic power and minimize its side effects. The incomplete characterization of PCT reactions is partially due to the health risks associated with experimental/clinical techniques applied to human subjects. To overcome this situation, we are conducting time-dependent and non-adiabatic computer simulations of PCT reactions with the electron nuclear dynamics (END) method. Herein, we present a review of our previous and new END research on three fundamental types of PCT reactions: water radiolysis reactions, proton-induced DNA damage and electron-induced DNA damage. These studies are performed on the computational prototypes: proton + H₂O clusters, proton + DNA/RNA bases and + cytosine nucleotide, and electron + cytosine nucleotide + H₂O. These simulations provide chemical mechanisms and dynamical properties of the selected PCT reactions in comparison with available experimental and alternative computational results.
Energy barriers and rates of tautomeric transitions in DNA bases: ab initio quantum chemical study.
Basu, Soumalee; Majumdar, Rabi; Das, Gourab K; Bhattacharyya, Dhananjay
2005-12-01
Tautomeric transitions of DNA bases are proton transfer reactions, which are important in biology. These reactions are involved in spontaneous point mutations of the genetic material. In the present study, intrinsic reaction coordinates (IRC) analyses through ab initio quantum chemical calculations have been carried out for the individual DNA bases A, T, G, C and also A:T and G:C base pairs to estimate the kinetic and thermodynamic barriers using MP2/6-31G** method for tautomeric transitions. Relatively higher values of kinetic barriers (about 50-60 kcal/mol) have been observed for the single bases, indicating that tautomeric alterations of isolated single bases are quite unlikely. On the other hand, relatively lower values of the kinetic barriers (about 20-25 kcal/mol) for the DNA base pairs A:T and G:C clearly suggest that the tautomeric shifts are much more favorable in DNA base pairs than in isolated single bases. The unusual base pairing A':C, T':G, C':A or G':T in the daughter DNA molecule, resulting from a parent DNA molecule with tautomeric shifts, is found to be stable enough to result in a mutation. The transition rate constants for the single DNA bases in addition to the base pairs are also calculated by computing the free energy differences between the transition states and the reactants.
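To give a feel for how strongly a rate constant depends on the barrier heights quoted above, the sketch below uses the standard Eyring (transition-state theory) expression k = (k_B T / h) exp(-ΔG‡ / RT). The abstract only says that rates were obtained from free energy differences between transition states and reactants, so treating the quoted barriers as activation free energies, the choice of the Eyring form with a transmission coefficient of 1, and the temperature of 298.15 K are all assumptions of this illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 1.987204e-3      # gas constant, kcal/(mol*K)

def eyring_rate(delta_g_kcal, temperature=298.15):
    """First-order rate constant (1/s) for an activation free energy in kcal/mol."""
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-delta_g_kcal / (R * temperature))

if __name__ == "__main__":
    for barrier in (20.0, 25.0, 50.0, 60.0):
        print(f"{barrier:5.1f} kcal/mol -> {eyring_rate(barrier):.3e} 1/s")
```

Because the barrier enters through an exponential, the gap between roughly 20-25 kcal/mol for base pairs and 50-60 kcal/mol for isolated bases corresponds to many orders of magnitude in rate, which is consistent with the qualitative conclusion of the abstract.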
Matsumoto, Atsushi; Tobias, Irwin; Olson, Wilma K
2005-01-01
Fine structural and energetic details embedded in the DNA base sequence, such as intrinsic curvature, are important to the packaging and processing of the genetic material. Here we investigate the internal dynamics of a 200 bp closed circular molecule with natural curvature using a newly developed normal-mode treatment of DNA in terms of neighboring base-pair "step" parameters. The intrinsic curvature of the DNA is described by a 10 bp repeating pattern of bending distortions at successive base-pair steps. We vary the degree of intrinsic curvature and the superhelical stress on the molecule and consider the normal-mode fluctuations of both the circle and the stable figure-8 configuration under conditions where the energies of the two states are similar. To extract the properties due solely to curvature, we ignore other important features of the double helix, such as the extensibility of the chain, the anisotropy of local bending, and the coupling of step parameters. We compare the computed normal modes of the curved DNA model with the corresponding dynamical features of a covalently closed duplex of the same chain length constructed from naturally straight DNA and with the theoretically predicted dynamical properties of a naturally circular, inextensible elastic rod, i.e., an O-ring. The cyclic molecules with intrinsic curvature are found to be more deformable under superhelical stress than rings formed from naturally straight DNA. As superhelical stress is accumulated in the DNA, the frequency, i.e., energy, of the dominant bending mode decreases in value, and if the imposed stress is sufficiently large, a global configurational rearrangement of the circle to the figure-8 form takes place. We combine energy minimization with normal-mode calculations of the two states to decipher the configurational pathway between the two states. 
We also describe and make use of a general analytical treatment of the thermal fluctuations of an elastic rod to characterize the motions of the minicircle as a whole from knowledge of the full set of normal modes. The remarkable agreement between computed and theoretically predicted values of the average deviation and dispersion of the writhe of the circular configuration adds to the reliability in the computational approach. Application of the new formalism to the computed modes of the figure-8 provides insights into macromolecular motions which are beyond the scope of current theoretical treatments.
Abolfath, Ramin M; Biswas, P K; Rajnarayanam, R; Brabec, Thomas; Kodym, Reinhard; Papiez, Lech
2012-04-19
Understanding the damage of DNA bases from hydrogen abstraction by free OH radicals is of particular importance to understanding the indirect effect of ionizing radiation. Previous studies address the problem with truncated DNA bases as ab initio quantum simulations required to study such electronic-spin-dependent processes are computationally expensive. Here, for the first time, we employ a multiscale and hybrid quantum mechanical-molecular mechanical simulation to study the interaction of OH radicals with a guanine-deoxyribose-phosphate DNA molecular unit in the presence of water, where all of the water molecules and the deoxyribose-phosphate fragment are treated with the simplistic classical molecular mechanical scheme. Our result illustrates that the presence of water strongly alters the hydrogen-abstraction reaction as the hydrogen bonding of OH radicals with water restricts the relative orientation of the OH radicals with respect to the DNA base (here, guanine). This results in an angular anisotropy in the chemical pathway and a lower efficiency in the hydrogen-abstraction mechanisms than previously anticipated for identical systems in vacuum. The method can easily be extended to single- and double-stranded DNA without any appreciable computational cost as these molecular units can be treated in the classical subsystem, as has been demonstrated here. © 2012 American Chemical Society
A novel model for DNA sequence similarity analysis based on graph theory.
Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan
2011-01-01
Determination of sequence similarity is one of the major steps in computational phylogenetic studies. During evolutionary history, not only mutations of individual nucleotides but also subsequent rearrangements have occurred. It has been one of the major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis that capture information about these various mutation phenomena simultaneously. In this paper, unlike traditional methods that use, e.g., nucleotide frequency or geometric representations as the basis for constructing mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we set up a weighted directed graph. The adjacency matrix of the directed graph is used to induce a representative vector for the DNA sequence. This new approach measures similarity based on both the ordering and the frequency of nucleotides, so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology and are generally consistent with the results reported in earlier studies, which demonstrates the new method's efficiency. We also test the new method on a simulated data set, which shows that it performs better than the traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
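The general shape of such a descriptor can be sketched as follows. The abstract does not state how edge weights are defined, so the weighting used here is purely an assumption for illustration: every ordered pair of positions i < j contributes 1/(j - i)^alpha to the edge from base seq[i] to base seq[j]. The function names, the `alpha` parameter and the Euclidean distance are likewise hypothetical choices. Input is assumed to contain only the uppercase letters A, C, G and T.

```python
import math
from itertools import product

BASES = "ACGT"

def graph_vector(seq, alpha=2.0):
    """Flatten the 4x4 weighted adjacency matrix of the sequence graph
    into a 16-dimensional vector (rows and columns ordered A, C, G, T)."""
    weights = {pair: 0.0 for pair in product(BASES, repeat=2)}
    for i, a in enumerate(seq):
        for j in range(i + 1, len(seq)):
            # Nearby pairs weigh more than distant ones (assumed weighting).
            weights[(a, seq[j])] += 1.0 / (j - i) ** alpha
    return [weights[pair] for pair in product(BASES, repeat=2)]

def distance(u, v):
    """Euclidean distance between two representative vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

if __name__ == "__main__":
    a = graph_vector("ACGTACGT")
    b = graph_vector("ACGTTGCA")
    print(round(distance(a, b), 4))
```

A vector of this kind depends on both composition and order, so two sequences with identical base frequencies but different arrangements generally map to different points; the pairwise distances can then feed a standard tree-building method.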
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, ATGC base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation, and it provides a user-friendly environment for sequence analysis. It is freely available. Availability: http://www.cemb.edu.pk/sw.html Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
High-speed DNA-based rolling motors powered by RNase H
Yehl, Kevin; Mugler, Andrew; Vivek, Skanda; Liu, Yang; Zhang, Yun; Fan, Mengzhen; Weeks, Eric R.
2016-01-01
DNA-based machines that walk by converting chemical energy into controlled motion could be of use in applications such as next generation sensors, drug delivery platforms, and biological computing. Despite their exquisite programmability, DNA-based walkers are, however, challenging to work with due to their low fidelity and slow rates (~1 nm/min). Here, we report DNA-based machines that roll rather than walk, and consequently have a maximum speed and processivity that are three orders of magnitude greater than those of conventional DNA motors. The motors are made from DNA-coated spherical particles that hybridise to a surface modified with complementary RNA; motion is achieved through the addition of RNase H, which selectively hydrolyses hybridised RNA. Spherical motors move in a self-avoiding manner, whereas anisotropic particles, such as dimerised particles or rod-shaped particles, travel linearly without a track or external force. Finally, we demonstrate detection of single nucleotide polymorphism by measuring particle displacement using a smartphone camera. PMID:26619152
Dedkov, V S
2009-01-01
The specificity of the DNA methyltransferase M.Bsc4I was determined in a cell lysate of Bacillus schlegelii 4. For this purpose, we used the methylation sensitivity of restriction endonucleases together with modeling of methylation. The modeling consisted of editing DNA sequences by replacing methylated bases and their complementary bases. The substrate DNA treated with M.Bsc4I was also used to study the sensitivity of some restriction endonucleases to methylation. It was thus shown that M.Bsc4I methylates 5'-Cm4CNNNNNNNGG-3' and that overlapping dcm methylation blocks its activity. The proposed approach may prove sufficiently universal and simple for determining the specificity of DNA methyltransferases.
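The sequence-editing step described above can be illustrated with a small sketch that locates the recognition site in one strand and marks the affected positions. Several points are assumptions of this illustration and not details from the abstract: "Cm4C" is read as methylation of the second cytosine, the methylated cytosine is marked with a lowercase `c`, and the guanine opposite the methylated cytosine of the other strand (the site is its own reverse complement) is marked with a lowercase `g`. The function names are hypothetical.

```python
import re

# Zero-width lookahead so that overlapping sites are all reported.
SITE = re.compile(r"(?=CC[ACGT]{7}GG)")

def find_sites(seq):
    """Start positions (0-based) of every CCNNNNNNNGG site in the top strand."""
    return [m.start() for m in SITE.finditer(seq)]

def mark_methylation(seq):
    """Return the sequence with assumed methylation positions in lowercase."""
    marked = list(seq)
    for start in find_sites(seq):
        marked[start + 1] = "c"  # second C of the site (top strand m4C)
        marked[start + 9] = "g"  # G paired with the m4C of the bottom strand
    return "".join(marked)

if __name__ == "__main__":
    dna = "AACCATGCATGGGTT"
    print(find_sites(dna))
    print(mark_methylation(dna))
```

A restriction-site search that matches only uppercase bases can then be run on the edited sequence to predict which sites would be protected by methylation.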
Simulations Using Random-Generated DNA and RNA Sequences
ERIC Educational Resources Information Center
Bryce, C. F. A.
1977-01-01
Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
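A modern equivalent of the classroom exercise can be sketched in a few lines. The original program was written in BASIC and is not reproduced in the record, so everything below (function names, uniform base probabilities, the optional `seed` argument) is an assumption made for illustration.

```python
import random
from collections import Counter

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def random_sequence(length, alphabet="ACGT", seed=None):
    """Draw `length` bases uniformly at random; use alphabet='ACGU' for RNA."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))

def complement(seq):
    """Base-by-base DNA complement (same direction as the input)."""
    return seq.translate(DNA_COMPLEMENT)

def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in counts}

def codon_counts(seq):
    """Count non-overlapping triplets, reading from the first base."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

if __name__ == "__main__":
    dna = random_sequence(30, seed=1)
    print(dna)
    print(complement(dna))
    print(base_composition(dna))
    print(codon_counts(dna).most_common(3))
```

Fixing the seed makes a class exercise reproducible, while omitting it gives every student a different sequence to analyse.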
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing
Karimi, Ramin; Hajdu, Andras
2016-01-01
Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel with this effort, and in response to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly required. Like an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce, as parallel and distributed computing tools running on commodity hardware, are used in this pipeline. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of detected unique and common DNA signatures of the target database brings opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
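The core of k-mer based signature discovery can be sketched on a single machine as follows. This is not the HTSFinder or GkmerG code, and it uses plain Python sets where the pipeline uses Hadoop and MapReduce. The definitions adopted here are assumptions: a "common" signature is a k-mer found in every target sequence and in no nontarget sequence, and a "unique" signature is a k-mer found in exactly one target sequence and in no nontarget sequence. The function names are hypothetical.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Frequency of every overlapping k-mer in one sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def signatures(targets, nontargets, k):
    """Return (common, unique) k-mer sets under the assumed definitions."""
    if not targets:
        return set(), set()
    target_sets = [set(kmer_counts(g, k)) for g in targets]
    excluded = set()
    for g in nontargets:
        excluded |= set(kmer_counts(g, k))
    # Number of target sequences in which each k-mer occurs at least once.
    presence = Counter(kmer for s in target_sets for kmer in s)
    common = set.intersection(*target_sets) - excluded
    unique = {kmer for kmer, n in presence.items() if n == 1} - excluded
    return common, unique

if __name__ == "__main__":
    common, unique = signatures(["ACGTAC", "ACGTTT"], ["GTACCC"], k=3)
    print(sorted(common))
    print(sorted(unique))
```

In a MapReduce setting the counting step maps naturally onto emitting (k-mer, genome) pairs and reducing by k-mer, which is why the approach scales to whole genome databases.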
ERIC Educational Resources Information Center
Grenville-Briggs, Laura J.; Stansfield, Ian
2011-01-01
This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate…
Wang, Zhaocai; Huang, Dongmei; Meng, Huajun; Tang, Chengpei
2013-10-01
The minimum spanning tree (MST) problem is to find a minimum-weight connected subset of edges containing all the vertices of a given undirected graph. It is a vitally important NP-complete problem in graph theory and applied mathematics, having numerous real life applications. Moreover, in previous studies, DNA molecular operations were usually used to solve NP-complete head-to-tail path search problems, and rarely for NP-hard problems with multi-lateral path solutions, such as the minimum spanning tree problem. In this paper, we present a new fast DNA algorithm for solving the MST problem using DNA molecular operations. For an undirected graph with n vertices and m edges, we design flexible-length DNA strands representing the vertices and edges, take appropriate steps, and obtain the solutions of the MST problem in a proper length range and with O(3m+n) time complexity. We extend the application of DNA molecular operations and simultaneously simplify the complexity of the computation. Results of computer simulation experiments show that the proposed method updates some of the best known values in a very short time and that it provides better solution accuracy than existing algorithms. Copyright © 2013 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.
1986-01-01
The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Gu, Jiande; Wang, Jing; Leszczynski, Jerzy
2014-01-30
A computational chemistry approach was applied to explore the nature of electron attachment to cytosine-rich DNA single strands. An oligomer dinucleoside phosphate deoxycytidylyl-3',5'-deoxycytidine (dCpdC) was selected as a model system for investigations by density functional theory. Electron distribution patterns for the radical anions of dCpdC in aqueous solution were explored. The excess electron may reside on the nucleobase at the 5' position (dC(•-)pdC) or at the 3' position (dCpdC(•-)). From comparison with electron attachment to the cytosine-related DNA fragments, the electron affinity for the formation of the cytosine-centered radical anion in DNA is estimated to be around 2.2 eV. Electron attachment to cytosine sites in DNA single strands might cause perturbations of local structural characteristics. Visible absorption spectroscopy may be applied to validate computational results and determine experimentally the existence of the base-centered radical anion. The time-dependent DFT study shows absorption around 550-600 nm for the cytosine-centered radical anions of DNA oligomers. This indicates that if such species are detected experimentally they would be characterized by a distinctive color.
Implementation of Arithmetic and Nonarithmetic Functions on a Label-free and DNA-based Platform
NASA Astrophysics Data System (ADS)
Wang, Kun; He, Mengqi; Wang, Jin; He, Ronghuan; Wang, Jianhua
2016-10-01
A series of complex logic gates were constructed based on graphene oxide and DNA-templated silver nanoclusters to perform both arithmetic and nonarithmetic functions. For the purpose of satisfying the requirements of progressive computational complexity and cost-effectiveness, a label-free and universal platform was developed by integration of various functions, including half adder, half subtractor, multiplexer and demultiplexer. The label-free system avoided laborious modification of biomolecules. The designed DNA-based logic gates can be implemented with readout of near-infrared fluorescence, and exhibit great potential applications in the field of bioimaging as well as disease diagnosis.
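For readers unfamiliar with the four functions named in this abstract, their Boolean behaviour can be written down directly. The sketch below models only the truth tables, not the graphene oxide / silver nanocluster chemistry. The 2-to-1 multiplexer and 1-to-2 demultiplexer sizes are an assumption, since the abstract does not state them.

```python
def half_adder(a, b):
    """Return (sum, carry) for one-bit inputs: sum = XOR, carry = AND."""
    return a ^ b, a & b

def half_subtractor(a, b):
    """Return (difference, borrow) for a - b: difference = XOR, borrow = (NOT a) AND b."""
    return a ^ b, (1 - a) & b

def mux2(d0, d1, sel):
    """2-to-1 multiplexer (assumed size): route d0 or d1 to the single output."""
    return d1 if sel else d0

def demux2(d, sel):
    """1-to-2 demultiplexer (assumed size): route d to output 0 or output 1."""
    return (0, d) if sel else (d, 0)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b), half_subtractor(a, b))
```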
Self-Assembling Molecular Logic Gates Based on DNA Crossover Tiles.
Campbell, Eleanor A; Peterson, Evan; Kolpashchikov, Dmitry M
2017-07-05
DNA-based computational hardware has attracted ever-growing attention due to its potential to be useful in the analysis of complex mixtures of biological markers. Here we report the design of self-assembling logic gates that recognize DNA inputs and assemble into crossover tiles when the output signal is high; the crossover structures disassemble to form separate DNA strands when the output is low. The output signal can be conveniently detected by fluorescence using a molecular beacon probe as a reporter. AND, NOT, and OR logic gates were designed. We demonstrate that the gates can connect to each other to produce other logic functions. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Enayatifar, Rasul; Sadaei, Hossein Javedani; Abdullah, Abdul Hanan; Lee, Malrey; Isnin, Ismail Fauzi
2015-08-01
Many studies have been conducted on securing digital images in order to protect such data while they are sent over the internet. This work proposes a new approach based on a hybrid model of the Tinkerbell chaotic map, deoxyribonucleic acid (DNA) and cellular automata (CA). DNA rules, the DNA sequence XOR operator and CA rules are used simultaneously to encrypt the plain-image pixels. To determine the rule number for the DNA sequence and for the CA, a two-dimensional Tinkerbell chaotic map is employed. Experimental results and computer simulations both confirm that the proposed scheme not only demonstrates outstanding encryption, but also resists various typical attacks.
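Two of the ingredients named above, DNA-coded XOR and a Tinkerbell keystream, can be sketched as follows. This is not the authors' scheme: it omits the cellular automaton, fixes a single DNA coding rule (A=00, C=01, G=10, T=11) instead of selecting rules chaotically, and derives key bytes from the map in an ad hoc way. The map parameters and starting point are the commonly quoted textbook values, and all function names are hypothetical.

```python
BASES = "ACGT"  # assumed coding rule: A=00, C=01, G=10, T=11

def byte_to_dna(value):
    """Encode one byte as four bases, most significant bit pair first."""
    return "".join(BASES[(value >> shift) & 3] for shift in (6, 4, 2, 0))

def dna_to_byte(strand):
    """Inverse of byte_to_dna."""
    value = 0
    for base in strand:
        value = (value << 2) | BASES.index(base)
    return value

def dna_xor(s1, s2):
    """Base-wise XOR: XOR the 2-bit codes and map the result back to a base."""
    return "".join(BASES[BASES.index(a) ^ BASES.index(b)] for a, b in zip(s1, s2))

def tinkerbell_keystream(n, x=-0.72, y=-0.64, a=0.9, b=-0.6013, c=2.0, d=0.5):
    """Iterate the Tinkerbell map and quantise x into key bytes (ad hoc choice)."""
    out = []
    for _ in range(n):
        x, y = x * x - y * y + a * x + b * y, 2 * x * y + c * x + d * y
        out.append(int(abs(x) * 1e6) % 256)
    return out

def crypt(pixels):
    """XOR each pixel with the keystream in the DNA domain; applying it twice restores the input."""
    key = tinkerbell_keystream(len(pixels))
    return [dna_to_byte(dna_xor(byte_to_dna(p), byte_to_dna(k)))
            for p, k in zip(pixels, key)]

if __name__ == "__main__":
    plain = [0, 17, 255, 128]
    cipher = crypt(plain)
    print(cipher, crypt(cipher) == plain)
```

Because XOR is its own inverse, the same function both encrypts and decrypts as long as the keystream is regenerated from the same initial conditions.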
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid
Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.
2016-01-01
High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617
Modeling photoionization of aqueous DNA and its components.
Pluhařová, Eva; Slavíček, Petr; Jungwirth, Pavel
2015-05-19
Radiation damage to DNA is usually considered in terms of UVA and UVB radiation. These ultraviolet rays, which are part of the solar spectrum, can indeed cause chemical lesions in DNA, triggered by photoexcitation particularly in the UVB range. Damage can, however, also be caused by higher energy radiation, which can directly ionize the DNA or its immediate surroundings, leading to indirect damage. Thanks to absorption in the atmosphere, the intensity of such ionizing radiation is negligible in the solar spectrum at the surface of Earth. Nevertheless, such an ionizing scenario can become dangerously plausible for astronauts or flight personnel, as well as for persons present at nuclear power plant accidents. On the beneficial side, ionizing radiation is employed as a means for destroying the DNA of cancer cells during radiation therapy. Quantitative information about ionization of DNA and its components is important not only for DNA radiation damage, but also for understanding redox properties of DNA in redox sensing or labeling, as well as charge migration along the double helix in nanoelectronics applications. Until recently, the vast majority of experimental and computational data on DNA ionization was pertinent to its components in the gas phase, which is far from its native aqueous environment. The situation has, however, changed for the better due to the advent of photoelectron spectroscopy in liquid microjets and its most recent application to photoionization of aqueous nucleosides, nucleotides, and larger DNA fragments. Here, we present a consistent and efficient computational methodology, which allows accurate evaluation of ionization energies and modeling of photoelectron spectra of aqueous DNA and its individual components.
After careful benchmarking, the method based on density functional theory and its time-dependent variant with properly chosen hybrid functionals and polarizable continuum solvent model provides ionization energies with accuracy of 0.2-0.3 eV, allowing for faithful modeling and interpretation of DNA photoionization. The key finding is that the aqueous medium is remarkably efficient in screening the interactions within DNA such that, unlike in the gas phase, ionization of a base, nucleoside, or nucleotide depends only very weakly on the particular DNA context. An exception is the electronic interaction between neighboring bases which can lead to sequence-specific effects, such as a partial delocalization of the cationic hole upon ionization enabled by presence of adjacent bases of the same type.
In Silico Design and Characterization of DNA Nanomaterials
NASA Astrophysics Data System (ADS)
Nash, Jessica A.
Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) function biologically as carriers of genetic information. However, due to their ability to self-assemble via base pairing, nucleic acid molecules have become widely used in nanotechnology. In this dissertation, in silico techniques are used to probe the structure-property relationships of nucleic acid based nanomaterials. In Part I, computational methods are employed to formulate nanoparticle design rules for applications in nucleic acid packaging and delivery. Nanoparticles (NPs) play increasingly important roles in nanomedicine, where the surface chemistry allows for control over interactions with biomolecules. Understanding how DNA and RNA compaction occurs is relevant to biological systems and systems in nanotechnology, and critical for the development of more efficient and effective nanoparticle carriers. Computational modeling allows for the description of bio-nano systems and processes with unprecedented detail, and can provide insights and guidelines for the creation of new nanomaterials. Using all-atom molecular dynamics simulations, the effects of nanoparticle surface chemistry, size, and solvent ionic strength on interactions with DNA and RNA are reported. In Chapter 2, a systematic study of the effect of nanoparticle charge on the ability to bend and wrap short sequences of DNA and RNA is presented. To cause bending of DNA, a nanoparticle charge of at least +30 is required. Higher nanoparticle charges cause a greater degree of compaction. For RNA, however, charged ligand end-groups bind internally and prevent RNA bending. Using these results, nanoparticles were designed to test the influence of NP ligand shell shape and length on RNA binding. In Chapter 3, all-atom simulations of NPs with long double stranded RNA are reported. Simulations show that by shortening NP ligand length, double stranded RNA can be wrapped. In Chapter 4, we consider compaction of long DNA by nanoparticles.
NPs with +120 charge can fully compact DNA, but the wrapping is unordered on the surface. Chapter 5 reports the influence of NPs on the structure of single stranded DNA and RNA, showing that NPs have a greater influence on poly-pyrimidine strands than poly-purine strands, and can interrupt hydrogen bonds and pi-pi stacking. In Part II of this dissertation, computational techniques are applied to study DNA tiles and origami. Due to base-pairing DNA can be used to place objects with nanoscale precision, with applications in nanoscience and nanomedicine. Chapter 6 presents the development of anticoagulants using DNA weave tiles and aptamers. More effective anticoagulants can be created by varying the DNA aptamer used, and increasing local concentration by attaching aptamers to a DNA tile. Molecular dynamics simulations show that increasing the number of helices on a DNA weave tile increases tile flexibility. Chapter 7 introduces a tool developed for visualization of DNA origami design. We develop circle map visualizations for DNA origami and maps of the base composition, allowing for visualizations of DNA origami that were not previously available. This tool is currently available online via nanohub (open source) for users around the world. The results reported here provide a fundamental understanding of the behavior of DNA systems in nanotechnology. Results are expected to aid in the development of more effective NP compaction agents, DNA delivery vehicles, and DNA origami design.
ERIC Educational Resources Information Center
Langheinrich, Jessica; Bogner, Franz X.
2015-01-01
As non-scientific conceptions interfere with learning processes, teachers need both to know about them and to address them in their classrooms. For our study, based on 182 eleventh graders, we analyzed the level of conceptual understanding by implementing the "draw and write" technique during a computer-supported gene technology module.…
The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module
Yim, Aldrin Kay-Yuen; Yu, Allen Chi-Shing; Li, Jing-Woei; Wong, Ada In-Chun; Loo, Jacky F. C.; Chan, King Ming; Kong, S. K.; Yip, Kevin Y.; Chan, Ting-Fung
2014-01-01
The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 is <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archive. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well-optimized for most sequencing platforms: short synthesized DNA fragments without homopolymers. Here, we suggest improvements to the error handling methodology that could enable the integration of DNA-based computational processes, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a picture of size 438 bytes was encoded to DNA with a low-density parity-check error-correction code. We salvaged a significant portion of sequencing reads with mutations generated during DNA synthesis and sequencing and successfully reconstructed the entire picture. A modular programming framework, DNAcodec, with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology. PMID:25414846
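The "without homopolymer" constraint mentioned above can be met with a rotating code in which each base-3 digit selects one of the three bases that differ from the previous base. The sketch below follows that general idea only. It is not DNAcodec and does not include the low-density parity-check layer; the fixed six-trit-per-byte packing and the function names are simplifying assumptions.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits hold one byte

def encode(data, prev="A"):
    """Encode bytes as DNA with no two identical adjacent bases."""
    strand = []
    for byte in data:
        trits = []
        for _ in range(TRITS_PER_BYTE):
            byte, t = divmod(byte, 3)
            trits.append(t)
        for t in reversed(trits):  # most significant trit first
            choices = [b for b in BASES if b != prev]
            prev = choices[t]  # always differs from the previous base
            strand.append(prev)
    return "".join(strand)

def decode(strand, prev="A"):
    """Inverse of encode; the starting 'prev' base must match."""
    out = bytearray()
    for i in range(0, len(strand), TRITS_PER_BYTE):
        value = 0
        for base in strand[i:i + TRITS_PER_BYTE]:
            choices = [b for b in BASES if b != prev]
            value = value * 3 + choices.index(base)
            prev = base
        out.append(value)
    return bytes(out)

if __name__ == "__main__":
    dna = encode(b"DNA")
    print(dna, decode(dna))
```

A real pipeline would add redundancy (for example an error-correcting code) before this step so that substitution and deletion errors from synthesis and sequencing can be repaired.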
Li, Kenli; Zou, Shuting; Xv, Jin
2008-01-01
Elliptic curve cryptographic algorithms convert input data to unrecognizable encryption and the unrecognizable data back again into its original decrypted form. The security of this form of encryption hinges on the enormous difficulty that is required to solve the elliptic curve discrete logarithm problem (ECDLP), especially over GF(2^n), n ∈ Z+. This paper describes an effective method to find solutions to the ECDLP by means of a molecular computer. We propose that this research accomplishment would represent a breakthrough for applied biological computation and this paper demonstrates that in principle this is possible. Three DNA-based algorithms: a parallel adder, a parallel multiplier, and a parallel inverse over GF(2^n) are described. The biological operation time of all of these algorithms is polynomial with respect to n. Considering this analysis, cryptography using a public key might be less secure. In this respect, a principal contribution of this paper is to provide enhanced evidence of the potential of molecular computing to tackle such ambitious computations. PMID:18431451
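The three field operations named in the abstract (addition, multiplication, and inversion over GF(2^n)) have simple conventional software counterparts, sketched below for orientation. This is ordinary bit-level arithmetic, not the DNA-based parallel algorithms of the paper. The default field GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) is chosen here only because its test values are widely published.

```python
def gf_add(a, b):
    """Addition in GF(2^n) is coefficient-wise XOR."""
    return a ^ b

def gf_mul(a, b, poly=0x11B, n=8):
    """Shift-and-add multiplication, reducing modulo the field polynomial.

    Assumes a and b are already reduced (less than 2**n).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:      # degree reached n: subtract (XOR) the polynomial
            a ^= poly
    return result

def gf_inv(a, poly=0x11B, n=8):
    """Multiplicative inverse via a^(2^n - 2), by square-and-multiply."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^n)")
    result, base, exp = 1, a, (1 << n) - 2
    while exp:
        if exp & 1:
            result = gf_mul(result, base, poly, n)
        base = gf_mul(base, base, poly, n)
        exp >>= 1
    return result

if __name__ == "__main__":
    print(hex(gf_mul(0x57, 0x83)), hex(gf_inv(0x53)))
```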
Fast algorithms for computing phylogenetic divergence time.
Crosby, Ralph W; Williams, Tiffani L
2017-12-06
The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.
Holes influence the mutation spectrum of human mitochondrial DNA
NASA Astrophysics Data System (ADS)
Villagran, Martha; Miller, John
Mutations drive evolution and disease, showing highly non-random patterns of variant frequency vs. nucleotide position. We use computational DNA hole spectroscopy [M.Y. Suarez-Villagran & J.H. Miller, Sci. Rep. 5, 13571 (2015)] to reveal sites of enhanced hole probability in selected regions of human mitochondrial DNA. A hole is a mobile site of positive charge created when an electron is removed, for example by radiation or contact with a mutagenic agent. The hole spectra are quantum mechanically computed using a two-stranded tight binding model of DNA. We observe significant correlation between spectra of hole probabilities and of genetic variation frequencies from the MITOMAP database. These results suggest that hole-enhanced mutation mechanisms exert a substantial, perhaps dominant, influence on mutation patterns in DNA. One example is where a trapped hole induces a hydrogen bond shift, known as tautomerization, which then triggers a base-pair mismatch during replication. Our results deepen overall understanding of sequence specific mutation rates, encompassing both hotspots and cold spots, which drive molecular evolution.
Measurement of inelastic cross sections for low-energy electron scattering from DNA bases.
Michaud, Marc; Bazin, Marc; Sanche, Léon
2012-01-01
The aim was to determine experimentally the absolute cross sections (CS) for depositing various amounts of energy into DNA bases by low-energy electron (LEE) impact. Electron energy loss (EEL) spectra of DNA bases were recorded for different LEE impact energies on the molecules deposited at very low coverage on an inert argon (Ar) substrate. Following their normalisation to the effective incident electron current and molecular surface number density, the EEL spectra were then fitted with multiple Gaussian functions in order to delimit the various excitation energy regions. The CS to excite a molecule into its various excitation modes were finally obtained from computing the area under the corresponding Gaussians. The EEL spectra and absolute CS for the electronic excitations of pyrimidine and the DNA bases thymine, adenine, and cytosine by electron impacts below 18 eV were reported for the molecules deposited at about monolayer coverage on a solid Ar substrate. The CS for electronic excitations of DNA bases by LEE impact were found to lie within the 10^-16 to 10^-18 cm^2 range. The large value of the total ionisation CS indicated that ionisation of DNA bases by LEE is an important dissipative process via which ionising radiation degrades and is absorbed in DNA.
Measurement of inelastic cross sections for low-energy electron scattering from DNA bases
Michaud, Marc; Bazin, Marc; Sanche, Léon
2013-01-01
Purpose: To determine experimentally the absolute cross sections (CS) for depositing various amounts of energy into DNA bases by low-energy electron (LEE) impact. Materials and methods: Electron energy loss (EEL) spectra of DNA bases are recorded for different LEE impact energies on the molecules deposited at very low coverage on an inert argon (Ar) substrate. Following their normalisation to the effective incident electron current and molecular surface number density, the EEL spectra are then fitted with multiple Gaussian functions in order to delimit the various excitation energy regions. The CS to excite a molecule into its various excitation modes are finally obtained from computing the area under the corresponding Gaussians. Results: The EEL spectra and absolute CS for the electronic excitations of pyrimidine and the DNA bases thymine, adenine, and cytosine by electron impacts below 18 eV are reported for the molecules deposited at about monolayer coverage on a solid Ar substrate. Conclusions: The CS for electronic excitations of DNA bases by LEE impact are found to lie within the 10^-16 to 10^-18 cm^2 range. The large value of the total ionisation CS indicates that ionisation of DNA bases by LEE is an important dissipative process via which ionising radiation degrades and is absorbed in DNA. PMID:21615242
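The step "computing the area under the corresponding Gaussians" has a closed form: a Gaussian with amplitude A and standard deviation σ has area A·σ·√(2π). The sketch below checks that identity numerically. It does not reproduce the authors' fitting or normalisation procedure, and the function names and sample values are hypothetical.

```python
import math

def gaussian(e, amplitude, center, sigma):
    """Gaussian band evaluated at energy loss e."""
    return amplitude * math.exp(-((e - center) ** 2) / (2 * sigma ** 2))

def gaussian_area(amplitude, sigma):
    """Closed-form area under the band: A * sigma * sqrt(2*pi)."""
    return amplitude * abs(sigma) * math.sqrt(2 * math.pi)

def numeric_area(amplitude, center, sigma, steps=20000):
    """Trapezoidal integration over +/- 10 sigma, as an independent check."""
    lo, hi = center - 10 * sigma, center + 10 * sigma
    h = (hi - lo) / steps
    total = 0.5 * (gaussian(lo, amplitude, center, sigma)
                   + gaussian(hi, amplitude, center, sigma))
    total += sum(gaussian(lo + i * h, amplitude, center, sigma)
                 for i in range(1, steps))
    return total * h

if __name__ == "__main__":
    # Illustrative band: amplitude 2.0 (arbitrary units), centred at 5.0 eV, sigma 0.5 eV
    print(gaussian_area(2.0, 0.5), numeric_area(2.0, 5.0, 0.5))
```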
Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat
2017-01-01
Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
Yang, Changwon; Kim, Eunae; Pak, Youngshang
2015-01-01
Hoogsteen (HG) base pairing plays a central role in the DNA binding of proteins and small ligands. Probing the detailed transition mechanism from Watson–Crick (WC) to HG base pair (bp) formation in duplex DNAs is of fundamental importance in terms of revealing intrinsic functions of double helical DNAs beyond their sequence-determined functions. We investigated a free energy landscape of a free B-DNA with an adenosine–thymine (A–T) rich sequence to probe its conformational transition pathways from WC to HG base pairing. The free energy landscape was computed with a state-of-the-art two-dimensional umbrella molecular dynamics simulation at the all-atom level. The present simulation showed that in an isolated duplex DNA, the spontaneous transition from WC to HG bp takes place via multiple pathways. Notably, base flipping into the major and minor grooves was found to play an important role in forming these multiple transition pathways. This finding suggests that naked B-DNA under normal conditions has an inherent ability to form HG bps via spontaneous base opening events. PMID:26250116
Logic integration of mRNA signals by an RNAi-based molecular computer.
Xie, Zhen; Liu, Siyuan John; Bleris, Leonidas; Benenson, Yaakov
2010-05-01
Synthetic in vivo molecular 'computers' could rewire biological processes by establishing programmable, non-native pathways between molecular signals and biological responses. Multiple molecular computer prototypes have been shown to work in simple buffered solutions. Many of those prototypes were made of DNA strands and performed computations using cycles of annealing-digestion or strand displacement. We have previously introduced RNA interference (RNAi)-based computing as a way of implementing complex molecular logic in vivo. Because it also relies on nucleic acids for its operation, RNAi computing could benefit from the tools developed for DNA systems. However, these tools must be harnessed to produce bioactive components and be adapted for harsh operating environments that reflect in vivo conditions. In a step toward this goal, we report the construction and implementation of biosensors that 'transduce' mRNA levels into bioactive, small interfering RNA molecules via RNA strand exchange in a cell-free Drosophila embryo lysate, a step beyond simple buffered environments. We further integrate the sensors with our RNAi 'computational' module to evaluate two-input logic functions on mRNA concentrations. Our results show how RNA strand exchange can expand the utility of RNAi computing and point toward the possibility of using strand exchange in a native biological setting.
Twisting short dsDNA with applied tension
NASA Astrophysics Data System (ADS)
Zoli, Marco
2018-02-01
The twisting deformation of mechanically stretched DNA molecules is studied by a coarse grained Hamiltonian model incorporating the fundamental interactions that stabilize the double helix and accounting for the radial and angular base pair fluctuations. The latter are all the more important at short length scales in which DNA fragments maintain an intrinsic flexibility. The presented computational method simulates a broad ensemble of possible molecule conformations characterized by a specific average twist and determines the energetically most convenient helical twist by free energy minimization. As this is done for any external load, the method yields the characteristic twist-stretch profile of the molecule and also computes the changes in the macroscopic helix parameters i.e. average diameter and rise distance. It is predicted that short molecules under stretching should first over-twist and then untwist by increasing the external load. Moreover, applying a constant load and simulating a torsional strain which over-twists the helix, it is found that the average helix diameter shrinks while the molecule elongates, in agreement with the experimental trend observed in kilo-base long sequences. The quantitative relation between percent relative elongation and superhelical density at fixed load is derived. The proposed theoretical model and computational method offer a general approach to characterize specific DNA fragments and predict their macroscopic elastic response as a function of the effective potential parameters of the mesoscopic Hamiltonian.
DNA-based construction at the nanoscale: emerging trends and applications
NASA Astrophysics Data System (ADS)
Lourdu Xavier, P.; Chandrasekaran, Arun Richard
2018-02-01
The field of structural DNA nanotechnology has evolved remarkably—from the creation of artificial immobile junctions to the recent DNA-protein hybrid nanoscale shapes—in a span of about 35 years. It is now possible to create complex DNA-based nanoscale shapes and large hierarchical assemblies with greater stability and predictability, thanks to the development of computational tools and advances in experimental techniques. Although it started with the original goal of DNA-assisted structure determination of difficult-to-crystallize molecules, DNA nanotechnology has found its applications in a myriad of fields. In this review, we cover some of the basic and emerging assembly principles: hybridization, base stacking/shape complementarity, and protein-mediated formation of nanoscale structures. We also review various applications of DNA nanostructures, with special emphasis on some of the biophysical applications that have been reported in recent years. In the outlook, we discuss further improvements in the assembly of such structures, and explore possible future applications involving super-resolved fluorescence, single-particle cryo-electron (cryo-EM) and x-ray free electron laser (XFEL) nanoscopic imaging techniques, and in creating new synergistic designer materials.
RNA nanotechnology for computer design and in vivo computation
Qiu, Meikang; Khisamutdinov, Emil; Zhao, Zhengyi; Pan, Cheryl; Choi, Jeong-Woo; Leontis, Neocles B.; Guo, Peixuan
2013-01-01
Molecular-scale computing has been explored since 1989 owing to the foreseeable limitation of Moore's law for silicon-based computation devices. With the potential of massive parallelism, low energy consumption and capability of working in vivo, molecular-scale computing promises a new computational paradigm. Inspired by the concepts from the electronic computer, DNA computing has realized basic Boolean functions and has progressed into multi-layered circuits. Recently, RNA nanotechnology has emerged as an alternative approach. Owing to the newly discovered thermodynamic stability of a special RNA motif (Shu et al. 2011 Nat. Nanotechnol. 6, 658–667 (doi:10.1038/nnano.2011.105)), RNA nanoparticles are emerging as another promising medium for nanodevice and nanomedicine as well as molecular-scale computing. Like DNA, RNA sequences can be designed to form desired secondary structures in a straightforward manner, but RNA is structurally more versatile and more thermodynamically stable owing to its non-canonical base-pairing, tertiary interactions and base-stacking property. A 90-nucleotide RNA can exhibit 4⁹⁰ nanostructures, and its loops and tertiary architecture can serve as a mounting dovetail that eliminates the need for external linking dowels. Its enzymatic and fluorogenic activity creates diversity in computational design. Varieties of small RNA can work cooperatively, synergistically or antagonistically to carry out computational logic circuits. The riboswitch and enzymatic ribozyme activities and its special in vivo attributes offer a great potential for in vivo computation. Unique features in transcription, termination, self-assembly, self-processing and acid resistance enable in vivo production of RNA nanoparticles that harbour various regulators for intracellular manipulation. With all these advantages, RNA computation is promising, but it is still in its infancy. Many challenges still exist. Collaborations between RNA nanotechnologists and computer scientists are necessary to advance this nascent technology. PMID:24000362
End-to-end distance and contour length distribution functions of DNA helices
NASA Astrophysics Data System (ADS)
Zoli, Marco
2018-06-01
I present a computational method to evaluate the end-to-end and the contour length distribution functions of short DNA molecules described by a mesoscopic Hamiltonian. The method generates a large statistical ensemble of possible configurations for each dimer in the sequence, selects the global equilibrium twist conformation for the molecule, and determines the average base pair distances along the molecule backbone. Integrating over the base pair radial and angular fluctuations, I derive the room temperature distribution functions as a function of the sequence length. The obtained values for the most probable end-to-end distance and contour length distance, providing a measure of the global molecule size, are used to examine the DNA flexibility at short length scales. It is found that, also in molecules with less than ~60 base pairs, coiled configurations maintain a large statistical weight and, consistently, the persistence lengths may be much smaller than in kilo-base DNA.
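The link between persistence length and global molecule size mentioned above can be illustrated with the textbook worm-like chain formula. This is a swapped-in, much simpler model than the mesoscopic Hamiltonian of the abstract, and the rise per base pair (0.34 nm), the 50 nm persistence length, and the 10 nm "soft" value are illustrative assumptions, not results from the paper.

```python
import math

# Assumed illustrative constants (not taken from the abstract).
RISE_PER_BP_NM = 0.34            # approximate B-DNA rise per base pair
KILOBASE_PERSISTENCE_NM = 50.0   # value commonly quoted for long DNA


def wlc_mean_square_distance(contour_nm, persistence_nm):
    """Worm-like chain result <R^2> = 2PL[1 - (P/L)(1 - exp(-L/P))]."""
    ratio = persistence_nm / contour_nm
    bracket = 1.0 - ratio * (1.0 - math.exp(-1.0 / ratio))
    return 2.0 * persistence_nm * contour_nm * bracket


def rms_distance_for_bp(n_bp, persistence_nm):
    """Root-mean-square end-to-end distance of an n_bp fragment."""
    contour = (n_bp - 1) * RISE_PER_BP_NM  # n_bp base pairs span n_bp - 1 steps
    return math.sqrt(wlc_mean_square_distance(contour, persistence_nm))


# Same 60 bp fragment with a stiff and a hypothetical soft persistence length.
stiff = rms_distance_for_bp(60, KILOBASE_PERSISTENCE_NM)
soft = rms_distance_for_bp(60, 10.0)
```

In this picture a smaller persistence length shortens the typical end-to-end distance of a fragment of fixed contour length, which is the qualitative signature of coiled configurations carrying more statistical weight.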
Model Checking Temporal Logic Formulas Using Sticker Automata
Feng, Changwei; Wu, Huanmei
2017-01-01
As an important complex problem, the temporal logic model checking problem is still far from being fully resolved under the circumstance of DNA computing, especially Computation Tree Logic (CTL), Interval Temporal Logic (ITL), and Projection Temporal Logic (PTL), because there is still a lack of approaches for DNA model checking. To address this challenge, a model checking method is proposed for checking the basic formulas in the above three temporal logic types with DNA molecules. First, one-type single-stranded DNA molecules are employed to encode the Finite State Automaton (FSA) model of the given basic formula so that a sticker automaton is obtained. On the other hand, other single-stranded DNA molecules are employed to encode the given system model so that the input strings of the sticker automaton are obtained. Next, a series of biochemical reactions are conducted between the above two types of single-stranded DNA molecules. It can then be decided whether the system satisfies the formula or not. As a result, we have developed a DNA-based approach for checking all the basic formulas of CTL, ITL, and PTL. The simulated results demonstrate the effectiveness of the new method. PMID:29119114
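The abstract's scheme, an automaton encoding the formula and strings encoding the system, has a direct in-silico analogue. The sketch below simulates a deterministic finite state automaton on candidate paths; it is only a software illustration, not the biochemical protocol, and the example property ("p holds in every state"), the state names, and the helper `accepts` are hypothetical.

```python
def accepts(transitions, start, accepting, word):
    """Run a deterministic FSA over `word`; reject on a missing transition."""
    state = start
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in accepting


# Hypothetical automaton for "p holds in every state of the path".
# Symbol "p" means the proposition holds in that state, "q" means it does not.
ALWAYS_P = {
    ("ok", "p"): "ok",
    ("ok", "q"): "bad",
    ("bad", "p"): "bad",
    ("bad", "q"): "bad",
}

# Each string stands for one path through an assumed system model.
paths = ["ppp", "pqp"]
results = {path: accepts(ALWAYS_P, "ok", {"ok"}, path) for path in paths}
```

The system satisfies the property only if every path it can produce is accepted, which mirrors the decision step described in the abstract.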
Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia
2016-01-01
Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is not yet an efficient and easy-to-use toolkit that focuses exclusively on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which can be categorized into three classes: (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept intermediate metadata, such as allele frequencies, rather than raw DNA sequences or genotyping results. PopSc is first implemented as a web-based calculator with a user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we provide a Python library and an R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages for population genetics analysis. PMID:27792763
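To make the idea of computing statistics directly from allele frequencies concrete, here is a minimal sketch of two standard quantities: gene diversity (expected heterozygosity) and a simple F_ST. The function names are hypothetical and this is not the PopSc API; the estimators are the uncorrected textbook forms with equal population weights, which is an assumption.

```python
def gene_diversity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def fst(pop_freqs):
    """F_ST = (H_T - H_S) / H_T from per-population allele frequencies.

    pop_freqs holds one frequency list per population, same allele order.
    Populations are weighted equally (an assumption of this sketch).
    """
    n_pops = len(pop_freqs)
    h_s = sum(gene_diversity(f) for f in pop_freqs) / n_pops
    pooled = [sum(column) / n_pops for column in zip(*pop_freqs)]
    h_t = gene_diversity(pooled)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Two populations fixed for different alleles versus two identical ones.
fully_differentiated = fst([[1.0, 0.0], [0.0, 1.0]])
undifferentiated = fst([[0.5, 0.5], [0.5, 0.5]])
```

Working from frequencies rather than raw sequences keeps the calculation to a few lines, which is the convenience the abstract emphasizes.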
Petkevičiūtė, D; Pasi, M; Gonzalez, O; Maddocks, J H
2014-11-10
cgDNA is a package for the prediction of sequence-dependent configuration-space free energies for B-form DNA at the coarse-grain level of rigid bases. For a fragment of any given length and sequence, cgDNA calculates the configuration of the associated free energy minimizer, i.e. the relative positions and orientations of each base, along with a stiffness matrix, which together govern differences in free energies. The model predicts non-local (i.e. beyond base-pair step) sequence dependence of the free energy minimizer. Configurations can be input or output in either the Curves+ definition of the usual helical DNA structural variables, or as a PDB file of coordinates of base atoms. We illustrate the cgDNA package by comparing predictions of free energy minimizers from (a) the cgDNA model, (b) time-averaged atomistic molecular dynamics (or MD) simulations, and (c) NMR or X-ray experimental observation, for (i) the Dickerson-Drew dodecamer and (ii) three oligomers containing A-tracts. The cgDNA predictions are rather close to those of the MD simulations, but many orders of magnitude faster to compute. Both the cgDNA and MD predictions are in reasonable agreement with the available experimental data. Our conclusion is that cgDNA can serve as a highly efficient tool for studying structural variations in B-form DNA over a wide range of sequences. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Significance of DNA bond strength in programmable nanoparticle thermodynamics and dynamics.
Yu, Qiuyan; Hu, Jinglei; Hu, Yi; Wang, Rong
2018-04-04
Assembly of nanoparticles (NPs) coated with complementary DNA strands leads to novel crystals with nanosized basic units rather than classic atoms, ions or molecules. The assembly process is mediated by hybridization of DNA via specific base pairing interaction, and is kinetically linked to the dissociation of DNA duplexes. DNA-level physicochemical quantities, both thermodynamic and kinetic, are key to understanding this process and essential for the design of DNA-NP crystals. The melting transition properties are helpful to judge the thermostability and sensitivity of relative DNA probes or other applications. Three different cases are investigated by changing the linker length and the spacer length on which the melting properties depend using the molecular dynamics method. Melting temperature is determined by sigmoidal melting curves based on hybridization percentage versus temperature and the Lindemann melting rule simultaneously. We provide a computational strategy based on a coarse-grained model to estimate the hybridization enthalpy, entropy and free energy from percentages of hybridizations which are readily accessible in experiments. Importantly, the lifetime of DNA bond dehybridization based on temperature and the activation energy depending on DNA bond strength are also calculated. The simulation results are in good agreement with the theoretical analysis and the present experimental data. Our study provides a good strategy to predict the melting temperature which is important for the DNA-directed nanoparticle system, and bridges the dynamics and thermodynamics of DNA-directed nanoparticle systems by estimating the equilibrium constant from the hybridization of DNA bonds quantitatively.
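One common way to turn hybridization percentages into enthalpy and entropy is a van't Hoff analysis; the sketch below shows that route under a simple two-state assumption with an effective equilibrium constant K = f / (1 - f). This may differ from the strategy used in the paper, and the synthetic enthalpy and entropy values are made up purely to exercise the fit.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def vant_hoff_fit(temps_k, fractions):
    """Fit ln K = -dH/(R T) + dS/R by ordinary least squares.

    Assumes a two-state transition with K = f / (1 - f), where f is the
    hybridized fraction. Returns (dH in kJ/mol, dS in kJ/(mol K), Tm in K).
    """
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(f / (1.0 - f)) for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    d_h = -R * slope
    d_s = R * intercept
    return d_h, d_s, d_h / d_s  # Tm is where K = 1, i.e. f = 0.5


# Synthetic melting curve generated from assumed (hypothetical) parameters.
TRUE_DH, TRUE_DS = -200.0, -0.6
temps = [320.0, 325.0, 330.0, 335.0, 340.0, 345.0]
fractions = []
for t in temps:
    k_eq = math.exp(-TRUE_DH / (R * t) + TRUE_DS / R)
    fractions.append(k_eq / (1.0 + k_eq))

d_h, d_s, t_m = vant_hoff_fit(temps, fractions)
```

The free energy at any temperature then follows from dG = dH - T dS, and the equilibrium constant from K = exp(-dG / (R T)).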
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Cloud-based adaptive exon prediction for DNA analysis.
Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen
2018-02-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
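The three-base periodicity mentioned above can be measured with a fixed Fourier component at frequency 1/3, computed over base indicator sequences. The sketch below shows that classical, non-adaptive measure; it is not the normalised LMS predictors developed in the study, and the window length, step, and normalisation by window size are assumptions.

```python
import cmath


def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four base indicators."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        # DFT coefficient of the 0/1 indicator sequence of `base` at k = n/3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * pos / 3)
            for pos, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total / n


def sliding_period3(seq, window=90, step=3):
    """Return (start, power) pairs for windows sliding along `seq`."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a featureless stretch followed by a strongly periodic one.
toy = "A" * 90 + "ATG" * 30
profile = sliding_period3(toy)
```

Windows with high power are candidate coding regions; an adaptive predictor replaces the fixed Fourier component with filter weights that are updated along the sequence.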
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András
2003-01-01
A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device, the transcript representatives were separated by size and fractions were automatically collected every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure were validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.
A simple method for the computation of first neighbour frequencies of DNAs from CD spectra
Marck, Christian; Guschlbauer, Wilhelm
1978-01-01
A procedure for the computation of the first neighbour frequencies of DNAs is presented. This procedure is based on the first neighbour approximation of Gray and Tinoco. We show that the knowledge of all the ten elementary CD signals attached to the ten double stranded first neighbour configurations is not necessary. One can obtain the ten frequencies of an unknown DNA with the use of eight elementary CD signals corresponding to eight linearly independent polymer sequences. These signals can be extracted very simply from any eight or more CD spectra of double stranded DNAs of known frequencies. The ten frequencies of a DNA are obtained by a least-squares fit of its CD spectrum with these elementary signals. One advantage of this procedure is that it does not necessitate linear programming; it can be used with CD data digitized at a large number of wavelengths, thus permitting an accurate resolution of the CD spectra. In favorable cases, the ten frequencies of a DNA (not used as input data) can be determined with an average absolute error < 2%. We have also observed that certain satellite DNAs, those of Drosophila virilis and Callinectes sapidus, have CD spectra compatible with those of DNAs of quasi random sequence; these satellite DNAs should also adopt the B-form in solution. PMID:673843
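The central numerical step, fitting a measured spectrum as a linear combination of elementary signals by least squares, can be sketched as follows. The toy example uses three made-up basis signals over five wavelengths instead of the eight elementary CD signals, and it stops at the fitted weights; the mapping from weights to the ten first neighbour frequencies is not shown. All names and numbers are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def least_squares_weights(basis, spectrum):
    """Minimise ||sum_j w_j * basis[j] - spectrum||^2 via the normal equations."""
    size = len(basis)
    gram = [
        [sum(p * q for p, q in zip(basis[i], basis[j])) for j in range(size)]
        for i in range(size)
    ]
    proj = [sum(p * s for p, s in zip(basis[i], spectrum)) for i in range(size)]
    return solve_linear(gram, proj)


# Toy data: three linearly independent "elementary signals" at five wavelengths.
basis = [
    [1.0, 0.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 0.0, 1.0, -1.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
]
true_weights = [0.5, 0.3, 0.2]
spectrum = [sum(w * b[i] for w, b in zip(true_weights, basis)) for i in range(5)]
weights = least_squares_weights(basis, spectrum)
```

Because the fit is an ordinary linear solve, adding more wavelengths only lengthens the dot products, which is consistent with the abstract's remark that finely digitized spectra can be used.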
Kanazawa, Yuki; Ehara, Masahiro; Sommerfeld, Thomas
2016-03-10
Low-lying π* resonance states of DNA and RNA bases have been investigated by the recently developed projected complex absorbing potential (CAP)/symmetry-adapted cluster-configuration interaction (SAC-CI) method using a smooth Voronoi potential as CAP. In spite of the challenging CAP applications to higher resonance states of molecules of this size, the present calculations reproduce resonance positions observed by electron transmission spectra (ETS) provided the anticipated deviations due to vibronic effects and limited basis sets are taken into account. Moreover, for the standard nucleobases, the calculated positions and widths qualitatively agree with those obtained in previous electron scattering calculations. For guanine, both keto and enol forms were examined, and the calculated values of the keto form agree clearly better with the experimental findings. In addition to these standard bases, three modified forms of cytosine, which serve as epigenetic or biomarkers, were investigated: formylcytosine, methylcytosine, and chlorocytosine. Last, a strong correlation between the computed positions and the observed ETS values is demonstrated, clearly suggesting that the present computational protocol should be useful for predicting the π* resonances of congeners of DNA and RNA bases.
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Effective Design of Multifunctional Peptides by Combining Compatible Functions
Diener, Christian; Garza Ramos Martínez, Georgina; Moreno Blas, Daniel; Castillo González, David A.; Corzo, Gerardo; Castro-Obregon, Susana; Del Rio, Gabriel
2016-01-01
Multifunctionality is a common trait of many natural proteins and peptides, yet the rules to generate such multifunctionality remain unclear. We propose that the rules defining some protein/peptide functions are compatible. To explore this hypothesis, we trained a computational method to predict cell-penetrating peptides at the sequence level and learned that antimicrobial peptides and DNA-binding proteins are compatible with the rules of our predictor. Based on this finding, we expected that designing peptides for CPP activity may render AMP and DNA-binding activities. To test this prediction, we designed peptides that embedded two independent functional domains (nuclear localization and yeast pheromone activity), linked by optimizing their composition to fit the rules characterizing cell-penetrating peptides. These peptides presented effective cell penetration, DNA-binding, pheromone and antimicrobial activities, thus confirming the effectiveness of our computational approach to design multifunctional peptides with potential therapeutic uses. Our computational implementation is available at http://bis.ifc.unam.mx/en/software/dcf. PMID:27096600
Del Medico, Luca; Christen, Heinz; Christen, Beat
2017-01-01
Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner. PMID:28531174
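The basic operation behind multi-level partitioning, cutting a long design into blocks whose ends overlap so they can be reassembled, can be sketched in a few lines. This is a deliberately naive illustration, not the Genome Partitioner algorithm: block sizes are uniform, the 40-base overlap is an assumed value, and no sequence features are checked.

```python
def partition(length, block_size, overlap):
    """Cover [0, length) with half-open (start, end) blocks.

    Consecutive blocks share `overlap` positions; the last block may be short.
    Assumes length > 0.
    """
    if not 0 <= overlap < block_size:
        raise ValueError("overlap must be non-negative and below block_size")
    blocks = []
    start = 0
    while True:
        end = min(start + block_size, length)
        blocks.append((start, end))
        if end == length:
            return blocks
        start = end - overlap


# Two levels, mirroring the sizes in the abstract: a 783 kb design cut into
# 20 kb segments, and one segment cut into 1 kb subblocks.
segments = partition(783_000, 20_000, 0)
subblocks = partition(20_000, 1_000, 40)  # 40-base overlap is an assumption
```

A production tool would additionally move block boundaries away from hard-to-synthesize regions and choose overlaps suited to the assembly chemistry, which is where the optimization described in the abstract comes in.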
Prykhozhij, Sergey V; Rajan, Vinothkumar; Berman, Jason N
2016-02-01
The development of clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 technology for mainstream biotechnological use based on its discovery as an adaptive immune mechanism in bacteria has dramatically improved the ability of molecular biologists to modify genomes of model organisms. The zebrafish is highly amenable to applications of CRISPR/Cas9 for mutation generation and a variety of DNA insertions. Cas9 protein in complex with a guide RNA molecule recognizes where to cut the homologous DNA based on a short stretch of DNA termed the protospacer-adjacent motif (PAM). Rapid and efficient identification of target sites immediately preceding PAM sites, quantification of genomic occurrences of similar (off target) sites and predictions of cutting efficiency are some of the features where computational tools play critical roles in CRISPR/Cas9 applications. Given the rapid advent and development of this technology, it can be a challenge for researchers to remain up to date with all of the important technological developments in this field. We have contributed to the armamentarium of CRISPR/Cas9 bioinformatics tools and trained other researchers in the use of appropriate computational programs to develop suitable experimental strategies. Here we provide an in-depth guide on how to use CRISPR/Cas9 and other relevant computational tools at each step of a host of genome editing experimental strategies. We also provide detailed conceptual outlines of the steps involved in the design and execution of CRISPR/Cas9-based experimental strategies, such as generation of frameshift mutations, larger chromosomal deletions and inversions, homology-independent insertion of gene cassettes and homology-based knock-in of defined point mutations and larger gene constructs.
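Identifying target sites immediately preceding a PAM is at heart a pattern search. The sketch below scans the forward strand for a 20-nucleotide protospacer followed by an NGG motif; NGG is the PAM of the commonly used Streptococcus pyogenes Cas9, which the abstract does not specify, so both the motif and the 20-nt length are assumptions. The function name is hypothetical and no off-target or efficiency scoring is attempted.

```python
import re


def find_target_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for every site on the forward strand.

    A zero-width lookahead lets overlapping candidate sites all be reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [
        (match.start(), match.group(1), match.group(2))
        for match in pattern.finditer(seq)
    ]


# Toy input with exactly one candidate site.
example = "A" * 20 + "TGG" + "CC"
sites = find_target_sites(example)
```

Running the same search on the reverse complement would cover the other strand; counting near-matches of each protospacer elsewhere in the genome is the starting point for the off-target quantification the abstract mentions.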
Monte Carlo approach in assessing damage in higher order structures of DNA
NASA Technical Reports Server (NTRS)
Chatterjee, A.; Schmidt, J. B.; Holley, W. R.
1994-01-01
We have developed a computer model of nuclear DNA in the form of chromatin fibre. The fibres are modeled as an ideal solenoid consisting of twenty helical turns with six nucleosomes per turn. The chromatin model, in combination with a Monte Carlo theory of radiation damage induced by charged particles, based on general features of track structure and stopping power theory, has been used to evaluate the influence of DNA structure on initial damage. An interesting result has emerged from our calculations. Our calculated results predict the existence of strong spatial correlations in damage sites associated with the symmetries in the solenoidal model. We have calculated spectra of short fragments of double stranded DNA produced by multiple double strand breaks induced by both high and low LET radiation. The spectra exhibit peaks at multiples of approximately 85 base pairs (the nucleosome periodicity), and approximately 1000 base pairs (solenoid periodicity). Preliminary experiments to investigate the fragment distributions from irradiated DNA, made by B. Rydberg at Lawrence Berkeley Laboratory, confirm the existence of short DNA fragments and are in substantial agreement with the predictions of our theory.
Ni-DNA-based nanowires and nanodevices
NASA Astrophysics Data System (ADS)
Chang, Chia-Ching; Yuan, Chiun-Jye; Jian, Wen-Bin; Chen, Yu-Chang; di Ventra, Massimiliano
DNA is a highly versatile biopolymer that has been a recent focus in the field of nanomachines and nanoelectronics. DNA exhibits high stability, adjustable conductance, self-organizing capability, programmability and vast information storage. It is an ideal material for applications in nanodevices, nanoelectronics, and molecular computing. The low conductance of native DNA renders applications difficult. However, doping with nickel ions tunes the DNA into a conducting polymer. Further studies showed that nickel-ion-containing DNA (Ni-DNA) nanowires exhibit the characteristics of a memristor and a memcapacitor, making them a potential mass information storage system. In summary, Ni-DNA has promising applications in a variety of fields, including nanoelectronics, biosensors and memcomputing. This study was supported in part by the Ministry of Science and Technology (MOST), Taiwan (ROC) MOST 103-2112-M-009-011 -MY3, and MOST 105-2627-M-009-006.
Theoretical determination of one-electron redox potentials for DNA bases, base pairs, and stacks.
Paukku, Y; Hill, G
2011-05-12
Electron affinities, ionization potentials, and redox potentials for DNA bases, base pairs, and N-methylated derivatives are computed at the DFT/M06-2X/6-31++G(d,p) level of theory. Redox properties of a guanine-guanine stack model are explored as well. Reduction and oxidation potentials are in good agreement with the experimental ones. Electron affinities of base pairs were found to be negative. Methylation of canonical bases affects the ionization potentials the most. Base pair formation and base stacking lower ionization potentials by 0.3 eV. Pairing of guanine with the 5-methylcytosine does not seem to influence the redox properties of this base pair much.
Yang, Changwon; Kim, Eunae; Pak, Youngshang
2015-09-18
Hoogsteen (HG) base pairing plays a central role in the DNA binding of proteins and small ligands. Probing the detailed transition mechanism from Watson-Crick (WC) to HG base pair (bp) formation in duplex DNAs is of fundamental importance for revealing intrinsic functions of double-helical DNAs beyond their sequence-determined functions. We investigated the free energy landscape of a free B-DNA with an adenine-thymine (A-T) rich sequence to probe its conformational transition pathways from WC to HG base pairing. The free energy landscape was computed with a state-of-the-art two-dimensional umbrella molecular dynamics simulation at the all-atom level. The present simulation showed that in an isolated duplex DNA, the spontaneous transition from WC to HG bp takes place via multiple pathways. Notably, base flipping into the major and minor grooves was found to play an important role in forming these multiple transition pathways. This finding suggests that naked B-DNA under normal conditions has an inherent ability to form HG bps via spontaneous base opening events. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sub-Terahertz Spectroscopy of E. coli DNA: Experiment, Statistical Model, and MD Simulations
NASA Astrophysics Data System (ADS)
Sizov, I.; Dorofeeva, T.; Khromova, T.; Gelmont, B.; Globus, T.
2012-06-01
We present results of a combined experimental and computational study of sub-THz absorption spectra from Escherichia coli (E. coli) DNA. Measurements were conducted using a Bruker FTIR spectrometer with a liquid-helium-cooled bolometer and a recently developed frequency-domain sensor operating at room temperature, with spectral resolutions of 0.25 cm-1 and 0.03 cm-1, respectively. We have earlier demonstrated that molecular dynamics (MD) simulation can be effectively applied to characterize relatively small biological molecules, such as transfer RNA or the small protein thioredoxin from E. coli, and can help to understand and predict their absorption spectra. The large size of DNA macromolecules (about 5 million base pairs for E. coli DNA) prevents, however, direct application of MD simulation at the current level of computational capabilities. Therefore, by applying a second-order Markov chain approach and a Monte Carlo technique, we have developed a new statistical model to construct DNA sequences from biological cells. These short representative sequences (20-60 base pairs) are built upon the most frequently repeated fragments (2-10 base pairs) in the original DNA. Using this new approach, we constructed DNA sequences for several strains of E. coli, including the well-known strain BL21, the uropathogenic strain CFT073, and the deadly EDL933 strain (O157:H7), and used MD simulations to calculate vibrational absorption spectra of these strains. Significant differences between strains are clearly present both in the averaged spectra and in all components for particular orientations. The mechanism of interaction of THz radiation with a biological molecule is studied by analyzing the dynamics of atoms and the correlation of local vibrations in the modeled molecule. Simulated THz vibrational spectra of DNA are compared with experimental results.
With a spectral resolution of 0.1 cm-1 or better, which is now available in experiments, different strains of the same bacterium can be discriminated easily.
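Illustrative note (not part of the abstract): the idea of building a short representative sequence from a second-order Markov chain with Monte Carlo sampling can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' statistical model: the function names, the toy input string, and the restart rule are hypothetical, and the frequent-fragment weighting described in the abstract is reduced to plain trinucleotide counts.

```python
import random
from collections import Counter, defaultdict


def fit_second_order(genome):
    """Count which base follows each pair of preceding bases."""
    counts = defaultdict(Counter)
    for i in range(len(genome) - 2):
        counts[genome[i:i + 2]][genome[i + 2]] += 1
    return counts


def sample_sequence(counts, length, rng):
    """Draw a short sequence by Monte Carlo sampling of the fitted chain."""
    contexts = list(counts)
    weights = [sum(counts[c].values()) for c in contexts]
    # Start from a dinucleotide chosen in proportion to its frequency.
    seq = rng.choices(contexts, weights=weights)[0]
    while len(seq) < length:
        followers = counts.get(seq[-2:])
        if not followers:
            # Dead-end context (assumed rule): restart from a frequent dinucleotide.
            seq += rng.choices(contexts, weights=weights)[0]
            continue
        bases = list(followers)
        seq += rng.choices(bases, weights=[followers[b] for b in bases])[0]
    return seq[:length]


# Placeholder input, NOT E. coli data.
toy_genome = "ATGCGCGATATCGCGCTAGCTAGGCGCGATATATCGCG" * 5
model = fit_second_order(toy_genome)
short = sample_sequence(model, 40, random.Random(0))
print(short)
```

Every trinucleotide in the sampled sequence occurs in the input, which is the sense in which such a sequence is "representative"; a real application would fit the chain to a full genome.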
Molecular robots with sensors and intelligence.
Hagiya, Masami; Konagaya, Akihiko; Kobayashi, Satoshi; Saito, Hirohide; Murata, Satoshi
2014-06-17
CONSPECTUS: What we can call a molecular robot is a set of molecular devices such as sensors, logic gates, and actuators integrated into a consistent system. The molecular robot is supposed to react autonomously to its environment by receiving molecular signals and making decisions by molecular computation. Building such a system has long been a dream of scientists; however, despite extensive efforts, systems having all three functions (sensing, computation, and actuation) have not been realized yet. This Account introduces an ongoing research project that focuses on the development of molecular robotics funded by MEXT (Ministry of Education, Culture, Sports, Science and Technology, Japan). This 5 year project started in July 2012 and is titled "Development of Molecular Robots Equipped with Sensors and Intelligence". The major issues in the field of molecular robotics all correspond to a feedback (i.e., plan-do-see) cycle of a robotic system. More specifically, these issues are (1) developing molecular sensors capable of handling a wide array of signals, (2) developing amplification methods of signals to drive molecular computing devices, (3) accelerating molecular computing, (4) developing actuators that are controllable by molecular computers, and (5) providing bodies of molecular robots encapsulating the above molecular devices, which implement the conformational changes and locomotion of the robots. In this Account, the latest contributions to the project are reported. There are four research teams in the project that specialize on sensing, intelligence, amoeba-like actuation, and slime-like actuation, respectively. The molecular sensor team is focusing on the development of molecular sensors that can handle a variety of signals. This team is also investigating methods to amplify signals from the molecular sensors. 
The molecular intelligence team is developing molecular computers and is currently focusing on a new photochemical technology for accelerating DNA-based computations. They also introduce novel computational models behind various kinds of molecular computers necessary for designing such computers. The amoeba robot team aims at constructing amoeba-like robots. The team is trying to incorporate motor proteins, including kinesin and microtubules (MTs), for use as actuators implemented in a liposomal compartment as a robot body. They are also developing a methodology to link DNA-based computation and molecular motor control. The slime robot team focuses on the development of slime-like robots. The team is evaluating various gels, including DNA gel and BZ gel, for use as actuators, as well as the body material to disperse various molecular devices in it. They also try to control the gel actuators by DNA signals coming from molecular computers.
NASA Astrophysics Data System (ADS)
Mielke, Steven P.; Grønbech-Jensen, Niels; Krishnan, V. V.; Fink, William H.; Benham, Craig J.
2005-09-01
The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.
Connecting localized DNA strand displacement reactions
NASA Astrophysics Data System (ADS)
Mullor Ruiz, Ismael; Arbona, Jean-Michel; Lad, Amitkumar; Mendoza, Oscar; Aimé, Jean-Pierre; Elezgaray, Juan
2015-07-01
Logic circuits based on DNA strand displacement reactions have been shown to be versatile enough to compute the square root of four-bit numbers. The implementation of these circuits as a set of bulk reactions faces difficulties which include leaky reactions and intrinsically slow, diffusion-limited reaction rates. In this paper, we consider simple examples of these circuits when they are attached to platforms (DNA origamis). As expected, constraining distances between DNA strands leads to faster reaction rates. However, it also induces side-effects that are not detectable in the solution-phase version of this circuitry. Appropriate design of the system, including protection and asymmetry between input and fuel strands, leads to a reproducible behaviour, at least one order of magnitude faster than the one observed under bulk conditions. Electronic supplementary information (ESI) available. See DOI: 10.1039/C5NR02434J
Computational Identification of Diverse Mechanisms Underlying Transcription Factor-DNA Occupancy
Cheng, Qiong; Kazemian, Majid; Pham, Hannah; Blatti, Charles; Celniker, Susan E.; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh
2013-01-01
ChIP-based genome-wide assays of transcription factor (TF) occupancy have emerged as a powerful, high-throughput method to understand transcriptional regulation, especially on a global scale. This has led to great interest in the underlying biochemical mechanisms that direct TF-DNA binding, with the ultimate goal of computationally predicting a TF's occupancy profile in any cellular condition. In this study, we examined the influence of various potential determinants of TF-DNA binding on a much larger scale than previously undertaken. We used a thermodynamics-based model of TF-DNA binding, called “STAP,” to analyze 45 TF-ChIP data sets from Drosophila embryonic development. We built a cross-validation framework that compares a baseline model, based on the ChIP'ed (“primary”) TF's motif, to more complex models where binding by secondary TFs is hypothesized to influence the primary TF's occupancy. Candidate interacting TFs were chosen based on RNA-seq expression data from the time point of the ChIP experiment. We found widespread evidence of both cooperative and antagonistic effects by secondary TFs, and explicitly quantified these effects. We were able to identify multiple classes of interactions, including (1) long-range interactions between primary and secondary motifs (separated by ≤150 bp), suggestive of indirect effects such as chromatin remodeling, (2) short-range interactions with specific inter-site spacing biases, suggestive of direct physical interactions, and (3) overlapping binding sites suggesting competitive binding. Furthermore, by factoring out the previously reported strong correlation between TF occupancy and DNA accessibility, we were able to categorize the effects into those that are likely to be mediated by the secondary TF's effect on local accessibility and those that utilize accessibility-independent mechanisms.
Finally, we conducted in vitro pull-down assays to test model-based predictions of short-range cooperative interactions, and found that seven of the eight TF pairs tested physically interact and that some of these interactions mediate cooperative binding to DNA. PMID:23935523
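Illustrative note (not part of the abstract): how a secondary TF can raise or lower a primary TF's occupancy in a thermodynamic picture can be sketched with a two-site partition function. This is a generic textbook-style sketch, not the STAP implementation; the names `q1`, `q2`, and `omega` are assumptions, where each `q` stands for a site's statistical weight (binding constant times TF concentration) and `omega` for the interaction between the two bound TFs.

```python
def primary_occupancy(q1, q2, omega=1.0):
    """Probability that the primary site is bound in a two-site model.

    States and weights: empty (1), primary only (q1), secondary only (q2),
    both bound (omega * q1 * q2). omega > 1 is cooperative, omega < 1 is
    antagonistic, and omega = 0 models mutually exclusive (overlapping) sites.
    """
    z = 1.0 + q1 + q2 + omega * q1 * q2  # partition function
    return (q1 + omega * q1 * q2) / z


baseline = primary_occupancy(q1=0.5, q2=0.0)                 # primary motif alone
independent = primary_occupancy(q1=0.5, q2=2.0, omega=1.0)
cooperative = primary_occupancy(q1=0.5, q2=2.0, omega=10.0)
competitive = primary_occupancy(q1=0.5, q2=2.0, omega=0.0)
print(baseline, independent, cooperative, competitive)
```

With `omega = 1` the secondary TF has no effect on the primary site, so a baseline-versus-extended model comparison of the kind described in the abstract amounts to asking whether the data favour `omega` different from 1.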
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Construction of a fuzzy and Boolean logic gates based on DNA.
Zadegan, Reza M; Jepsen, Mette D E; Hildebrandt, Lasse L; Birkedal, Victoria; Kjems, Jørgen
2015-04-17
Logic gates are devices that can perform logical operations by transforming a set of inputs into a predictable single detectable output. The hybridization properties, structure, and function of nucleic acids can be used to make DNA-based logic gates. These devices are important modules in molecular computing and biosensing. The ideal logic gate system should provide a wide selection of logical operations, and be integrable in multiple copies into more complex structures. Here we show the successful construction of a small DNA-based logic gate complex that produces fluorescent outputs corresponding to the operation of the six Boolean logic gates AND, NAND, OR, NOR, XOR, and XNOR. The logic gate complex is shown to work also when implemented in a three-dimensional DNA origami box structure, where it controlled the position of the lid in a closed or open position. Implementation of multiple microRNA sensitive DNA locks on one DNA origami box structure enabled fuzzy logical operation that allows biosensing of complex molecular signals. Integrating logic gates with DNA origami systems opens a vast avenue to applications in the fields of nanomedicine for diagnostics and therapeutics. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DNA dynamics in aqueous solution: opening the double helix
NASA Technical Reports Server (NTRS)
Pohorille, A.; Ross, W. S.; Tinoco, I. Jr; MacElroy, R. D. (Principal Investigator)
1990-01-01
The opening of a DNA base pair is a simple reaction that is a prerequisite for replication, transcription, and other vital biological functions. Understanding the molecular mechanisms of biological reactions is crucial for predicting and, ultimately, controlling them. Realistic computer simulations of the reactions can provide the needed understanding. To model even the simplest reaction in aqueous solution requires hundreds of hours of supercomputing time. We have used molecular dynamics techniques to simulate fraying of the ends of a six base pair double strand of DNA, [TCGCGA]2, where the four bases of DNA are denoted by T (thymine), C (cytosine), G (guanine), and A (adenine), and to estimate the free energy barrier to this process. The calculations, in which the DNA was surrounded by 2,594 water molecules, required 50 hours of CRAY-2 CPU time for every simulated 100 picoseconds. A free energy barrier to fraying, which is mainly characterized by the movement of adenine away from thymine into aqueous environment, was estimated to be 4 kcal/mol. Another fraying pathway, which leads to stacking between terminal adenine and thymine, was also observed. These detailed pictures of the motions and energetics of DNA base pair opening in water are a first step toward understanding how DNA will interact with any molecule.
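Illustrative note (not part of the abstract): the two numbers quoted above can be put in perspective with a short calculation. The CPU cost and the 4 kcal/mol barrier come from the abstract; the temperature of 300 K is an assumption, since the abstract does not state it, and the Boltzmann factor is only a rough indicator of how rarely the barrier top is visited.

```python
import math

CPU_HOURS_PER_100_PS = 50.0       # from the abstract (CRAY-2)
BARRIER_KCAL_PER_MOL = 4.0        # from the abstract
R_KCAL_PER_MOL_K = 1.987204e-3    # gas constant in kcal/(mol K)
TEMPERATURE_K = 300.0             # assumption: not stated in the abstract


def cpu_hours(simulated_ps):
    """CPU time implied by the quoted cost for a given simulated time."""
    return CPU_HOURS_PER_100_PS * simulated_ps / 100.0


def boltzmann_factor(barrier, temperature):
    """exp(-dG / RT) for a barrier given in kcal/mol."""
    return math.exp(-barrier / (R_KCAL_PER_MOL_K * temperature))


hours_for_1_ns = cpu_hours(1000.0)
factor = boltzmann_factor(BARRIER_KCAL_PER_MOL, TEMPERATURE_K)
print(hours_for_1_ns, factor)
```

At the quoted rate, one simulated nanosecond would take 500 CPU hours, and the Boltzmann factor for the barrier is on the order of 10^-3 at the assumed temperature.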
DOE Office of Scientific and Technical Information (OSTI.GOV)
Orimoto, Yuuichi, E-mail: orimoto.yuuichi.888@m.kyushu-u.ac.jp; Aoki, Yuriko; Japan Science and Technology Agency, CREST, 4-1-8 Hon-chou, Kawaguchi, Saitama 332-0012
An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between “choose-maximum” (choose a base pair giving the maximum β for each step) and “choose-minimum” (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account.
Orimoto, Yuuichi; Aoki, Yuriko
2016-07-14
An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between "choose-maximum" (choose a base pair giving the maximum β for each step) and "choose-minimum" (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account.
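Illustrative note (not part of the abstract): the "choose-maximum" and "choose-minimum" conditions describe a greedy, step-by-step selection. The sketch below shows only that control flow. The evaluation function `toy_property` is a hypothetical stand-in with made-up scores; in the work summarized above the evaluation comes from ELG-FF calculations of β or γ, which are not reproduced here.

```python
def greedy_sequence(n_steps, evaluate, mode="max", bases="ATGC"):
    """At each elongation step, append the base whose trial chain scores best."""
    pick = max if mode == "max" else min
    seq = ""
    for _ in range(n_steps):
        # Evaluate every candidate extension of the chain built so far.
        seq += pick(bases, key=lambda b: evaluate(seq + b))
    return seq


def toy_property(seq):
    """Hypothetical evaluation function (NOT a hyperpolarizability)."""
    per_base = {"A": 1.0, "T": 2.0, "G": 3.0, "C": 0.5}  # made-up values
    total = sum(per_base[b] for b in seq)
    # Made-up neighbour term so that the best choice depends on the chain so far.
    total += sum(1.5 for x, y in zip(seq, seq[1:]) if x != y)
    return total


chain_max = greedy_sequence(4, toy_property, mode="max")
chain_min = greedy_sequence(4, toy_property, mode="min")
print(chain_max, chain_min)
```

Because each step fixes one base before moving on, the cost grows linearly with chain length (four evaluations per step), but the result is a greedy optimum rather than a guaranteed global one.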
Logic integration of mRNA signals by an RNAi-based molecular computer
Xie, Zhen; Liu, Siyuan John; Bleris, Leonidas; Benenson, Yaakov
2010-01-01
Synthetic in vivo molecular ‘computers’ could rewire biological processes by establishing programmable, non-native pathways between molecular signals and biological responses. Multiple molecular computer prototypes have been shown to work in simple buffered solutions. Many of those prototypes were made of DNA strands and performed computations using cycles of annealing-digestion or strand displacement. We have previously introduced RNA interference (RNAi)-based computing as a way of implementing complex molecular logic in vivo. Because it also relies on nucleic acids for its operation, RNAi computing could benefit from the tools developed for DNA systems. However, these tools must be harnessed to produce bioactive components and be adapted for harsh operating environments that reflect in vivo conditions. In a step toward this goal, we report the construction and implementation of biosensors that ‘transduce’ mRNA levels into bioactive, small interfering RNA molecules via RNA strand exchange in a cell-free Drosophila embryo lysate, a step beyond simple buffered environments. We further integrate the sensors with our RNAi ‘computational’ module to evaluate two-input logic functions on mRNA concentrations. Our results show how RNA strand exchange can expand the utility of RNAi computing and point toward the possibility of using strand exchange in a native biological setting. PMID:20194121
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sherman, W.B.
2012-04-16
Synthetic DNA nanostructures are typically held together primarily by Holliday junctions. One of the most basic types of structures possible to assemble with only DNA and Holliday junctions is the triangle. To date, however, only equilateral triangles have been assembled in this manner - primarily because it is difficult to figure out what configurations of Holliday triangles have low strain. Early attempts at identifying such configurations relied upon calculations that followed the strained helical paths of DNA. Those methods, however, were computationally expensive, and failed to find many of the possible solutions. I have developed a new approach to identifying Holliday triangles that is computationally faster, and finds well over 95% of the possible solutions. The new approach is based on splitting the problem into two parts. The first part involves figuring out all the different ways that three featureless rods of the appropriate length and diameter can weave over and under one another to form a triangle. The second part of the computation entails seeing whether double helical DNA backbones can fit into the shape dictated by the rods in such a manner that the strands can cross over from one domain to the other at the appropriate spots. Structures with low strain (that is, good fit between the rods and the helices) on all three edges are recorded as promising for assembly.
Langheinrich, Jessica; Bogner, Franz X
2015-01-01
As non-scientific conceptions interfere with learning processes, teachers need both to know about them and to address them in their classrooms. For our study, based on 182 eleventh graders, we analyzed the level of conceptual understanding by implementing the "draw and write" technique during a computer-supported gene technology module. A specific feature of our study was that participants were given the hierarchical organizational level they had to draw. We introduced two objective category systems for analyzing drawings and inscriptions. Our results indicated a long-term as well as a short-term increase in the level of conceptual understanding and in the number of drawn elements and their grades concerning the DNA structure. Consequently, we regard the "draw and write" technique as a tool for teachers to get to know students' alternative conceptions. Furthermore, our study points to the modification potential of hands-on and computer-supported learning modules. © 2015 The International Union of Biochemistry and Molecular Biology.
In vitro molecular machine learning algorithm via symmetric internal loops of DNA.
Lee, Ji-Hoon; Lee, Seung Hwan; Baek, Christina; Chun, Hyosun; Ryu, Je-Hwan; Kim, Jin-Woo; Deaton, Russell; Zhang, Byoung-Tak
2017-08-01
Programmable biomolecules, such as DNA strands, deoxyribozymes, and restriction enzymes, have been used to solve computational problems, construct large-scale logic circuits, and program simple molecular games. Although studies have shown the potential of molecular computing, the capability of computational learning with DNA molecules, i.e., molecular machine learning, has yet to be experimentally verified. Here, we present a novel molecular learning in vitro model in which symmetric internal loops of double-stranded DNA are exploited to measure the differences between training instances, thus enabling the molecules to learn from small errors. The model was evaluated on a data set of twenty dialogue sentences obtained from the television shows Friends and Prison Break. The wet DNA-computing experiments confirmed that the molecular learning machine was able to generalize the dialogue patterns of each show and successfully identify the show from which the sentences originated. The molecular machine learning model described here opens the way for solving machine learning problems in computer science and biology using in vitro molecular computing with the data encoded in DNA molecules. Copyright © 2017. Published by Elsevier B.V.
Liao, Wei-Ching; Chuang, Min-Chieh; Ho, Ja-An Annie
2013-12-15
Genetic modification (GM), one of the modern biomolecular engineering technologies, has been deemed a profitable strategy to fight global starvation. Yet rapid and reliable analytical methods for evaluating the quality and potential risk of the resulting GM products are lacking. We herein present a biomolecular analytical system constructed with distinct biochemical activities to expedite the computational detection of genetically modified organisms (GMOs). The computational mechanism provides an alternative to the complex procedures commonly involved in the screening of GMOs. Given that the bioanalytical system is capable of processing promoter, coding, and species genes, affirmative interpretations succeed in identifying a specified GM event in both electrochemical and optical fashions. The biomolecular computational assay detects genetically modified DNA below the sub-nanomolar level and is free of interference from abundant coexisting non-GM DNA. Furthermore, this bioanalytical system can be extended to an array format for multiplex screening against various GM events. Such a biomolecular computational assay and biosensor hold great promise for rapid, cost-effective, and high-fidelity screening of GMOs. Copyright © 2013 Elsevier B.V. All rights reserved.
Simulation and display of macromolecular complexes
NASA Technical Reports Server (NTRS)
Nir, S.; Garduno, R.; Rein, R.; Macelroy, R. D.
1977-01-01
In association with an investigation of the interaction of proteins with DNA and RNA, an interactive computer program for building, manipulating, and displaying macromolecular complexes has been designed. The system provides perspective, planar, and stereoscopic views on the computer terminal display, as well as views for standard and nonstandard observer locations. The molecule or its parts may be rotated and/or translated in any direction; bond connections may be added or removed by the viewer. Molecular fragments may be juxtaposed in such a way that given bonds are aligned, and given planes and points coincide. Another subroutine provides for the duplication of a given unit such as a DNA or amino-acid base.
Non-linear molecular pattern classification using molecular beacons with multiple targets.
Lee, In-Hee; Lee, Seung Hwan; Park, Tai Hyun; Zhang, Byoung-Tak
2013-12-01
In vitro pattern classification has been highlighted as an important future application of DNA computing. Previous work has demonstrated the feasibility of linear classifiers using DNA-based molecular computing. However, complex tasks require non-linear classification capability. Here we design a molecular beacon that can interact with multiple targets and experimentally show that its fluorescent signals form a complex radial-basis function, enabling it to be used as a building block for non-linear molecular classification in vitro. The proposed method was successfully applied to solving artificial and real-world classification problems: XOR and microRNA expression patterns. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
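Illustrative note (not part of the abstract): why a radial-basis response is enough for XOR, which no single linear threshold can solve, can be seen in a small numerical sketch. The Gaussian form, the width, the centers, and the threshold below are all assumptions chosen for illustration; the abstract does not give the measured response of the beacon.

```python
import math


def rbf(x, center, width=0.5):
    """Gaussian radial-basis response: largest when x is at the center."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))


def classify_xor(x, threshold=0.5):
    """Sum two RBF units centered on the 'exactly one input' patterns."""
    signal = rbf(x, (0, 1)) + rbf(x, (1, 0))
    return int(signal > threshold)


truth_table = {x: classify_xor(x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
print(truth_table)
```

The summed signal is high only near the two patterns with a single input present, so one threshold on the combined signal separates the classes.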
Shi, Zhenyu; Wedd, Anthony G.; Gras, Sally L.
2013-01-01
The development of synthetic biology requires rapid batch construction of large gene networks from combinations of smaller units. Despite the availability of computational predictions for well-characterized enzymes, the optimization of most synthetic biology projects requires combinational constructions and tests. A new building-brick-style parallel DNA assembly framework for simple and flexible batch construction is presented here. It is based on robust recombination steps and allows a variety of DNA assembly techniques to be organized for complex constructions (with or without scars). The assembly of five DNA fragments into a host genome was performed as an experimental demonstration. PMID:23468883
Moghadam, Neda Hosseinpour; Salehzadeh, Sadegh; Shahabadi, Nahid; Golbedaghi, Reza
2017-07-03
The possible interaction between the antiviral drug oseltamivir and calf thymus DNA at physiological pH was studied by spectrophotometry, competitive spectrofluorimetry, differential pulse voltammogram (DPV), circular dichroism spectroscopy (CD), viscosity measurements, salt effect, and computational studies. Intercalation of oseltamivir between the base pairs of DNA was shown by a sharp increase in specific viscosity of DNA and a decrease of the peak current and a positive shift in differential pulse voltammogram. Competitive fluorescence experiments were performed using neutral red (NR) as a probe for the intercalation binding mode. The studies showed that oseltamivir is able to release the NR.
Computational Design of DNA-Binding Proteins.
Thyme, Summer; Song, Yifan
2016-01-01
Predicting the outcome of engineered and naturally occurring sequence perturbations to protein-DNA interfaces requires accurate computational modeling technologies. It has been well established that computational design to accommodate small numbers of DNA target site substitutions is possible. This chapter details the basic method of design used in the Rosetta macromolecular modeling program that has been successfully used to modulate the specificity of DNA-binding proteins. More recently, combining computational design and directed evolution has become a common approach for increasing the success rate of protein engineering projects. The power of such high-throughput screening depends on computational methods producing multiple potential solutions. Therefore, this chapter describes several protocols for increasing the diversity of designed output. Lastly, we describe an approach for building comparative models of protein-DNA complexes in order to utilize information from homologous sequences. These models can be used to explore how nature modulates specificity of protein-DNA interfaces and potentially can even be used as starting templates for further engineering.
Cloud-based adaptive exon prediction for DNA analysis
Putluri, Srinivasareddy; Fathima, Shaik Yasmeen
2018-01-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequencing laboratories send raw and inferred information via the Internet to several sequence libraries. DNA sequence storage costs can be minimised by using cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Accurate identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques have been found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
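The three-base periodicity mentioned in the abstract above can be illustrated with a small sketch. This is not the NLMS-based adaptive exon predictor of the cited study; it only assumes the common approach of measuring spectral power at frequency 1/3 on the four base-indicator sequences. The function name `period3_power` and the toy sequences are hypothetical.

```python
import cmath

def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four
    base-indicator sequences and normalised by sequence length."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # DFT coefficient of the 0/1 indicator sequence for this base,
        # evaluated at the period-3 frequency.
        coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                    for n, ch in enumerate(seq) if ch == base)
        total += abs(coeff) ** 2
    return total / len(seq)

periodic = "ATG" * 20   # toy sequence with a strong period-3 pattern
flat = "A" * 60         # toy sequence with no period-3 component

strong = period3_power(periodic)
weak = period3_power(flat)
```

In practice such a measure would be evaluated in a sliding window along the sequence, with peaks suggesting candidate coding regions.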
Electron holes appear to trigger cancer-implicated mutations
NASA Astrophysics Data System (ADS)
Miller, John; Villagran, Martha
Malignant tumors are caused by mutations, which also affect their subsequent growth and evolution. We use a novel approach, computational DNA hole spectroscopy [M.Y. Suarez-Villagran & J.H. Miller, Sci. Rep. 5, 13571 (2015)], to compute spectra of enhanced hole probability based on actual sequence data. A hole is a mobile site of positive charge created when an electron is removed, for example by radiation or contact with a mutagenic agent. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies reveal a correlation between hole spectrum peaks and spikes in human mutation frequencies. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with cancer-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential cancer 'driver' mutations. Such integration of DNA hole and variance spectra could also prove invaluable for pinpointing critical regions, and sites of driver mutations, in the vast non-protein-coding genome. Supported by the State of Texas through the Texas Ctr. for Superconductivity.
The Crystal Structure of TAL Effector PthXo1 Bound to Its DNA Target
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mak, Amanda Nga-Sze; Bradley, Philip; Cernadas, Raul A.
2012-02-10
DNA recognition by TAL effectors is mediated by tandem repeats, each 33 to 35 residues in length, that specify nucleotides via unique repeat-variable diresidues (RVDs). The crystal structure of PthXo1 bound to its DNA target was determined by high-throughput computational structure prediction and validated by heavy-atom derivatization. Each repeat forms a left-handed, two-helix bundle that presents an RVD-containing loop to the DNA. The repeats self-associate to form a right-handed superhelix wrapped around the DNA major groove. The first RVD residue forms a stabilizing contact with the protein backbone, while the second makes a base-specific contact to the DNA sense strand. Two degenerate amino-terminal repeats also interact with the DNA. Containing several RVDs and noncanonical associations, the structure illustrates the basis of TAL effector-DNA recognition.
Salvatore, Princia; Nazmutdinov, Renat R; Ulstrup, Jens; Zhang, Jingdong
2015-02-19
Among the low-index single-crystal gold surfaces, the Au(110) surface is the most active toward molecular adsorption and the one with fewest electrochemical adsorption data reported. Cyclic voltammetry (CV), electrochemically controlled scanning tunneling microscopy (EC-STM), and density functional theory (DFT) calculations have been employed in the present study to address the adsorption of the four nucleobases adenine (A), cytosine (C), guanine (G), and thymine (T), on the Au(110)-electrode surface. Au(110) undergoes reconstruction to the (1 × 3) surface in electrochemical environment, accompanied by a pair of strong voltammetry peaks in the double-layer region in acid solutions. Adsorption of the DNA bases gives featureless voltammograms with lower double-layer capacitance, suggesting that all the bases are chemisorbed on the Au(110) surface. Further investigation of the surface structures of the adlayers of the four DNA bases by EC-STM disclosed lifting of the Au(110) reconstruction, specific molecular packing in dense monolayers, and pH dependence of the A and G adsorption. DFT computations based on a cluster model for the Au(110) surface were performed to investigate the adsorption energy and geometry of the DNA bases in different adsorbate orientations. The optimized geometry is further used to compute models for STM images which are compared with the recorded STM images. This has provided insight into the physical nature of the adsorption. The specific orientations of A, C, G, and T on Au(110) and the nature of the physical adsorbate/surface interaction based on the combination of the experimental and theoretical studies are proposed, and differences from nucleobase adsorption on Au(111)- and Au(100)-electrode surfaces are discussed.
Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri
2016-01-01
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774
Iacovelli, Federico; Falconi, Mattia
2015-09-01
DNA and RNA are large and flexible polymers selected by nature to transmit information. The most common DNA three-dimensional structure is represented by the double helix, but this biopolymer is extremely flexible and polymorphic, and can easily change its conformation to adapt to different interactions and purposes. DNA can also adopt singular topologies, giving rise, for instance, to supercoils, formed because of the limited free rotation of the DNA domain flanking a replication or transcription complex. Our understanding of the importance of these unusual or transient structures is growing, as recent studies of DNA topology, supercoiling, knotting and linking have shown that the geometric changes can drive, or strongly influence, the interactions between protein and DNA, so altering its own metabolism. On the other hand, the unique self-recognition properties of DNA, determined by the strict Watson-Crick rules of base pairing, make this material ideal for the creation of self-assembling, predesigned nanostructures. The construction of such structures is one of the main focuses of the thriving area of DNA nanotechnology, where several assembly strategies have been employed to build increasingly complex DNA nanostructures. DNA nanodevices can have direct applications in biomedicine, but also in the materials science field, requiring the immersion of DNA in an environment far from the physiological one. Crucial help in the understanding and planning of natural and artificial nanostructures is given by modern computer simulation techniques, which are able to provide a reliable structural and dynamic description of nucleic acids. © 2015 FEBS.
Poltev, V; Anisimov, V M; Dominguez, V; Gonzalez, E; Deriabina, A; Garcia, D; Rivas, F; Polteva, N A
2018-02-01
Deciphering the mechanism of functioning of DNA as the carrier of genetic information requires identifying inherent factors determining its structure and function. Following this path, our previous DFT studies attributed the origin of unique conformational characteristics of right-handed Watson-Crick duplexes (WCDs) to the conformational profile of deoxydinucleoside monophosphates (dDMPs) serving as the minimal repeating units of DNA strand. According to those findings, the directionality of the sugar-phosphate chain and the characteristic ranges of dihedral angles of energy minima combined with the geometric differences between purines and pyrimidines determine the dependence on base sequence of the three-dimensional (3D) structure of WCDs. This work extends our computational study to complementary deoxydinucleotide-monophosphates (cdDMPs) of non-standard conformation, including those of Z-family, Hoogsteen duplexes, parallel-stranded structures, and duplexes with mispaired bases. For most of these systems, except Z-conformation, computations closely reproduce experimental data within the tolerance of characteristic limits of dihedral parameters for each conformation family. Computation of cdDMPs with Z-conformation reveals that their experimental structures do not correspond to the internal energy minimum. This finding establishes the leading role of external factors in formation of the Z-conformation. Energy minima of cdDMPs of non-Watson-Crick duplexes demonstrate different sequence-dependence features than those known for WCDs. The obtained results provide evidence that the biologically important regularities of 3D structure distinguish WCDs from duplexes having non-Watson-Crick nucleotide pairing.
NGS-based likelihood ratio for identifying contributors in two- and three-person DNA mixtures.
Chan Mun Wei, Joshua; Zhao, Zicheng; Li, Shuai Cheng; Ng, Yen Kaow
2018-06-01
DNA fingerprinting, also known as DNA profiling, serves as a standard procedure in forensics to identify a person by the short tandem repeat (STR) loci in their DNA. By comparing the STR loci between DNA samples, practitioners can calculate a probability of match to identify the contributors of a DNA mixture. Most existing methods are based on 13 core STR loci which were identified by the Federal Bureau of Investigation (FBI). Analyses of DNA mixtures based on these loci for forensic purposes are highly variable in procedure and suffer from subjectivity as well as bias in complex mixture interpretation. With the emergence of next-generation sequencing (NGS) technologies, the sequencing of billions of DNA molecules can be parallelized, thus greatly increasing throughput and reducing the associated costs. This allows the creation of new techniques that incorporate more loci to enable complex mixture interpretation. In this paper, we propose a likelihood ratio computation that uses NGS data for DNA testing on mixed samples. We have applied the method to 4480 simulated DNA mixtures, which consist of various mixture proportions of 8 unrelated whole-genome sequencing data sets. The results confirm the feasibility of utilizing NGS data in DNA mixture interpretations. We observed an average likelihood ratio as high as 285,978 for two-person mixtures. Using our method, all 224 identity tests for two-person mixtures and three-person mixtures were correctly identified. Copyright © 2018 Elsevier Ltd. All rights reserved.
Alchemical Free Energy Calculations for Nucleotide Mutations in Protein-DNA Complexes.
Gapsys, Vytautas; de Groot, Bert L
2017-12-12
Nucleotide-sequence-dependent interactions between proteins and DNA are responsible for a wide range of gene regulatory functions. Accurate and generalizable methods to evaluate the strength of protein-DNA binding have long been sought. While numerous computational approaches have been developed, most of them require fitting parameters to experimental data to a certain degree, e.g., machine learning algorithms or knowledge-based statistical potentials. Molecular-dynamics-based free energy calculations offer a robust, system-independent, first-principles-based method to calculate free energy differences upon nucleotide mutation. We present an automated procedure to set up alchemical MD-based calculations to evaluate free energy changes occurring as the result of a nucleotide mutation in DNA. We used these methods to perform a large-scale mutation scan comprising 397 nucleotide mutation cases in 16 protein-DNA complexes. The obtained prediction accuracy reaches 5.6 kJ/mol average unsigned deviation from experiment with a correlation coefficient of 0.57 with respect to the experimentally measured free energies. Overall, the first-principles-based approach performed on par with the molecular modeling approaches Rosetta and FoldX. Subsequently, we utilized the MD-based free energy calculations to construct protein-DNA binding profiles for the zinc finger protein Zif268. The calculation results compare remarkably well with the experimentally determined binding profiles. The software automating the structure and topology setup for alchemical calculations is a part of the pmx package; the utilities have also been made available online at http://pmx.mpibpc.mpg.de/dna_webserver.html .
MSuPDA: A Memory Efficient Algorithm for Sequence Alignment.
Khan, Mohammad Ibrahim; Kamal, Md Sarwar; Chowdhury, Linkon
2016-03-01
Space complexity is a central concern in DNA sequence alignment. In this regard, memory saving with a pushdown automaton (PDA) can help reduce the space occupied in computer memory. In the proposed process, an anchor seed (AS) is selected from a given data set of nucleotide base pairs for local sequence alignment. Quick splitting techniques separate the AS from all the DNA genome segments. The selected AS is placed in the PDA's input unit, and the whole DNA genome segments are placed on the PDA's stack. The AS from the input unit is matched against the DNA genome segments from the stack. Matches, mismatches, and indels of nucleotides are popped from the stack under the control unit of the PDA. Each POP operation frees the memory cell occupied by the nucleotide base pair.
[Development of laboratory sequence analysis software based on WWW and UNIX].
Huang, Y; Gu, J R
2001-01-01
Sequence analysis tools based on WWW and UNIX were developed in our laboratory to meet the needs of molecular genetics research in our laboratory. General principles of computer analysis of DNA and protein sequences were also briefly discussed in this paper.
Computational fishing of new DNA methyltransferase inhibitors from natural products.
Maldonado-Rojas, Wilson; Olivero-Verbel, Jesus; Marrero-Ponce, Yovani
2015-07-01
DNA methyltransferase inhibitors (DNMTis) have become an alternative for cancer therapies. However, only two DNMTis have been approved as anticancer drugs, although with some restrictions. Natural products (NPs) are a promising source of drugs. In order to find NPs with novel chemotypes as DNMTis, 47 compounds with known activity against these enzymes were used to build a LDA-based QSAR model for active/inactive molecules (93% accuracy) based on molecular descriptors. This classifier was employed to identify potential DNMTis on 800 NPs from NatProd Collection. 447 selected compounds were docked on two human DNA methyltransferase (DNMT) structures (PDB codes: 3SWR and 2QRV) using AutoDock Vina and Surflex-Dock, prioritizing according to their score values, contact patterns at 4 Å and molecular diversity. Six consensus NPs were identified as virtual hits against DNMTs, including 9,10-dihydro-12-hydroxygambogic, phloridzin, 2',4'-dihydroxychalcone 4'-glucoside, daunorubicin, pyrromycin and centaurein. This method is an innovative computational strategy for identifying DNMTis, useful in the identification of potent and selective anticancer drugs. Copyright © 2015 Elsevier Inc. All rights reserved.
Optically Controlled Signal Amplification for DNA Computation.
Prokup, Alexander; Hemphill, James; Liu, Qingyang; Deiters, Alexander
2015-10-16
The hybridization chain reaction (HCR) and fuel-catalyst cycles have been applied to address the problem of signal amplification in DNA-based computation circuits. While they function efficiently, these signal amplifiers cannot be switched ON or OFF quickly and noninvasively. To overcome these limitations, a light-activated initiator strand for the HCR, which enabled fast optical OFF → ON switching, was developed. Similarly, when a light-activated version of the catalyst strand or the inhibitor strand of a fuel-catalyst cycle was applied, the cycle could be optically switched from OFF → ON or ON → OFF, respectively. To move the capabilities of these devices beyond solution-based operations, the components were embedded in agarose gels. Irradiation with customizable light patterns and at different time points demonstrated both spatial and temporal control. The addition of a translator gate enabled a spatially activated signal to travel along a predefined path, akin to a chemical wire. Overall, the addition of small light-cleavable photocaging groups to DNA signal amplification circuits enabled conditional control as well as fast photocontrol of signal amplification.
Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, A R
2016-11-05
DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short and standardized fragment of DNA. To this end, an attempt has been made in this study to develop a computational approach for identifying a species by comparing its barcode with the barcode sequences of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and then the Random forest methodology was employed on the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based, and diagnostic-based approaches and was found comparable with existing supervised learning based approaches in terms of species identification success rate when compared using real and simulated datasets. Based on the proposed approach, an online web interface SPIDBAR has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by taxonomists. Copyright © 2016 Elsevier B.V. All rights reserved.
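The mapping of a barcode sequence onto a k-mer frequency vector, as described above, might look roughly like the following sketch. The function name `kmer_frequencies`, the choice k = 2, and the handling of ambiguous bases are assumptions made for illustration; the cited study's exact feature construction is not given in the abstract, and the Random forest step is not shown.

```python
from itertools import product

def kmer_frequencies(seq, k=2):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T,
    in lexicographic order (AA, AC, AG, AT, CA, ...)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:   # assumption: skip windows with ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]

features = kmer_frequencies("ACGT", k=2)   # contains AC, CG and GT once each
```

Each barcode then becomes a fixed-length numeric vector (16 entries for k = 2) that a supervised classifier can consume regardless of sequence length.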
Guo, Yahui; Cheng, Junjie; Wang, Jine; Zhou, Xiaodong; Hu, Jiming; Pei, Renjun
2014-09-01
A simple, versatile, and label-free DNA computing strategy was designed by using toehold-mediated strand displacement and stem-loop probes. A full set of logic gates (YES, NOT, OR, NAND, AND, INHIBIT, NOR, XOR, XNOR) and a two-layer logic cascade were constructed. The probes contain a G-quadruplex domain, which was blocked or unfolded through inputs initiating strand displacement and the obviously distinguishable light-up fluorescent signal of G-quadruplex/NMM complex was used as the output readout. The inputs are the disease-specific nucleotide sequences with potential for clinic diagnosis. The developed versatile computing system based on our label-free and modular strategy might be adapted in multi-target diagnosis through DNA hybridization and aptamer-target interaction. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Wendelsdorf, Katherine V.; Song, Zhuo; Cao, Yang; Samuels, David C.
2009-01-01
Nucleoside analogs used in antiretroviral treatment have been associated with mitochondrial toxicity. The polymerase-γ hypothesis states that this toxicity stems from the analogs' inhibition of the mitochondrial DNA polymerase (polymerase-γ) leading to mitochondrial DNA (mtDNA) depletion. We have constructed a computational model of the interaction of polymerase-γ with activated nucleoside and nucleotide analog drugs, based on experimentally measured reaction rates and base excision rates, together with the mtDNA genome size, the human mtDNA sequence, and mitochondrial dNTP concentrations. The model predicts an approximately 1000-fold difference in the activated drug concentration required for a 50% probability of mtDNA strand termination between the activated di-deoxy analogs d4T, ddC, and ddI (activated to ddA) and the activated forms of the analogs 3TC, TDF, AZT, FTC, and ABC. These predictions are supported by experimental and clinical data showing significantly greater mtDNA depletion in cell culture and patient samples caused by the di-deoxy analog drugs. For zidovudine (AZT) we calculated a very low mtDNA replication termination probability, in contrast to its reported mitochondrial toxicity in vitro and clinically. Therefore AZT mitochondrial toxicity is likely due to a mechanism that does not involve strand termination of mtDNA replication. PMID:19132079
Gao, Jinting; Liu, Yaqing; Lin, Xiaodong; Deng, Jiankang; Yin, Jinjin; Wang, Shuo
2017-10-25
Wiring a series of simple logic gates to process complex data is highly important and a major challenge for unconventional molecular computing systems. The programmable nature of DNA makes it a powerful material for molecular computing. In our investigation, it was found that DNA exhibits excellent peroxidase-like activity in a colorimetric system of TMB/H2O2/hemin (TMB, 3,3',5,5'-tetramethylbenzidine) in the presence of K+ and Cu2+, which is significantly inhibited by the addition of an antioxidant. According to the modulated catalytic activity of this DNA-based catalyst, three cascade logic gates including AND-OR-INH (INHIBIT), AND-INH and OR-INH were successfully constructed. Interestingly, by only modulating the concentration of Cu2+, a majority logic gate with a single-vote veto function was realized following the same threshold value as that of the cascade logic gates. The strategy is quite straightforward and versatile and provides an instructive method for constructing multiple logic gates on a simple platform to implement complex molecular computing.
Noise reduction in single time frame optical DNA maps
Müller, Vilhelm; Westerlund, Fredrik
2017-01-01
In optical DNA mapping technologies, sequence-specific intensity variations (DNA barcodes) along stretched and stained DNA molecules are produced. These “fingerprints” of the underlying DNA sequence have a resolution on the order of one kilobase pair, and the stretching of the DNA molecules is performed by surface adsorption or nano-channel setups. A post-processing challenge for nano-channel based methods, due to local and global random movement of the DNA molecule during imaging, is how to align different time frames in order to produce reproducible time-averaged DNA barcodes. The current solutions to this challenge are computationally rather slow. With high-throughput applications in mind, we here introduce a parameter-free method for filtering a noisy single time frame barcode (snap-shot optical map), measured in a fraction of a second. By using only a single time frame barcode we circumvent the need for post-processing alignment. We demonstrate that our method is successful at providing filtered barcodes that are less noisy and more similar to time-averaged barcodes. The method is based on the application of a low-pass filter on a single noisy barcode using the width of the Point Spread Function of the system as a unique, and known, filtering parameter. We find that after applying our method, the Pearson correlation coefficient (a real number in the range from -1 to 1) between the single time-frame barcode and the time average of the aligned kymograph increases significantly, roughly by 0.2 on average. By comparing to a database of more than 3000 theoretical plasmid barcodes we show that the capability to identify plasmids is improved by filtering single time-frame barcodes compared to their unfiltered analogues. Since snap-shot experiments and the computation using our method both take less than a second, this study opens the way for high-throughput optical DNA mapping with improved reproducibility. PMID:28640821
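The filtering idea in the abstract above can be sketched as follows, assuming a Gaussian kernel whose width `sigma` stands in for the Point Spread Function width (the abstract does not state the exact filter shape). The helper names, the clamped edge handling, and the synthetic barcode are all hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalised Gaussian weights truncated at three standard deviations."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def low_pass(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    return [sum(w * signal[min(max(i + j - radius, 0), n - 1)]
                for j, w in enumerate(kernel))
            for i in range(n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Synthetic stand-ins: a smooth "time-averaged" barcode and a noisy single frame.
clean = [math.sin(2 * math.pi * i / 50) for i in range(200)]
noisy = [c + 0.5 * (-1) ** i for i, c in enumerate(clean)]
filtered = low_pass(noisy, sigma=2.0)

r_before = pearson(noisy, clean)
r_after = pearson(filtered, clean)
```

With this deterministic high-frequency noise, the correlation with the clean signal rises after filtering, which mirrors the kind of comparison the abstract reports for real barcodes.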
The Teaching of Protein Synthesis--A Microcomputer Based Method.
ERIC Educational Resources Information Center
Goodridge, Frank
1983-01-01
Describes two computer programs (BASIC for 32K Commodore PET) for teaching protein synthesis. The first is an interactive test of base-pairing knowledge, and the second generates random DNA nucleotide sequences, with instructions for substitution, insertion, and deletion printed out for each student. (JN)
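The two exercises described above (base pairing, and random sequences subjected to substitution, insertion, and deletion) can be sketched in a few lines. This is written in Python rather than the original Commodore PET BASIC, and every function name here is hypothetical rather than taken from the cited programs.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def random_dna(length, rng):
    """Generate a random DNA sequence using the supplied random generator."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def complement(seq):
    """Return the base-paired partner strand, read in the same direction."""
    return "".join(COMPLEMENT[base] for base in seq)

def substitute(seq, pos, base):
    """Replace the base at index pos."""
    return seq[:pos] + base + seq[pos + 1:]

def insert(seq, pos, base):
    """Insert a base before index pos."""
    return seq[:pos] + base + seq[pos:]

def delete(seq, pos):
    """Remove the base at index pos."""
    return seq[:pos] + seq[pos + 1:]

rng = random.Random(0)          # fixed seed so each run is reproducible
student_sequence = random_dna(12, rng)
```

Insertions and deletions change the sequence length and therefore shift the reading frame, which is the point such a classroom exercise is usually meant to bring out.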
An efficient variational method to study the denaturation of DNA induced by superhelical stress
NASA Astrophysics Data System (ADS)
Jost, Daniel; Everaers, Ralf
2010-03-01
Many fundamental biological processes, like transcription or replication, require the opening of double-stranded DNA. One common way to control the local denaturation is to impose superhelical stress on the DNA using protein machineries. To describe superhelical effects for circular molecules, Benham introduced a model where the standard thermodynamic description of base-pairing is coupled with torsional stress energetics. Here, we introduce an efficient mean-field approximation of the Benham model. Our self-consistent solution is reliable and computationally fast compared to the full treatment of the model. In particular, our formulation allows one to compute the probability of bubble formation for a given length and position along the sequence. The evolution of this probability as a function of the superhelical stress could inform us about the ability of organisms to control the strength of superhelicity acting on their genomes.
The Effect of Basepair Mismatch on DNA Strand Displacement.
Broadwater, D W Bo; Kim, Harold D
2016-04-12
DNA strand displacement is a key reaction in DNA homologous recombination and DNA mismatch repair and is also heavily utilized in DNA-based computation and locomotion. Despite its ubiquity in science and engineering, sequence-dependent effects of displacement kinetics have not been extensively characterized. Here, we measured toehold-mediated strand displacement kinetics using single-molecule fluorescence in the presence of a single basepair mismatch. The apparent displacement rate varied significantly when the mismatch was introduced in the invading DNA strand. The rate generally decreased as the mismatch in the invader was encountered earlier in displacement. Our data indicate that a single base pair mismatch in the invader stalls branch migration and displacement occurs via direct dissociation of the destabilized incumbent strand from the substrate strand. We combined both branch migration and direct dissociation into a model, which we term the concurrent displacement model, and used the first passage time approach to quantitatively explain the salient features of the observed relationship. We also introduce the concept of splitting probabilities to justify that the concurrent model can be simplified into a three-step sequential model in the presence of an invader mismatch. We expect our model to become a powerful tool to design DNA-based reaction schemes with broad functionality. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.
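The notion of a splitting probability used above can be illustrated with the simplest possible case: a single state that can be left through two competing first-order pathways. This is only a one-state sketch, not the multi-step concurrent displacement model of the cited study, and the rate values are hypothetical.

```python
def splitting_probability(k_path, k_other):
    """Probability of leaving a state via the pathway with rate k_path
    when a competing pathway with rate k_other is also available."""
    return k_path / (k_path + k_other)

def mean_exit_time(k_path, k_other):
    """Mean first passage time out of the state through either pathway."""
    return 1.0 / (k_path + k_other)

# Hypothetical rates (per second), chosen only to make the arithmetic easy.
k_branch_migration = 3.0
k_direct_dissociation = 1.0

p_dissociation = splitting_probability(k_direct_dissociation, k_branch_migration)
t_exit = mean_exit_time(k_direct_dissociation, k_branch_migration)
```

With these numbers, one exit in four proceeds by direct dissociation and the mean dwell time is 0.25 s; slowing branch migration (for example at a mismatch) shifts the split toward dissociation.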
Oxidative damage in DNA bases revealed by UV resonant Raman spectroscopy.
D'Amico, Francesco; Cammisuli, Francesca; Addobbati, Riccardo; Rizzardi, Clara; Gessini, Alessandro; Masciovecchio, Claudio; Rossi, Barbara; Pascolo, Lorella
2015-03-07
We report on the use of the UV Raman technique to monitor the oxidative damage of deoxynucleotide triphosphates (dATP, dGTP, dCTP and dTTP) and DNA (plasmid vector) solutions. Nucleotide and DNA aqueous solutions were exposed to hydrogen peroxide (H2O2) and iron containing carbon nanotubes (CNTs) to produce Fenton's reaction and induce oxidative damage. UV Raman spectroscopy is shown to be maximally efficient to reveal changes in the nitrogenous bases during the oxidative mechanisms occurring on these molecules. The analysis of Raman spectra, supported by numerical computations, revealed that the Fenton's reaction causes an oxidation of the nitrogenous bases in dATP, dGTP and dCTP solutions leading to the production of 2-hydroxyadenine, 8-hydroxyguanine and 5-hydroxycytosine. No thymine change was revealed in the dTTP solution under the same conditions. Compared to single nucleotide solutions, plasmid DNA oxidation has resulted in more radical damage that causes the breaking of the adenine and guanine aromatic rings. Our study demonstrates the advantage of using UV Raman spectroscopy for rapidly monitoring the oxidation changes in DNA aqueous solutions that can be assigned to specific nitrogenous bases.
Universal computing by DNA origami robots in a living animal
Levner, Daniel; Ittah, Shmulik; Abu-Horowitz, Almogit; Bachelet, Ido
2014-01-01
Biological systems are collections of discrete molecular objects that move around and collide with each other. Cells carry out elaborate processes by precisely controlling these collisions, but developing artificial machines that can interface with and control such interactions remains a significant challenge. DNA is a natural substrate for computing and has been used to implement a diverse set of mathematical problems [1-3], logic circuits [4-6] and robotics [7-9]. The molecule also naturally interfaces with living systems, and different forms of DNA-based biocomputing have previously been demonstrated [10-13]. Here we show that DNA origami [14-16] can be used to fabricate nanoscale robots that are capable of dynamically interacting with each other [17-18] in a living animal. The interactions generate logical outputs, which are relayed to switch molecular payloads on or off. As a proof-of-principle, we use the system to create architectures that emulate various logic gates (AND, OR, XOR, NAND, NOT, CNOT, and a half adder). Following an ex vivo prototyping phase, we successfully employed the DNA origami robots in living cockroaches (Blaberus discoidalis) to control a molecule that targets the cells of the animal. PMID:24705510
NASA Astrophysics Data System (ADS)
Urban, Matthias; Möller, Robert; Fritzsche, Wolfgang
2003-02-01
DNA analytics is a growing field based on the increasing knowledge about the genome with special implications for the understanding of molecular bases for diseases. Driven by the need for cost-effective and high-throughput methods for molecular detection, DNA chips are an interesting alternative to more traditional analytical methods in this field. The standard readout principle for DNA chips is fluorescence based. Fluorescence is highly sensitive and broadly established, but shows limitations regarding quantification (due to signal and/or dye instability) and the need for sophisticated (and therefore high-cost) equipment. This article introduces a readout system for an alternative detection scheme based on electrical detection of nanoparticle-labeled DNA. If labeled DNA is present in the analyte solution, it will bind on complementary capture DNA immobilized in a microelectrode gap. A subsequent metal enhancement step leads to a deposition of conductive material on the nanoparticles, and finally an electrical contact between the electrodes. This detection scheme offers the potential for a simple (low-cost as well as robust) and highly miniaturizable method, which could be well-suited for point-of-care applications in the context of lab-on-a-chip technologies. The demonstrated apparatus allows a parallel readout of an entire array of microstructured measurement sites. The readout is combined with data-processing by an embedded personal computer, resulting in an autonomous instrument that measures and presents the results. The design and realization of such a system is described, and first measurements are presented.
NASA Technical Reports Server (NTRS)
Ojha, R. P.; Dhingra, M. M.; Sarma, M. H.; Myer, Y. P.; Setlik, R. F.; Shibata, M.; Kazim, A. L.; Ornstein, R. L.; Rein, R.; Turner, C. J.;
1997-01-01
The structure of an anti-HIV-1 ribozyme-DNA abortive substrate complex was investigated by 750 MHz NMR and computer modeling experiments. The ribozyme was a chimeric molecule with 30 residues-18 DNA nucleotides, and 12 RNA residues in the conserved core. The DNA substrate analog had 17 residues. The chimeric ribozyme and the DNA substrate formed a shortened ribozyme-abortive substrate complex of 47 nucleotides with two DNA stems (stems I and III) and a loop consisting of the conserved core residues. Circular dichroism spectra showed that the DNA stems assume A-family conformation at the NMR concentration and a temperature of 15 degrees C, contrary to the conventional wisdom that DNA duplexes in aqueous solution populate entirely in the B-form. It is proposed that the A-family RNA residues at the core expand the A-family initiated at the core into the DNA stems because of the large free energy requirement for the formation of A/B junctions. Assignments of the base H8/H6 protons and H1' of the 47 residues were made by a NOESY walk. In addition to the methyl groups of all T's, the imino resonances of stems I and III and AH2's were assigned from appropriate NOESY walks. The extracted NMR data along with available crystallographic data, were used to derive a structural model of the complex. Stems I and III of the final model displayed a remarkable similarity to the A form of DNA; in stem III, a GC base pair was found to be moving into the floor of the minor groove defined by flanking AT pairs; data suggest the formation of a buckled rhombic structure with the adjacent pair; in addition, the base pair at the interface of stem III and the loop region displayed deformed geometry. The loop with the catalytic core, and the immediate region of the stems displayed conformational multiplicity within the NMR time scale. A catalytic mechanism for ribozyme action based on the derived structure, and consistent with biochemical data in the literature, is proposed. 
The complex between the anti HIV-1 gag ribozyme and its abortive DNA substrate manifests in the detection of a continuous track of A.T base pairs; this suggests that the interaction between the ribozyme and its DNA substrate is stronger than the one observed in the case of the free ribozyme where the bases in stem I and stem III regions interact strongly with the ribozyme core region (Sarma, R. H., et al. FEBS Letters 375, 317-23, 1995). The complex formation provides certain guidelines in the design of suitable therapeutic ribozymes. If the residues in the ribozyme stem regions interact with the conserved core, it may either prevent or interfere with the formation of a catalytically active tertiary structure.
Biswas, Sovan; Sen, Suman; Im, JongOne; Biswas, Sudipta; Krstic, Predrag; Ashcroft, Brian; Borges, Chad; Zhao, Yanan; Lindsay, Stuart; Zhang, Peiming
2016-12-27
A reader molecule, which recognizes all the naturally occurring nucleobases in an electron tunnel junction, is required for sequencing DNA by a recognition tunneling (RT) technique, referred to as a universal reader. In the present study, we have designed a series of heterocyclic carboxamides based on hydrogen bonding and a large-sized pyrene ring based on a π-π stacking interaction as universal reader candidates. Each of these compounds was synthesized to bear a thiolated linker for attachment to metal electrodes and examined for their interactions with naturally occurring DNA nucleosides and nucleotides by 1H NMR, ESI-MS, computational calculations, and surface plasmon resonance. RT measurements were carried out in a scanning tunnel microscope. All of these molecules generated electrical signals with DNA nucleotides in tunneling junctions under physiological conditions (phosphate buffered aqueous solution, pH 7.4). Using a support vector machine as a tool for data analysis, we found that these candidates distinguished among naturally occurring DNA nucleotides with the accuracy of pyrene (by π-π stacking interactions) > azole carboxamides (by hydrogen-bonding interactions). In addition, the pyrene reader operated efficiently in a larger tunnel junction. However, the azole carboxamide could read abasic (AP) monophosphate, a product from spontaneous base hydrolysis or an intermediate of base excision repair. Thus, we envision that sequencing DNA using both π-π stacking and hydrogen-bonding-based universal readers in parallel should generate more comprehensive genome sequences than sequencing based on either reader molecule alone.
Roberts, Victoria A.; Pique, Michael E.; Hsu, Simon; Li, Sheng; Slupphaug, Geir; Rambo, Robert P.; Jamison, Jonathan W.; Liu, Tong; Lee, Jun H.; Tainer, John A.; Ten Eyck, Lynn F.; Woods, Virgil L.
2012-01-01
X-ray crystallography provides excellent structural data on protein–DNA interfaces, but crystallographic complexes typically contain only small fragments of large DNA molecules. We present a new approach that can use longer DNA substrates and reveal new protein–DNA interactions even in extensively studied systems. Our approach combines rigid-body computational docking with hydrogen/deuterium exchange mass spectrometry (DXMS). DXMS identifies solvent-exposed protein surfaces; docking is used to create a 3-dimensional model of the protein–DNA interaction. We investigated the enzyme uracil-DNA glycosylase (UNG), which detects and cleaves uracil from DNA. UNG was incubated with a 30 bp DNA fragment containing a single uracil, giving the complex with the abasic DNA product. Compared with free UNG, the UNG–DNA complex showed increased solvent protection at the UNG active site and at two regions outside the active site: residues 210–220 and 251–264. Computational docking also identified these two DNA-binding surfaces, but neither shows DNA contact in UNG–DNA crystallographic structures. Our results can be explained by separation of the two DNA strands on one side of the active site. These non-sequence-specific DNA-binding surfaces may aid local uracil search, contribute to binding the abasic DNA product and help present the DNA product to APE-1, the next enzyme on the DNA-repair pathway. PMID:22492624
Carbon-14 decay as a source of non-canonical bases in DNA.
Sassi, Michel; Carter, Damien J; Uberuaga, Blas P; Stanek, Chris R; Marks, Nigel A
2014-01-01
Significant experimental effort has been applied to study radioactive beta-decay in biological systems. Atomic-scale knowledge of this transmutation process is lacking due to the absence of computer simulations. Carbon-14 is an important beta-emitter, being ubiquitous in the environment and an intrinsic part of the genetic code. Over a lifetime, around 50 billion (14)C decays occur within human DNA. We apply ab initio molecular dynamics to quantify (14)C-induced bond rupture in a variety of organic molecules, including DNA base pairs. We show that double bonds and ring structures confer radiation resistance. These features, present in the canonical bases of the DNA, enhance their resistance to (14)C-induced bond-breaking. In contrast, the sugar group of the DNA and RNA backbone is vulnerable to single-strand breaking. We also show that Carbon-14 decay provides a mechanism for creating mutagenic wobble-type mispairs. The observation that DNA has a resistance to natural radioactivity has not previously been recognized. We show that (14)C decay can be a source for generating non-canonical bases. Our findings raise questions such as how the genetic apparatus deals with the appearance of an extra nitrogen in the canonical bases. It is not obvious whether or not the DNA repair mechanism detects this modification nor how DNA replication is affected by a non-canonical nucleobase. Accordingly, (14)C may prove to be a source of genetic alteration that is impossible to avoid due to the universal presence of radiocarbon in the environment. © 2013.
Lemkul, Justin A; MacKerell, Alexander D
2017-05-09
Empirical force fields seek to relate the configuration of a set of atoms to its energy, thus yielding the forces governing its dynamics, using classical physics rather than more expensive quantum mechanical calculations that are computationally intractable for large systems. Most force fields used to simulate biomolecular systems use fixed atomic partial charges, neglecting the influence of electronic polarization, instead making use of a mean-field approximation that may not be transferable across environments. Recent hardware and software developments make polarizable simulations feasible, and to this end, polarizable force fields represent the next generation of molecular dynamics simulation technology. In this work, we describe the refinement of a polarizable force field for DNA based on the classical Drude oscillator model by targeting quantum mechanical interaction energies and conformational energy profiles of model compounds necessary to build a complete DNA force field. The parametrization strategy employed in the present work seeks to correct weak base stacking in A- and B-DNA and the unwinding of Z-DNA observed in the previous version of the force field, called Drude-2013. Refinement of base nonbonded terms and reparametrization of dihedral terms in the glycosidic linkage, deoxyribofuranose rings, and important backbone torsions resulted in improved agreement with quantum mechanical potential energy surfaces. Notably, we expand on previous efforts by explicitly including Z-DNA conformational energetics in the refinement.
Markov chains: computing limit existence and approximations with DNA.
Cardona, M; Colomer, M A; Conde, J; Miret, J M; Miró, J; Zaragoza, A
2005-09-01
We present two algorithms to perform computations over Markov chains. The first one determines whether the sequence of powers of the transition matrix of a Markov chain converges or not to a limit matrix. If it does converge, the second algorithm enables us to estimate this limit. The combination of these algorithms allows the computation of a limit using DNA computing. In this sense, we have encoded the states and the transition probabilities using strands of DNA for generating paths of the Markov chain.
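The two questions in this abstract (does the sequence of powers of the transition matrix converge, and what is the limit) can also be posed numerically. The sketch below is a conventional floating-point stand-in, not the DNA-strand encoding the authors describe. The function names, tolerance, and iteration cap are assumptions, and a very slowly converging chain could be misreported as non-convergent.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def limit_of_powers(p, tol=1e-12, max_iter=10_000):
    """Return an estimate of lim P^k, or None if no convergence is observed."""
    current = p
    for _ in range(max_iter):
        nxt = mat_mul(current, p)
        # Largest entry-wise change between successive powers.
        diff = max(abs(x - y)
                   for row_new, row_old in zip(nxt, current)
                   for x, y in zip(row_new, row_old))
        if diff < tol:
            return nxt
        current = nxt
    return None


# A periodic chain: its powers alternate and never settle.
periodic = [[0.0, 1.0], [1.0, 0.0]]
# A regular chain: every row of the limit is the stationary distribution (5/6, 1/6).
regular = [[0.9, 0.1], [0.5, 0.5]]
```

Here `limit_of_powers(periodic)` returns `None`, while `limit_of_powers(regular)` returns a matrix whose rows both approximate the stationary distribution.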
Rivilla, Iván; de Cózar, Abel; Schäfer, Thomas; Hernandez, Frank J.; Bittner, Alexander M.; Eleta-Lopez, Aitziber; Aboudzadeh, Ali; Santos, José I.; Miranda, José I.
2017-01-01
A novel catalytic system based on covalently modified DNA is described. This catalyst promotes 1,3-dipolar reactions between azomethine ylides and maleimides. The catalytic system is based on the distortion of the double helix of DNA by means of the formation of Pt(ii) adducts with guanine units. This distortion, similar to that generated in the interaction of DNA with platinum chemotherapeutic drugs, generates active sites that can accommodate N-metallated azomethine ylides. The proposed reaction mechanism, based on QM(DFT)/MM calculations, is compatible with thermally allowed concerted (but asynchronous) [π4s + π2s] mechanisms leading to the exclusive formation of racemic endo-cycloadducts. PMID:29147531
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
Effects of sequence on DNA wrapping around histones
NASA Astrophysics Data System (ADS)
Ortiz, Vanessa
2011-03-01
A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
DNA melting profiles from a matrix method.
Poland, Douglas
2004-02-05
In this article we give a new method for the calculation of DNA melting profiles. Based on the matrix formulation of the DNA partition function, the method relies for its efficiency on the fact that the required matrices are very sparse, essentially reducing matrix multiplication to vector multiplication and thus making the computer time required to treat a DNA molecule containing N base pairs proportional to N^2. A key ingredient in the method is the result that multiplication by the inverse matrix can also be reduced to vector multiplication. The task of calculating the melting profile for the entire genome is further reduced by treating regions of the molecule between helix-plateaus, thus breaking the molecule up into independent parts that can each be treated individually. The method is easily modified to incorporate changes in the assignment of statistical weights to the different structural features of DNA. We illustrate the method using the genome of Haemophilus influenzae. Copyright 2003 Wiley Periodicals, Inc.
Application of permanents of square matrices for DNA identification in multiple-fatality cases
2013-01-01
Background: DNA profiling is essential for individual identification. In forensic medicine, the likelihood ratio (LR) is commonly used to identify individuals. The LR is calculated by comparing two hypotheses for the sample DNA: that the sample DNA is identical or related to a reference DNA, and that it is randomly sampled from a population. For multiple-fatality cases, however, identification should be considered as an assignment problem, and a particular sample and reference pair should therefore be compared with other possibilities conditional on the entire dataset. Results: We developed a new method to compute the probability via permanents of square matrices of nonnegative entries. As the exact permanent is known as a #P-complete problem, we applied the Huber–Law algorithm to approximate the permanents. We performed a computer simulation to evaluate the performance of our method via receiver operating characteristic curve analysis compared with LR under the assumption of a closed incident. Differences between the two methods were well demonstrated when references provided neither obligate alleles nor impossible alleles. The new method exhibited higher sensitivity (0.188 vs. 0.055) at a threshold value of 0.999, at which specificity was 1, and it exhibited higher area under a receiver operating characteristic curve (0.990 vs. 0.959, P = 9.6E-15). Conclusions: Our method therefore offers a solution for a computationally intensive assignment problem and may be a viable alternative to LR-based identification for closed-incident multiple-fatality cases. PMID:23962363
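To make the role of the permanent concrete, the sketch below computes it exactly by brute force over all permutations, which is practical only for very small matrices. It does not implement the Huber–Law approximation named in the abstract. The posterior formula in `assignment_probability` assumes every one-to-one assignment is equally likely a priori; that assumption and the function names are illustrative rather than taken from the paper.

```python
from itertools import permutations
from math import prod


def permanent(a):
    """Exact permanent of a square matrix by summing over all permutations."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n))
               for s in permutations(range(n)))


def assignment_probability(a, i, j):
    """Probability that sample i pairs with reference j, given the whole matrix.

    a[i][j] is taken to be the likelihood of sample i under reference j
    (an assumed reading of the matrix entries).
    """
    n = len(a)
    minor = [[a[r][c] for c in range(n) if c != j]
             for r in range(n) if r != i]
    return a[i][j] * permanent(minor) / permanent(a)


likelihoods = [[1, 2],
               [3, 4]]
```

For this 2x2 matrix the permanent is 1*4 + 2*3 = 10, so sample 0 pairs with reference 0 with probability 0.4 and with reference 1 with probability 0.6.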
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.
Tajima, K
1988-11-01
This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
Would Dissociative Recombination of DNA+ be a Possible Pathway of DNA Damage?
NASA Astrophysics Data System (ADS)
Kwon, H. C.; Chen, Z. P.; Strom, R. A.; Andrianarijaona, V. M.
2015-05-01
Dissociative recombination (DR) is known to be a very efficient process for the destruction of molecular cations into neutral particles. During the past few years, the focus of DR studies has expanded from small inorganic molecules to macromolecular cations. We are probing the possibility of the DR of DNA+ after ionization of DNA, for example by ionizing radiation. We are therefore investigating the existence of autoionization states within the nucleotide bases (guanine, adenine, cytosine, and thymine). Our results from computational analysis using the electronic structure program ORCA will be presented. The authors wish to give special thanks to the Pacific Union College Student Senate for their financial support.
Charge Structure and Counterion Distribution in Hexagonal DNA Liquid Crystal
Dai, Liang; Mu, Yuguang; Nordenskiöld, Lars; Lapp, Alain; van der Maarel, Johan R. C.
2007-01-01
A hexagonal liquid crystal of DNA fragments (double-stranded, 150 basepairs) with tetramethylammonium (TMA) counterions was investigated with small angle neutron scattering (SANS). We obtained the structure factors pertaining to the DNA and counterion density correlations with contrast matching in the water. Molecular dynamics (MD) computer simulation of a hexagonal assembly of nine DNA molecules showed that the inter-DNA distance fluctuates with a correlation time around 2 ns and a standard deviation of 8.5% of the interaxial spacing. The MD simulation also showed a minimal effect of the fluctuations in inter-DNA distance on the radial counterion density profile and significant penetration of the grooves by TMA. The radial density profile of the counterions was also obtained from a Monte Carlo (MC) computer simulation of a hexagonal array of charged rods with fixed interaxial spacing. Strong ordering of the counterions between the DNA molecules and the absence of charge fluctuations at longer wavelengths was shown by the SANS number and charge structure factors. The DNA-counterion and counterion structure factors are interpreted with the correlation functions derived from the Poisson-Boltzmann equation, MD, and MC simulation. Best agreement is observed between the experimental structure factors and the prediction based on the Poisson-Boltzmann equation and/or MC simulation. The SANS results show that TMA is too large to penetrate the grooves to a significant extent, in contrast to what is shown by MD simulation. PMID:17098791
Myers, E W; Mount, D W
1986-01-01
We describe a program which may be used to find approximate matches to a short predefined DNA sequence in a larger target DNA sequence. The program predicts the usefulness of specific DNA probes and sequencing primers and finds nearly identical sequences that might represent the same regulatory signal. The program is written in the C programming language and will run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The program has been integrated into an existing software package for the IBM personal computer (see article by Mount and Conrad, this volume). Some examples of its use are given. PMID:3753785
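A minimal Python sketch of approximate matching follows. It uses a sliding window and counts substitutions only. The original C program's scoring scheme is not given in the abstract, so the function name, the mismatch threshold, and the absence of insertions and deletions are assumptions.

```python
def approximate_matches(pattern, target, max_mismatches):
    """Return (start, mismatches) for each window within max_mismatches of pattern."""
    hits = []
    m = len(pattern)
    for start in range(len(target) - m + 1):
        window = target[start:start + m]
        # Hamming distance between the pattern and the current window.
        mismatches = sum(1 for p, t in zip(pattern, window) if p != t)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical example: search for a TATAAT-like signal allowing one mismatch.
hits = approximate_matches("TATAAT", "GGTATAATCCTACAATGG", 1)
```

Here `hits` holds an exact match at offset 2 and a one-mismatch match at offset 10.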
Lin, Xiaodong; Liu, Yaqing; Deng, Jiankang; Lyu, Yanlong; Qian, Pengcheng; Li, Yunfei; Wang, Shuo
2018-02-21
The integration of multiple DNA logic gates on a universal platform to implement advance logic functions is a critical challenge for DNA computing. Herein, a straightforward and powerful strategy in which a guanine-rich DNA sequence lighting up a silver nanocluster and fluorophore was developed to construct a library of logic gates on a simple DNA-templated silver nanoclusters (DNA-AgNCs) platform. This library included basic logic gates, YES, AND, OR, INHIBIT, and XOR, which were further integrated into complex logic circuits to implement diverse advanced arithmetic/non-arithmetic functions including half-adder, half-subtractor, multiplexer, and demultiplexer. Under UV irradiation, all the logic functions could be instantly visualized, confirming an excellent repeatability. The logic operations were entirely based on DNA hybridization in an enzyme-free and label-free condition, avoiding waste accumulation and reducing cost consumption. Interestingly, a DNA-AgNCs-based multiplexer was, for the first time, used as an intelligent biosensor to identify pathogenic genes, E. coli and S. aureus genes, with a high sensitivity. The investigation provides a prototype for the wireless integration of multiple devices on even the simplest single-strand DNA platform to perform diverse complex functions in a straightforward and cost-effective way.
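The way basic gates compose into the arithmetic and routing functions listed above can be shown with ordinary Boolean functions. This sketch models the logic only, not the DNA-AgNCs chemistry. The function names and the 0/1 encoding of inputs and outputs are assumptions.

```python
def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def xor_gate(a, b):
    return a ^ b


def inhibit_gate(a, b):
    """Output is 1 only when a is present and b is absent."""
    return a & (1 - b)


def half_adder(a, b):
    """Return (sum, carry)."""
    return xor_gate(a, b), and_gate(a, b)


def half_subtractor(a, b):
    """Return (difference, borrow) for a - b."""
    return xor_gate(a, b), inhibit_gate(b, a)


def multiplexer(select, d0, d1):
    """Forward d0 when select is 0 and d1 when select is 1."""
    return or_gate(inhibit_gate(d0, select), and_gate(d1, select))


def demultiplexer(select, d):
    """Route d to output 0 when select is 0, otherwise to output 1."""
    return inhibit_gate(d, select), and_gate(d, select)
```

In this encoding the half-subtractor's borrow output is simply an INHIBIT gate with its inputs swapped, which is one reason a small gate library suffices for all four circuits.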
MSuPDA: A memory efficient algorithm for sequence alignment.
Khan, Mohammad Ibrahim; Kamal, Md Sarwar; Chowdhury, Linkon
2015-01-16
Space complexity is a central concern in DNA sequence alignment. In this regard, MSuPDA (Memory Saving under Pushdown Automata) can help reduce the memory occupied during alignment. In the proposed process, an Anchor Seed (AS) is selected from the given data set of nucleotide base pairs for local sequence alignment. A Quick Splitting (QS) technique separates the Anchor Seed from all the DNA genome segments. The selected Anchor Seed is placed in the input unit of the pushdown automaton (PDA), and the whole DNA genome segments are placed on the PDA's stack. The Anchor Seed from the input unit is matched against the DNA genome segments on the stack. Each nucleotide, whether it is a match, a mismatch, or an indel, is popped from the stack under the control of the PDA's control unit. Each pop operation frees the memory cell occupied by that nucleotide base pair.
Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei
2018-01-01
DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as a straightforward species tag, the identification usage of barcode sequences is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than full-length DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were used to set up the decision rules that assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the DNA barcode for the rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
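The idea of discriminating species by SNP columns can be sketched on a toy alignment. The code below is not the RSB procedure itself: it splits on every allele at a column rather than into exactly two groups, it picks columns greedily, and the species names and sequences are invented for illustration.

```python
def snp_columns(seqs):
    """Alignment columns where at least two sequences differ."""
    rows = list(seqs.values())
    return [c for c in range(len(rows[0])) if len({r[c] for r in rows}) > 1]


def build_tree(seqs, columns):
    """Recursively split species on one SNP column at a time (greedy choice)."""
    if len(seqs) == 1:
        return next(iter(seqs))  # leaf: a single species name

    def largest_group(c):
        counts = {}
        for s in seqs.values():
            counts[s[c]] = counts.get(s[c], 0) + 1
        return max(counts.values())

    if not columns:
        return sorted(seqs)  # nothing left to split on
    # Prefer the column whose largest allele group is smallest (most even split).
    best = min(columns, key=largest_group)
    if largest_group(best) == len(seqs):
        return sorted(seqs)  # these species cannot be told apart
    groups = {}
    for name, s in seqs.items():
        groups.setdefault(s[best], {})[name] = s
    return {"column": best,
            "children": {allele: build_tree(g, columns)
                         for allele, g in groups.items()}}


# Hypothetical aligned sequences for three made-up species.
toy = {"sp1": "ACGT", "sp2": "ACGA", "sp3": "TCGA"}
tree = build_tree(toy, snp_columns(toy))
```

On this toy input only columns 0 and 3 vary, and the resulting tree separates `sp3` at column 0 and then `sp1` from `sp2` at column 3.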
21st International Conference on DNA Computing and Molecular Programming: 8.1 Biochemistry
include information storage and biological applications of DNA systems, biomolecular chemical reaction networks, applications of self-assembled DNA...nanostructures, tile self-assembly and computation, principles and models of self-assembly, and strand displacement and biomolecular circuits. The fund
Duplex Interrogation by a Direct DNA Repair Protein in Search of Base Damage
Yi, Chengqi; Chen, Baoen; Qi, Bo; Zhang, Wen; Jia, Guifang; Zhang, Liang; Li, Charles J.; Dinner, Aaron R.; Yang, Cai-Guang; He, Chuan
2012-01-01
ALKBH2 is a direct DNA repair dioxygenase guarding mammalian genome against N1-methyladenine, N3-methylcytosine, and 1,N6-ethenoadenine damage. A prerequisite for repair is to identify these lesions in the genome. Here we present crystal structures of ALKBH2 bound to different duplex DNAs. Together with computational and biochemical analyses, our results suggest that DNA interrogation by ALKBH2 displays two novel features: i) ALKBH2 probes base-pair stability and detects base pairs with reduced stability; ii) ALKBH2 does not have nor need a “damage-checking site”, which is critical for preventing spurious base-cleavage for several glycosylases. The demethylation mechanism of ALKBH2 insures that only cognate lesions are oxidized and reversed to normal bases, and that a flipped, non-substrate base remains intact in the active site. Overall, the combination of duplex interrogation and oxidation chemistry allows ALKBH2 to detect and process diverse lesions efficiently and correctly. PMID:22659876
DMINDA: an integrated web server for DNA motif identification and analyses
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-01-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
A programming language for composable DNA circuits
Phillips, Andrew; Cardelli, Luca
2009-01-01
Recently, a range of information-processing circuits have been implemented in DNA by using strand displacement as their main computational mechanism. Examples include digital logic circuits and catalytic signal amplification circuits that function as efficient molecular detectors. As new paradigms for DNA computation emerge, the development of corresponding languages and tools for these paradigms will help to facilitate the design of DNA circuits and their automatic compilation to nucleotide sequences. We present a programming language for designing and simulating DNA circuits in which strand displacement is the main computational mechanism. The language includes basic elements of sequence domains, toeholds and branch migration, and assumes that strands do not possess any secondary structure. The language is used to model and simulate a variety of circuits, including an entropy-driven catalytic gate, a simple gate motif for synthesizing large-scale circuits and a scheme for implementing an arbitrary system of chemical reactions. The language is a first step towards the design of modelling and simulation tools for DNA strand displacement, which complements the emergence of novel implementation strategies for DNA computing. PMID:19535415
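The basic elements named in this abstract (sequence domains, toeholds, branch migration) can be mimicked with a very small data model. The sketch below is a toy in Python and is not the authors' language or its syntax. The `Gate` class, the `*` suffix for complementary domains, and the neglect of strand orientation and reaction kinetics are all simplifying assumptions.

```python
from dataclasses import dataclass


def complement(domain):
    """Map a domain name to its complement, written with a trailing '*'."""
    return domain[:-1] if domain.endswith("*") else domain + "*"


@dataclass(frozen=True)
class Gate:
    template: tuple  # bottom-strand domains, e.g. ("t*", "x*")
    bound: tuple     # top-strand domains paired with the end of the template


def displace(gate, invader):
    """Return (new_gate, released_strand) if the invader displaces the incumbent.

    The invader must bind an exposed toehold and then match every remaining
    template domain, so that branch migration can run to completion.
    """
    exposed = len(gate.template) - len(gate.bound)
    needed = tuple(complement(d) for d in gate.template)
    if exposed < 1 or tuple(invader) != needed:
        return None
    return Gate(gate.template, needed), gate.bound


# A gate with an exposed toehold t* and an incumbent strand bound at x*.
gate = Gate(template=("t*", "x*"), bound=("x",))
result = displace(gate, ("t", "x"))
```

Here `result` holds the fully paired gate together with the released incumbent `("x",)`, whereas an invader lacking the toehold domain, such as `("x",)`, is rejected.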
Weak nanoscale chaos and anomalous relaxation in DNA
NASA Astrophysics Data System (ADS)
Mazur, Alexey K.
2017-06-01
Anomalous nonexponential relaxation in hydrated biomolecules is commonly attributed to the complexity of the free-energy landscapes, similarly to polymers and glasses. It was found recently that the hydrogen-bond breathing of terminal DNA base pairs exhibits a slow power-law relaxation attributable to weak Hamiltonian chaos, with parameters similar to experimental data. Here, the relationship is studied between this motion and spectroscopic signals measured in DNA with a small molecular photoprobe inserted into the base-pair stack. To this end, the earlier computational approach in combination with an analytical theory is applied to the experimental DNA fragment. It is found that the intensity of breathing dynamics is strongly increased in the internal base pairs that flank the photoprobe, with anomalous relaxation quantitatively close to that in terminal base pairs. A physical mechanism is proposed to explain the coupling between the relaxation of base-pair breathing and the experimental response signal. It is concluded that the algebraic relaxation observed experimentally is very likely a manifestation of weakly chaotic dynamics of hydrogen-bond breathing in the base pairs stacked to the photoprobe and that the weak nanoscale chaos can represent a ubiquitous hidden source of nonexponential relaxation in ultrafast spectroscopy.
Kawano, Tomonori
2013-03-01
There have been a wide variety of approaches for handling pieces of DNA as "unplugged" tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes as the media for storing digitally compressed image data are proposed for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into font images coded on separate DNA chains with both the coding regions, in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original "passwords." The latter processes require the molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interest over the image-coding DNA are also discussed.
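The abstract does not state the run-length encoding rule itself, so the following Python sketch only illustrates the general idea of run-length encoding a row of black-and-white pixels and writing the run lengths as bases. The digit-to-base mapping (0->A, 1->C, 2->G, 3->T), the fixed two-base width per run, and all function names are assumptions made for illustration, not the scheme used in the paper.

```python
from itertools import groupby

BASES = "ACGT"  # hypothetical mapping of base-4 digits to nucleotides


def run_lengths(bits):
    """Run-length encode a row of 0/1 pixels; the first run always counts 0s
    (it has length 0 when the row starts with a 1)."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    if bits and bits[0] == 1:
        runs.insert(0, 0)
    return runs


def runs_to_dna(runs, digits=2):
    """Write each run length as `digits` base-4 digits, most significant first."""
    out = []
    for r in runs:
        if r >= 4 ** digits:
            raise ValueError("run too long for the chosen width")
        out.append("".join(BASES[(r // 4 ** i) % 4] for i in reversed(range(digits))))
    return "".join(out)


def dna_to_bits(dna, digits=2):
    """Invert runs_to_dna followed by run_lengths."""
    bits, value = [], 0
    for k in range(0, len(dna), digits):
        r = 0
        for ch in dna[k:k + digits]:
            r = r * 4 + BASES.index(ch)
        bits.extend([value] * r)
        value ^= 1  # runs alternate between 0s and 1s
    return bits


row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
encoded = runs_to_dna(run_lengths(row))  # "ATAGACCC"
```

A fixed width per run keeps decoding trivial at the cost of capping the run length; a real design would also have to respect biochemical constraints that this sketch ignores.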
Three-input majority logic gate and multiple input logic circuit based on DNA strand displacement.
Li, Wei; Yang, Yang; Yan, Hao; Liu, Yan
2013-06-12
In biomolecular programming, the properties of biomolecules such as proteins and nucleic acids are harnessed for computational purposes. The field has gained considerable attention due to the possibility of exploiting the massive parallelism that is inherent in natural systems to solve computational problems. DNA has already been used to build complex molecular circuits, where the basic building blocks are logic gates that produce single outputs from one or more logical inputs. We designed and experimentally realized a three-input majority gate based on DNA strand displacement. One of the key features of a three-input majority gate is that the three inputs have equal priority, and the output will be true if any two of the inputs are true. Our design consists of a central, circular DNA strand with three unique domains between which are identical joint sequences. Before inputs are introduced to the system, each domain and half of each joint is protected by one complementary ssDNA that displays a toehold for subsequent displacement by the corresponding input. With this design the relationship between any two domains is analogous to the relationship between inputs in a majority gate. Displacing two or more of the protection strands will expose at least one complete joint and return a true output; displacing none or only one of the protection strands will not expose a complete joint and will return a false output. Further, we designed and realized a complex five-input logic gate based on the majority gate described here. By controlling two of the five inputs the complex gate can realize every combination of OR and AND gates of the other three inputs.
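The equivalence between "a complete joint is exposed" and "at least two inputs are true" can be checked exhaustively. The Python sketch below is a logical abstraction only: it assumes joint i lies between domains i and (i + 1) mod 3 on the circular strand, and the function names are hypothetical. It says nothing about kinetics or yields.

```python
from itertools import product


def majority3(a, b, c):
    """Boolean three-input majority: true when at least two inputs are true."""
    return (a + b + c) >= 2


def joint_exposed_output(displaced):
    """Circular-strand abstraction: a joint is completely exposed only when the
    protection strands on both neighbouring domains have been displaced."""
    return any(displaced[i] and displaced[(i + 1) % 3] for i in range(3))


# Truth table: the joint model agrees with the majority function on all 8 inputs.
table = {bits: joint_exposed_output(bits) for bits in product([False, True], repeat=3)}
```

Holding one input of a majority gate fixed turns it into a two-input gate: fixing it to false gives AND, fixing it to true gives OR. This is a standard property of majority logic and hints at how control inputs can select between AND and OR behaviour.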
XLS (c9orf142) is a new component of mammalian DNA double-stranded break repair.
Craxton, A; Somers, J; Munnur, D; Jukes-Jones, R; Cain, K; Malewicz, M
2015-06-01
Repair of double-stranded DNA breaks (DSBs) in mammalian cells primarily occurs by the non-homologous end-joining (NHEJ) pathway, which requires seven core proteins (Ku70/Ku86, DNA-PKcs (DNA-dependent protein kinase catalytic subunit), Artemis, XRCC4-like factor (XLF), XRCC4 and DNA ligase IV). Here we show using combined affinity purification and mass spectrometry that DNA-PKcs co-purifies with all known core NHEJ factors. Furthermore, we have identified a novel evolutionarily conserved protein associated with DNA-PKcs: c9orf142. Computer-based modelling of c9orf142 predicted a structure very similar to XRCC4, hence we have named c9orf142 XLS (XRCC4-like small protein). Depletion of c9orf142/XLS in cells impaired DSB repair consistent with a defect in NHEJ. Furthermore, c9orf142/XLS interacted with other core NHEJ factors. These results demonstrate the existence of a new component of the NHEJ DNA repair pathway in mammalian cells.
Kuang, Hua; Ma, Wei; Xu, Liguang; Wang, Libing; Xu, Chuanlai
2013-11-19
Polymerase chain reaction (PCR) is an essential tool in biotechnology laboratories and is becoming increasingly important in other areas of research. Extensive data obtained over the last 12 years has shown that the combination of PCR with nanoscale dispersions can resolve issues in the preparation of DNA-based materials that include both inorganic and organic nanoscale components. Unlike conventional DNA hybridization and antibody-antigen complexes, PCR provides a new, effective assembly platform that both increases the yield of DNA-based nanomaterials and allows researchers to program and control assembly with predesigned parameters including those assisted and automated by computers. As a result, this method allows researchers to optimize the combinatorial selection of the DNA strands for their nanoparticle conjugates. We have developed a PCR approach for producing various nanoscale assemblies including organic motifs such as small molecules, macromolecules, and inorganic building blocks, such as nanorods (NRs), metal, semiconductor, and magnetic nanoparticles (NPs). We start with a nanoscale primer and then modify that building block using the automated steps of PCR-based assembly including initialization, denaturation, annealing, extension, final elongation, and final hold. The intermediate steps of denaturation, annealing, and extension are cyclic, and we use computer control so that the assembled superstructures reach their predetermined complexity. The structures assembled using a small number of PCR cycles show a lower polydispersity than similar discrete structures obtained by direct hybridization between the nanoscale building blocks.
Using different building blocks, we assembled the following structural motifs by PCR: (1) discrete nanostructures (NP dimers, NP multimers including trimers, pyramids, tetramers or hexamers, etc.), (2) branched NP superstructures and heterochains, (3) NP satellite-like superstructures, (4) Y-shaped nanostructures and DNA networks, (5) protein-DNA co-assembly structures, and (6) DNA block copolymers including trimers and pentamers. These results affirm that this method can produce a variety of chemical structures with tunable yields. Using PCR-based preparation of DNA-bridged nanostructures, we can program the assembly of the nanoscale blocks through the adjustment of the primer intensity on the assembled units, the number of PCR cycles, or both. The resulting structures are highly complex and diverse and have interesting dynamics and collective properties. Potential applications of these materials include chiroptical materials, probe fabrication, and environmental and biomedical sensors.
Knowledge-Based Elastic Potentials for Docking Drugs or Proteins with Nucleic Acids
Ge, Wei; Schneider, Bohdan; Olson, Wilma K.
2005-01-01
Elastic ellipsoidal functions defined by the observed hydration patterns around the DNA bases provide a new basis for measuring the recognition of ligands in the grooves of double-helical structures. Here a set of knowledge-based potentials suitable for quantitative description of such behavior is extracted from the observed positions of water molecules and amino acid atoms that form hydrogen bonds with the nitrogenous bases in high resolution crystal structures. Energies based on the displacement of hydrogen-bonding sites on drugs in DNA-crystal complexes relative to the preferred locations of water binding around the heterocyclic bases are low, pointing to the reliability of the potentials and the apparent displacement of water molecules by drug atoms in these structures. The validity of the energy functions has been further examined in a series of sequence substitution studies based on the structures of DNA bound to polyamides that have been designed to recognize the minor-groove edges of Watson-Crick basepairs. The higher energies of binding to incorrect sequences superimposed (without conformational adjustment or displacement of polyamide ligands) on observed high resolution structures confirm the hypothesis that the drug subunits associate with specific DNA bases. The knowledge-based functions also account satisfactorily for the measured free energies of DNA-polyamide association in solution and the observed sites of polyamide binding on nucleosomal DNA. The computations are generally consistent with mechanisms by which minor-groove binding ligands are thought to recognize DNA basepairs. The calculations suggest that the asymmetric distributions of hydrogen-bond-forming atoms on the minor-groove edge of the basepairs may underlie ligand discrimination of G·C from C·G pairs, in addition to the commonly believed role of steric hindrance. 
The analysis of polyamide-bound nucleosomal structures reveals other discrepancies in the expected chemical design, including unexpected contacts to DNA and modified basepair targets of some ligands. The ellipsoidal potentials thus appear promising as a mathematical tool for the study of drug- and protein-DNA interactions and for gaining new insights into DNA-binding mechanisms. PMID:15501936
Enol tautomers of Watson-Crick base pair models are metastable because of nuclear quantum effects.
Pérez, Alejandro; Tuckerman, Mark E; Hjalmarson, Harold P; von Lilienfeld, O Anatole
2010-08-25
Intermolecular enol tautomers of Watson-Crick base pairs could emerge spontaneously via interbase double proton transfer. It has been hypothesized that their formation could be facilitated by thermal fluctuations and proton tunneling, and possibly be relevant to DNA damage. Theoretical and computational studies, assuming classical nuclei, have confirmed the dynamic stability of these rare tautomers. However, by accounting for nuclear quantum effects explicitly through Car-Parrinello path integral molecular dynamics calculations, we find the tautomeric enol form to be dynamically metastable, with lifetimes too insignificant to be implicated in DNA damage.
Macroscopic modeling and simulations of supercoiled DNA with bound proteins
NASA Astrophysics Data System (ADS)
Huang, Jing; Schlick, Tamar
2002-11-01
General methods are presented for modeling and simulating DNA molecules with bound proteins on the macromolecular level. These new approaches are motivated by the need for accurate and affordable methods to simulate slow processes (on the millisecond time scale) in DNA/protein systems, such as the large-scale motions involved in the Hin-mediated inversion process. Our approaches, based on the wormlike chain model of long DNA molecules, introduce inhomogeneous potentials for DNA/protein complexes based on available atomic-level structures. Electrostatically, we treat those DNA/protein complexes as sets of effective charges, optimized by our discrete surface charge optimization package, in which the charges are distributed on an excluded-volume surface that represents the macromolecular complex. We also introduce directional bending potentials as well as a non-identical bead hydrodynamics algorithm to further mimic the inhomogeneous effects caused by protein binding. These models thus account for basic elements of protein binding effects on DNA local structure but remain computationally tractable. To validate these models and methods, we reproduce various properties measured by both Monte Carlo methods and experiments. We then apply the developed models to study the Hin-mediated inversion system in long DNA. By simulating supercoiled, circular DNA with or without bound proteins, we observe significant effects of protein binding on global conformations and long-time dynamics of the DNA on the kilobase-pair length scale.
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background: For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results: We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion: The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Structure solution of DNA-binding proteins and complexes with ARCIMBOLDO libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pröpper, Kevin; Instituto de Biologia Molecular de Barcelona; Meindl, Kathrin
2014-06-01
The structure solution of DNA-binding protein structures and complexes based on the combination of location of DNA-binding protein motif fragments with density modification in a multi-solution frame is described. Protein–DNA interactions play a major role in all aspects of genetic activity within an organism, such as transcription, packaging, rearrangement, replication and repair. The molecular detail of protein–DNA interactions can be best visualized through crystallography, and structures emphasizing insight into the principles of binding and base-sequence recognition are essential to understanding the subtleties of the underlying mechanisms. An increasing number of high-quality DNA-binding protein structure determinations have been witnessed despite the fact that the crystallographic particularities of nucleic acids tend to pose specific challenges to methods primarily developed for proteins. Crystallographic structure solution of protein–DNA complexes therefore remains a challenging area that is in need of optimized experimental and computational methods. The potential of the structure-solution program ARCIMBOLDO for the solution of protein–DNA complexes has therefore been assessed. The method is based on the combination of locating small, very accurate fragments using the program Phaser and density modification with the program SHELXE. Whereas for typical proteins main-chain α-helices provide the ideal, almost ubiquitous, small fragments to start searches, in the case of DNA complexes the binding motifs and DNA double helix constitute suitable search fragments. The aim of this work is to provide an effective library of search fragments as well as to determine the optimal ARCIMBOLDO strategy for the solution of this class of structures.
Wang, Pengfei; Gaitanaros, Stavros; Lee, Seungwoo; Bathe, Mark; Shih, William M; Ke, Yonggang
2016-06-22
Scaffolded DNA origami has proven to be a versatile method for generating functional nanostructures with prescribed sub-100 nm shapes. Programming DNA-origami tiles to form large-scale 2D lattices that span hundreds of nanometers to the micrometer scale could provide an enabling platform for diverse applications ranging from metamaterials to surface-based biophysical assays. Toward this end, here we design a family of hexagonal DNA-origami tiles using computer-aided design and demonstrate successful self-assembly of micrometer-scale 2D honeycomb lattices and tubes by controlling their geometric and mechanical properties including their interconnecting strands. Our results offer insight into programmed self-assembly of low-defect supra-molecular DNA-origami 2D lattices and tubes. In addition, we demonstrate that these DNA-origami hexagon tiles and honeycomb lattices are versatile platforms for assembling optical metamaterials via programmable spatial arrangement of gold nanoparticles (AuNPs) into cluster and superlattice geometries.
Efficient alignment-free DNA barcode analytics.
Kuksa, Pavel; Pavlovic, Vladimir
2009-11-10
In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
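A spectrum representation of the kind described above can be sketched in a few lines of Python. The choices below are assumptions for illustration and are not taken from the paper: k-mers of length 3, cosine similarity between count vectors, and a nearest-neighbour rule for assigning a species label. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def spectrum(seq, k=3):
    """Count every overlapping k-mer in the sequence (no alignment needed)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(u, v):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def classify(query, references, k=3):
    """Return the label of the reference barcode whose spectrum is most similar."""
    q = spectrum(query, k)
    return max(references, key=lambda label: cosine(q, spectrum(references[label], k)))


references = {"species_1": "ACGTACGTACGT", "species_2": "TTTTGGGGCCCC"}
label = classify("ACGTACGAACGT", references)
```

Because each sequence is reduced to a fixed-length count vector, comparing two barcodes costs time proportional to the number of distinct k-mers rather than requiring a pairwise alignment.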
NASA Astrophysics Data System (ADS)
Rajasekhar, Bathula; Bodavarapu, Navya; Sridevi, M.; Thamizhselvi, G.; RizhaNazar, K.; Padmanaban, R.; Swu, Toka
2018-03-01
The present study reports the synthesis and evaluation of the nonlinear optical properties and G-quadruplex DNA stabilization of five novel copper(II) mixed ligand complexes. They were synthesized from copper(II) salt, 2,5- and 2,3-pyridinedicarboxylic acid, diethylenetriamine and an amide-based ligand (AL). The crystal structures of these complexes were determined through X-ray diffraction and supported by ESI-MS, NMR, UV-Vis and FT-IR spectroscopic methods. Their nonlinear optical properties were studied using the Gaussian09 computer program. For structural optimization and nonlinear optical properties, the density functional theory (DFT) based B3LYP method was used with the LANL2DZ basis set for the metal ion and 6-31G* for C, H, N, O and Cl atoms. The present work reveals that pre-polarized complex-2 showed a higher β value (29.59 × 10^-30 e.s.u.) as compared to that of neutral complex-1 (β = 0.276 × 10^-30 e.s.u.), which may be due to the greater advantage of polarizability. Complex-2 is expected to be a potential material for optoelectronic and photonic technologies. Docking studies using AutoDock Vina revealed that complex-2 has higher binding energy for both G-quadruplex DNA (-8.7 kcal/mol) and duplex DNA (-10.1 kcal/mol). It was also observed that structure plays an important role in binding efficiency.
DNA MemoChip: Long-Term and High Capacity Information Storage and Select Retrieval.
Stefano, George B; Wang, Fuzhou; Kream, Richard M
2018-02-26
Over the course of history, human beings have never stopped seeking effective methods for information storage. From rocks to paper, and through the past several decades of using computer disks, USB sticks, and on to the thin silicon "chips" and "cloud" storage of today, it would seem that we have reached an era of efficiency for managing innumerable and ever-expanding data. Astonishingly, when tracing this technological path, one realizes that our ancient methods of informational storage far outlast paper (10,000 vs. 1,000 years, respectively), let alone the computer-based memory devices that only last, on average, 5 to 25 years. During this time of fast-paced information generation, it becomes increasingly difficult for current storage methods to retain such massive amounts of data, and to maintain appropriate speeds with which to retrieve it, especially when in demand by a large number of users. Others have proposed that DNA-based information storage provides a way forward for information retention as a result of its temporal stability. It is now evident that DNA represents a potentially economical and sustainable mechanism for storing information, as demonstrated by its decoding from a 700,000 year-old horse genome. The fact that the human genome is present in a cell, containing also the varied mitochondrial genome, indicates DNA's great potential for large data storage in a 'smaller' space.
The role of structural parameters in DNA cyclization
Alexandrov, Ludmil B.; Bishop, Alan R.; Rasmussen, Kim O.; ...
2016-02-04
The intrinsic bendability of DNA plays an important role in a myriad of essential cellular mechanisms. The flexibility of a DNA fragment can be experimentally and computationally examined by its propensity for cyclization, quantified by the Jacobson-Stockmayer J factor. In this paper, we use a well-established coarse-grained three-dimensional model of DNA and seven distinct sets of experimentally and computationally derived conformational parameters of the double helix to evaluate the role of structural parameters in calculating DNA cyclization.
A Hybrid Computer Simulation to Generate the DNA Distribution of a Cell Population.
ERIC Educational Resources Information Center
Griebling, John L.; Adams, William S.
1981-01-01
Described is a method of simulating the formation of a DNA distribution, on which statistical results and experimentally measured parameters from DNA distribution and percent-labeled mitosis studies are combined. An EAI-680 and DECSystem-10 Hybrid Computer configuration are used. (Author/CS)
DNA Microarray-based Ecotoxicological Biomarker Discovery in a Small Fish Model Species
This paper addresses several issues critical to use of zebrafish oligonucleotide microarrays for computational toxicology research on endocrine disrupting chemicals using small fish models, and more generally, the use of microarrays in aquatic toxicology.
Vladimirov, N V; Likhoshvaĭ, V A; Matushkin, Iu G
2007-01-01
Gene expression is known to correlate with the degree of codon bias in many unicellular organisms. However, such correlation is absent in some organisms. Recently we demonstrated that inverted complementary repeats within a coding DNA sequence must be considered for proper estimation of translation efficiency, since they may form secondary structures that obstruct ribosome movement. We have developed a program for estimating the potential expression of a coding DNA sequence in a given unicellular organism using its genome sequence. The program computes an elongation efficiency index. Computation is based on estimation of coding DNA sequence elongation efficiency, taking into account three key factors: codon bias, average number of inverted complementary repeats, and free energy of potential stem-loop structures formed by the repeats. The influence of these factors on translation is numerically estimated. An optimal proportion of these factors is computed for each organism individually. Quantitative translational characteristics of 384 unicellular organisms (351 bacteria, 28 archaea, 5 eukaryota) have been computed using their annotated genomes from NCBI GenBank. Five potential evolutionary strategies of translational optimization have been determined among the studied organisms. A considerable difference in preferred translational strategies between Bacteria and Archaea has been revealed. Significant correlations between the elongation efficiency index and gene expression levels have been shown for two organisms (S. cerevisiae and H. pylori) using available microarray data. The proposed method allows numerical estimation of coding DNA sequence translation efficiency and optimization of the nucleotide composition of heterologous genes in unicellular organisms. http://www.mgs.bionet.nsc.ru/mgs/programs/eei-calculator/.
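One of the three factors named above, the number of inverted complementary repeats, can be illustrated with a short Python sketch. It is not the program described in the abstract: it finds only perfect stems of a fixed length, it does not compute free energies, and the stem length, loop-size window and function names are assumptions chosen for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeats(seq, stem=4, min_loop=3, max_loop=8):
    """Return (i, j) pairs such that seq[i:i+stem] and seq[j:j+stem] are reverse
    complements separated by a loop of min_loop..max_loop bases, i.e. a
    potential stem-loop (perfect stems only)."""
    hits = []
    n = len(seq)
    for i in range(n - stem + 1):
        arm = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > n:
                break
            if seq[j:j + stem] == arm:
                hits.append((i, j))
    return hits


hits = inverted_repeats("GGGCAAAAGCCC")  # GGGC ... GCCC around a 4-base loop
```

Dividing `len(hits)` by the sequence length would give a crude repeat density; how the actual program weights repeats against codon bias and stem-loop free energy is not specified in the abstract.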
Promoter Sequences Prediction Using Relational Association Rule Mining
Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely
2012-01-01
In this paper we approach, from a computational perspective, the problem of promoter sequence prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still developed to approach the problem of promoter identification in DNA. We propose a classification model based on relational association rule mining. Relational association rules are a particular type of association rules and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting whether or not a DNA sequence contains a promoter region. An experimental evaluation of the proposed model and a comparison with similar existing approaches are provided. The obtained results show that our classifier outperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233
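To illustrate what a relational association rule is, the Python sketch below mines only the simplest case: binary rules of the form "attribute i < attribute j" that hold in at least a given fraction of the records. The scoring and classification step, the support threshold, the function names and the toy numeric records are assumptions for illustration; the abstract does not say how sequences are turned into numeric attributes or how the rules are used to classify.

```python
from itertools import permutations


def mine_binary_rules(records, min_support=0.9):
    """Return (i, '<', j, support) for every ordered attribute pair whose
    ordering holds in at least min_support of the records."""
    n_attr = len(records[0])
    rules = []
    for i, j in permutations(range(n_attr), 2):
        support = sum(r[i] < r[j] for r in records) / len(records)
        if support >= min_support:
            rules.append((i, "<", j, support))
    return rules


def rule_score(instance, rules):
    """Fraction of the given rules that the instance satisfies."""
    if not rules:
        return 0.0
    return sum(instance[i] < instance[j] for i, _, j, _ in rules) / len(rules)


def classify(instance, rules_by_class):
    """Pick the class whose mined rules the instance satisfies best."""
    return max(rules_by_class, key=lambda c: rule_score(instance, rules_by_class[c]))


# Toy numeric records standing in for encoded sequences (hypothetical data).
positives = [(1, 2, 3), (0, 5, 9), (2, 3, 4)]
negatives = [(3, 2, 1), (9, 5, 0), (4, 3, 1)]
rules_by_class = {
    "promoter": mine_binary_rules(positives),
    "non-promoter": mine_binary_rules(negatives),
}
predicted = classify((1, 4, 6), rules_by_class)
```

Longer rules that chain several attributes, and relations other than "<", would extend the same idea at a higher mining cost.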
NASA Technical Reports Server (NTRS)
Nakayama, S.; Kretsinger, R. H.
1993-01-01
In the first report in this series we presented dendrograms based on 152 individual proteins of the EF-hand family. In the second we used sequences from 228 proteins, containing 835 domains, and showed that eight of the 29 subfamilies are congruent and that the EF-hand domains of the remaining 21 subfamilies have diverse evolutionary histories. In this study we have computed dendrograms within and among the EF-hand subfamilies using the encoding DNA sequences. In most instances the dendrograms based on protein and on DNA sequences are very similar. Significant differences between protein and DNA trees for calmodulin remain unexplained. In our fourth report we evaluate the sequences and the distribution of introns within the EF-hand family and conclude that exon shuffling did not play a significant role in its evolution.
Exercises in molecular computing.
Stojanovic, Milan N; Stefanovic, Darko; Rudchenko, Sergei
2014-06-17
CONSPECTUS: The successes of electronic digital logic have transformed every aspect of human life over the last half-century. The word "computer" now signifies a ubiquitous electronic device, rather than a human occupation. Yet evidently humans, large assemblies of molecules, can compute, and it has been a thrilling challenge to develop smaller, simpler, synthetic assemblies of molecules that can do useful computation. When we say that molecules compute, what we usually mean is that such molecules respond to certain inputs, for example, the presence or absence of other molecules, in a precisely defined but potentially complex fashion. The simplest way for a chemist to think about computing molecules is as sensors that can integrate the presence or absence of multiple analytes into a change in a single reporting property. Here we review several forms of molecular computing developed in our laboratories. When we began our work, combinatorial approaches to using DNA for computing were used to search for solutions to constraint satisfaction problems. We chose to work instead on logic circuits, building bottom-up from units based on catalytic nucleic acids, focusing on DNA secondary structures in the design of individual circuit elements, and reserving the combinatorial opportunities of DNA for the representation of multiple signals propagating in a large circuit. Such circuit design directly corresponds to the intuition about sensors transforming the detection of analytes into reporting properties. While this approach was unusual at the time, it has been adopted since by other groups working on biomolecular computing with different nucleic acid chemistries. We created logic gates by modularly combining deoxyribozymes (DNA-based enzymes cleaving or combining other oligonucleotides), in the role of reporting elements, with stem-loops as input detection elements. 
For instance, a deoxyribozyme that normally exhibits an oligonucleotide substrate recognition region is modified such that a stem-loop closes onto the substrate recognition region, making it unavailable for the substrate and thus rendering the deoxyribozyme inactive. But a conformational change can then be induced by an input oligonucleotide, complementary to the loop, to open the stem, allow the substrate to bind, and allow its cleavage to proceed, which is eventually reported via fluorescence. In this Account, several designs of this form are reviewed, along with their application in the construction of large circuits that exhibited complex logical and temporal relationships between the inputs and the outputs. Intelligent (in the sense of being capable of nontrivial information processing) theranostic (therapy + diagnostic) applications have always been the ultimate motivation for developing computing (i.e., decision-making) circuits, and we review our experiments with logic-gate elements bound to cell surfaces that evaluate the proximal presence of multiple markers on lymphocytes.
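The stem-loop gating described in this abstract can be pictured as a small Boolean model. The sketch below is only an illustration under stated assumptions: the class `StemLoopGate`, the input names `i1` and `i2`, and the two-loop AND arrangement are hypothetical and are not taken from the Account, which describes the single stem-loop case in detail.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class StemLoopGate:
    """Toy model of a deoxyribozyme blocked by one or more stem-loops."""
    loops: FrozenSet[str]  # input oligonucleotide that opens each stem-loop

    def is_active(self, inputs_present: Set[str]) -> bool:
        # The substrate can bind only when every stem-loop has been opened
        # by its complementary input oligonucleotide.
        return self.loops <= inputs_present

    def fluorescence(self, inputs_present: Set[str]) -> int:
        # Cleavage of the substrate is reported as a fluorescent signal (1) or not (0).
        return 1 if self.is_active(inputs_present) else 0


# One stem-loop: the gate reports the presence of a single input (a sensor).
yes_gate = StemLoopGate(frozenset({"i1"}))
# Two stem-loops (assumed arrangement): both inputs are needed, i.e. a logical AND.
and_gate = StemLoopGate(frozenset({"i1", "i2"}))

for present in (set(), {"i1"}, {"i1", "i2"}):
    print(sorted(present), yes_gate.fluorescence(present), and_gate.fluorescence(present))
```

The model ignores kinetics and partial activation; it only captures the input-to-output truth table.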
NASA Technical Reports Server (NTRS)
Vecchioni, Simon; Toomey, Emily; Capece, Mark C.; Rothschild, Lynn; Wind, Shalom
2017-01-01
DNA is an ideal template for a biological nanowire: it has a linear structure several atoms thick; it possesses addressable nucleobase geometry that can be precisely defined; and it is massively scalable into branched networks. Until now, the drawback of DNA as a conducting nanowire has been, simply put, its low conductance. To address this deficiency, we extensively characterize a chemical variant of canonical DNA that exploits the affinity of natural cytosine bases for silver ions. We successfully construct chains of single silver ions inside double-stranded DNA, confirm the basic dC-Ag+-dC bond geometry and kinetics, and show length-tunability dependent on mismatch distribution, ion availability and enzyme activity. An analysis of the absorbance spectra of natural DNA and silver-binding, poly-cytosine DNA demonstrates the heightened thermostability of the ion chain and its resistance to aqueous stresses such as precipitation, dialysis and forced reduction. These chemically critical traits lend themselves to an increase in electrical conductivity of over an order of magnitude for 11-base silver-paired duplexes over natural strands when assayed by STM break junction. We further construct and implement a genetic pathway in the E. coli bacterium for the biosynthesis of highly ionizable DNA sequences. Toward future circuits, we construct a model of transcription network architectures to determine the most efficient and robust connectivity for cell-based fabrication, and we perform sequence optimization with a genetic algorithm to identify oligonucleotides robust to changes in the base-pairing energy landscape. We propose that this system will serve as a synthetic biological fabrication platform for more complex DNA nanotechnology and nanoelectronics with applications to deep space and low resource environments.
NASA Technical Reports Server (NTRS)
Hu, Shaowen; Cucinotta, Francis A.
2009-01-01
The Ku70/80 heterodimer is the first repair protein in the initial binding of double-strand break (DSB) ends following DNA damage, and is a component of nonhomologous end joining repair, the primary pathway for DSB repair in mammalian cells. In this study we constructed a full-length human Ku70 structure based on its crystal structure, and performed 20 ns conventional molecular dynamic (CMD) simulations on this protein and several other complexes with short DNA duplexes of different sequences. The trajectories of these simulations indicated that, without the topological support of Ku80, the residues in the bridge and C-terminal arm of Ku70 are more flexible than other experimentally identified domains. We studied the two missing loops in the crystal structure and predicted that they are also very flexible. Simulations revealed that they make an important contribution to the Ku70 interaction with DNA. Dislocation of the previously studied SAP domain was observed in several systems, implying its role in DNA binding. Targeted molecular dynamic (TMD) simulation was also performed for one system with a far-away 14bp DNA duplex. The TMD trajectory and energetic analysis disclosed detailed interactions of the DNA-binding residues during the DNA dislocation, and revealed a possible conformational transition for a DSB end when encountering Ku70 in solution. Compared to experimentally based analysis, this study identified more detailed interactions between DNA and Ku70. Free energy analysis indicated Ku70 alone is able to bind DNA with relatively high affinity, with consistent contributions from various domains of Ku70 in different systems. The functional implications of these domains in the processes of Ku heterodimerization and DNA damage recognition and repair can be characterized in detail based upon this analysis.
Engineering bacteria to solve the Burnt Pancake Problem
Haynes, Karmella A; Broderick, Marian L; Brown, Adam D; Butner, Trevor L; Dickson, James O; Harden, W Lance; Heard, Lane H; Jessen, Eric L; Malloy, Kelly J; Ogden, Brad J; Rosemond, Sabriya; Simpson, Samantha; Zwack, Erin; Campbell, A Malcolm; Eckdahl, Todd T; Heyer, Laurie J; Poet, Jeffrey L
2008-01-01
Background We investigated the possibility of executing DNA-based computation in living cells by engineering Escherichia coli to address a classic mathematical puzzle called the Burnt Pancake Problem (BPP). The BPP is solved by sorting a stack of distinct objects (pancakes) into proper order and orientation using the minimum number of manipulations. Each manipulation reverses the order and orientation of one or more adjacent objects in the stack. We have designed a system that uses site-specific DNA recombination to mediate inversions of genetic elements that represent pancakes within plasmid DNA. Results Inversions (or "flips") of the DNA fragment pancakes are driven by the Salmonella typhimurium Hin/hix DNA recombinase system that we reconstituted as a collection of modular genetic elements for use in E. coli. Our system sorts DNA segments by inversions to produce different permutations of a promoter and a tetracycline resistance coding region; E. coli cells become antibiotic resistant when the segments are properly sorted. Hin recombinase can mediate all possible inversion operations on adjacent flippable DNA fragments. Mathematical modeling predicts that the system reaches equilibrium after very few flips, where equal numbers of permutations are randomly sorted and unsorted. Semiquantitative PCR analysis of in vivo flipping suggests that inversion products accumulate on a time scale of hours or days rather than minutes. Conclusion The Hin/hix system is a proof-of-concept demonstration of in vivo computation with the potential to be scaled up to accommodate larger and more challenging problems. Hin/hix may provide a flexible new tool for manipulating transgenic DNA in vivo. PMID:18492232
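The sorting task in this abstract can be made concrete with a short search over signed permutations. This is a minimal sketch, assuming that each pancake is a signed integer (the sign is the orientation) and that any contiguous segment may be inverted, as the abstract states for Hin recombinase. The function names `flip` and `min_flips` are illustrative and are not from the paper.

```python
from collections import deque
from typing import Tuple

Stack = Tuple[int, ...]


def flip(stack: Stack, i: int, j: int) -> Stack:
    """Invert the segment stack[i:j]: reverse its order and negate each orientation."""
    segment = tuple(-x for x in reversed(stack[i:j]))
    return stack[:i] + segment + stack[j:]


def min_flips(start: Stack) -> int:
    """Breadth-first search for the fewest inversions that sort the stack."""
    n = len(start)
    goal = tuple(range(1, n + 1))  # sorted order, all orientations positive
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if state == goal:
            return dist
        for i in range(n):
            for j in range(i + 1, n + 1):
                nxt = flip(state, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("goal not reachable")


# Two elements, as with the promoter and coding region in the abstract.
print(min_flips((-2, -1)))  # one inversion of the whole stack sorts it
print(min_flips((2, 1)))
```

Breadth-first search is practical only for small stacks, since the number of signed permutations of n elements is 2^n · n!.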
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Artificial intelligence in hematology.
Zini, Gina
2005-10-01
Artificial intelligence (AI) is a computer based science which aims to simulate human brain faculties using a computational system. A brief history of this new science goes from the creation of the first artificial neuron in 1943 to the first artificial neural network application to genetic algorithms. The potential for a similar technology in medicine has immediately been identified by scientists and researchers. The possibility to store and process all medical knowledge has made this technology very attractive to assist or even surpass clinicians in reaching a diagnosis. Applications of AI in medicine include devices applied to clinical diagnosis in neurology and cardiopulmonary diseases, as well as the use of expert or knowledge-based systems in routine clinical use for diagnosis, therapeutic management and for prognostic evaluation. Biological applications include genome sequencing or DNA gene expression microarrays, modeling gene networks, analysis and clustering of gene expression data, pattern recognition in DNA and proteins, protein structure prediction. In the field of hematology the first devices based on AI have been applied to the routine laboratory data management. New tools concern the differential diagnosis in specific diseases such as anemias, thalassemias and leukemias, based on neural networks trained with data from peripheral blood analysis. A revolution in cancer diagnosis, including the diagnosis of hematological malignancies, has been the introduction of the first microarray based and bioinformatic approach for molecular diagnosis: a systematic approach based on the monitoring of simultaneous expression of thousands of genes using DNA microarray, independently of previous biological knowledge, analysed using AI devices. Using gene profiling, the traditional diagnostic pathways move from clinical to molecular based diagnostic systems.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.
Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir
2013-01-01
DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive when performed on traditional platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands out as the best candidate because of its performance per dollar spent and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimizations over the previous FPGA implementation that address both the overall speed-up achieved and the price incurred by the platform that was optimized. The optimizations are: (1) the array of processing elements runs on a change in input value rather than on the clock, eliminating the need for tight clock synchronization; (2) the implementation is not restricted by the size of the sequences to be aligned; (3) the waiting time required for the sequences to load onto the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix, making it possible to save the diagonal elements for use in the next pass in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this design achieved a 20-fold performance improvement in terms of CUPS (cell updates per second) over a GPP (general-purpose processor) implementation.
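To make the output matrix and the CUPS metric concrete, here is a minimal software sketch. It assumes a Smith-Waterman-style local alignment with a linear gap penalty. The abstract does not name the algorithm or the scoring scheme, so the function names and the values of `match`, `mismatch` and `gap` are illustrative only.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score, keeping only two rows of the score matrix."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Each cell depends on its upper, left and upper-left neighbours,
            # so cells on the same anti-diagonal are independent of each other.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def cups(len_a: int, len_b: int, seconds: float) -> float:
    """Cell updates per second for one len_a x len_b matrix fill."""
    return len_a * len_b / seconds


print(local_alignment_score("GATTACA", "TTAC"))
print(cups(100, 200, 0.5))
```

The independence of cells along an anti-diagonal is what lets an array of processing elements fill the matrix in parallel. The sketch itself runs sequentially and says nothing about the hardware design in the paper.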
BarraCUDA - a fast short read sequence aligner using graphics processing units
2012-01-01
Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers an order-of-magnitude performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net. PMID:22244497
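BWA, on which BarraCUDA is based, builds on the Burrows-Wheeler transform. As a rough illustration of that idea, the sketch below counts exact occurrences of a read in a reference by backward search. It is a naive CPU version for short strings only and is not the BarraCUDA or BWA code. The function names are illustrative, and inexact matching and GPU parallelism are left out.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (fine for short strings only)."""
    text += "$"  # sentinel that sorts before A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(text: str, pattern: str) -> int:
    """Count exact matches of pattern in text using backward search on the BWT."""
    last = bwt(text)
    counts = {}
    for ch in last:
        counts[ch] = counts.get(ch, 0) + 1
    # first_row[c]: number of characters in the text that sort before c
    first_row, total = {}, 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]

    def occ(ch: str, k: int) -> int:
        # Occurrences of ch in the first k characters of the BWT.
        return last[:k].count(ch)

    lo, hi = 0, len(last)  # half-open interval of matching rotations
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        lo = first_row[ch] + occ(ch, lo)
        hi = first_row[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


print(bwt("banana"))
print(count_occurrences("ACGTACGT", "ACG"))
```

Production aligners precompute the occurrence counts in sampled tables rather than rescanning the string. The rescan is kept here only for brevity.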
Quasiparticle properties of DNA bases from GW calculations in a Wannier basis
NASA Astrophysics Data System (ADS)
Qian, Xiaofeng; Marzari, Nicola; Umari, Paolo
2009-03-01
The quasiparticle GW-Wannier (GWW) approach [1] has been recently developed to overcome the size limitations of conventional planewave GW calculations. By taking advantage of the localization properties of the maximally-localized Wannier functions and choosing a small polarization basis set, we reduce the number of Bloch wavefunction products required for the evaluation of dynamical polarizabilities, and in turn greatly reduce memory requirements and computational cost. We apply GWW to study quasiparticle properties of different DNA bases and base-pairs, and solvation effects on the energy gap, demonstrating in the process the key advantages of this approach. [1] P. Umari, G. Stenuit, and S. Baroni, cond-mat/0811.1453
Comprehensive restriction enzyme lists to update any DNA sequence computer program.
Raschke, E
1993-04-01
Restriction enzyme lists are presented for the practical working geneticist to update any DNA computer program. These lists combine formerly scattered information and contain all presently known restriction enzymes with a unique recognition sequence, a cut site, or methylation (in)sensitivity. The lists are in the shortest possible form to also be functional with small DNA computer programs, and will produce clear restriction maps without any redundancy or loss of information. The lists discern between commercial and noncommercial enzymes, and prototype enzymes and different isoschizomers are cross-referenced. Differences in general methylation sensitivities and (in)sensitivities against Dam and Dcm methylases of Escherichia coli are indicated. Commercial methylases and intron-encoded endonucleases are included. An address list is presented to contact commercial suppliers. The lists are constantly updated and available in electronic form as pure US ASCII files, and in formats for the DNA computer programs DNA-Strider for Apple Macintosh, and DNAsis for IBM personal computers or compatibles via e-mail from the internet address: NETSERV@EMBL-HEIDELBERG.DE by sending only the message HELP RELIBRARY.
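As an illustration of how a DNA program might consume such a list, the sketch below maps cut positions on a linear sequence. The three entries use the well-known recognition sequences of EcoRI, BamHI and HindIII. The table layout and the function names are assumptions made for this example and do not reflect the file formats described in the abstract.

```python
from typing import Dict, List, Tuple

# enzyme name -> (recognition sequence, cut offset within the site on the top strand)
ENZYMES: Dict[str, Tuple[str, int]] = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def find_cut_sites(seq: str, site: str, offset: int) -> List[int]:
    """Return the index of the first base after each cut (overlapping sites included)."""
    cuts, start = [], seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts


def restriction_map(seq: str) -> Dict[str, List[int]]:
    """Cut positions for every enzyme in the table, top strand of a linear sequence."""
    seq = seq.upper()
    return {name: find_cut_sites(seq, site, offset)
            for name, (site, offset) in ENZYMES.items()}


print(restriction_map("aagaattcttggatcc"))
```

Degenerate recognition sequences, circular molecules and methylation sensitivity, all of which the lists cover, are outside this sketch.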
Saito, Samuel; Silva, Givaldo; Santos, Regineide Xavier; Gosmann, Grace; Pungartnik, Cristina; Brendel, Martin
2012-01-01
Reverse phase-solid phase extraction from Cassia alata leaves (CaRP) was used to obtain a refined extract. Higher than wild-type sensitivity to CaRP was exhibited by 16 haploid Saccharomyces cerevisiae mutants with defects in DNA repair and membrane transport. CaRP had a strong DPPH free radical scavenging activity with an IC50 value of 2.27 μg mL−1 and showed no pro-oxidant activity in yeast. CaRP compounds were separated by HPLC and the three major components were shown to bind to DNA in vitro. The major HPLC peak was identified as kaempferol-3-O-β-d-glucoside (astragalin), which showed high affinity to DNA as seen by HPLC-UV measurement after using centrifugal ultrafiltration of astragalin-DNA mixtures. Astragalin-DNA interaction was further studied by spectroscopic methods and its interaction with DNA was evaluated using solid-state FTIR. These and computational (in silico) docking studies revealed that astragalin-DNA binding occurs through interaction with G-C base pairs, possibly by intercalation stabilized by H-bond formation. PMID:22489129
An experimental study of the putative mechanism of a synthetic autonomous rotary DNA nanomotor
NASA Astrophysics Data System (ADS)
Dunn, K. E.; Leake, M. C.; Wollman, A. J. M.; Trefzer, M. A.; Johnson, S.; Tyrrell, A. M.
2017-03-01
DNA has been used to construct a wide variety of nanoscale molecular devices. Inspiration for such synthetic molecular machines is frequently drawn from protein motors, which are naturally occurring and ubiquitous. However, despite the fact that rotary motors such as ATP synthase and the bacterial flagellar motor play extremely important roles in nature, very few rotary devices have been constructed using DNA. This paper describes an experimental study of the putative mechanism of a rotary DNA nanomotor, which is based on strand displacement, the phenomenon that powers many synthetic linear DNA motors. Unlike other examples of rotary DNA machines, the device described here is designed to be capable of autonomous operation after it is triggered. The experimental results are consistent with operation of the motor as expected, and future work on an enhanced motor design may allow rotation to be observed at the single-molecule level. The rotary motor concept presented here has potential applications in molecular processing, DNA computing, biosensing and photonics.
DNA packaging in viral capsids with peptide arms.
Cao, Qianqian; Bachmann, Michael
2017-01-18
Strong chain rigidity and electrostatic self-repulsion of packed double-stranded DNA in viruses require a molecular motor to pull the DNA into the capsid. However, what is the role of electrostatic interactions between different charged components in the packaging process? Though various theories and computer simulation models were developed for the understanding of viral assembly and packaging dynamics of the genome, long-range electrostatic interactions and capsid structure have typically been neglected or oversimplified. By means of molecular dynamics simulations, we explore the effects of electrostatic interactions on the packaging dynamics of DNA based on a coarse-grained DNA and capsid model by explicitly including peptide arms (PAs), linked to the inner surface of the capsid, and counterions. Our results indicate that the electrostatic interactions between PAs, DNA, and counterions have a significant influence on the packaging dynamics. We also find that the packed DNA conformations are largely affected by the structure of the PA layer, but the packaging rate is insensitive to the layer structure.
Khara, Dinesh C; Berger, Yaron; Ouldridge, Thomas E
2018-01-01
We present a detailed coarse-grained computer simulation and single molecule fluorescence study of the walking dynamics and mechanism of a DNA bipedal motor striding on a DNA origami. In particular, we study the dependency of the walking efficiency and stepping kinetics on step size. The simulations accurately capture and explain three different experimental observations. These include a description of the maximum possible step size, a decrease in the walking efficiency over short distances and a dependency of the efficiency on the walking direction with respect to the origami track. The former two observations were not expected and are non-trivial. Based on this study, we suggest three design modifications to improve future DNA walkers. Our study demonstrates the ability of the oxDNA model to resolve the dynamics of complex DNA machines, and its usefulness as an engineering tool for the design of DNA machines that operate in the three spatial dimensions. PMID:29294083
Electronic Transport in Single-Stranded DNA Molecule Related to Huntington's Disease
NASA Astrophysics Data System (ADS)
Sarmento, R. G.; Silva, R. N. O.; Madeira, M. P.; Frazão, N. F.; Sousa, J. O.; Macedo-Filho, A.
2018-04-01
We report a numerical analysis of the electronic transport in single-chain DNA molecules consisting of 182 nucleotides. The DNA chains studied were extracted from a segment of human chromosome 4p16.3 and modified by expansion of CAG (cytosine-adenine-guanine) triplet repeats to mimic Huntington's disease. The mutated DNA chains were connected between two platinum electrodes to analyze the relationship between charge propagation in the molecule and Huntington's disease. The computations were performed within a tight-binding model, together with a transfer matrix technique, to investigate the current-voltage (I-V) characteristics of 23 types of DNA sequence and compare them with the distributions of the CAG repeat numbers associated with the disease. All DNA sequences studied show the characteristic behavior of a semiconductor. In addition, the results showed a direct correlation between the current-voltage curves and the distributions of the CAG repeat numbers, suggesting possible applications in the development of DNA-based biosensors for molecular diagnostics.
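The transfer matrix technique mentioned above can be sketched for a one-dimensional tight-binding chain with a uniform hopping term. The code assumes the standard recursion t·ψ(n+1) + t·ψ(n-1) + ε(n)·ψ(n) = E·ψ(n). The on-site energies passed in are arbitrary illustrative numbers, not nucleotide parameters from the paper, and the electrode coupling and the I-V calculation are not modelled.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul2(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def site_matrix(energy: float, onsite: float, hopping: float = 1.0) -> Matrix:
    """Maps (psi[n], psi[n-1]) to (psi[n+1], psi[n]) for one site."""
    return [[(energy - onsite) / hopping, -1.0], [1.0, 0.0]]


def chain_matrix(energy: float, onsites: Sequence[float], hopping: float = 1.0) -> Matrix:
    """Ordered product of the site matrices along the chain."""
    total: Matrix = [[1.0, 0.0], [0.0, 1.0]]
    for eps in onsites:
        total = matmul2(site_matrix(energy, eps, hopping), total)
    return total


# Uniform chain at the band centre: four sites give back the identity matrix.
print(chain_matrix(0.0, [0.0, 0.0, 0.0, 0.0]))
```

Each site matrix has unit determinant, so the product does as well. This gives a quick sanity check when on-site energies vary along a sequence.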
Kawano, Tomonori
2013-01-01
A wide variety of approaches have treated pieces of DNA as “unplugged” tools for digital information storage and processing, including a series of studies in security-related areas such as DNA-based digital barcodes, watermarks and cryptography. In the present article, novel designs of artificial genes are proposed as media for storing digitally compressed image data for bio-computing purposes, whereas natural genes principally encode proteins. Furthermore, the proposed system allows cryptographic application of DNA through biochemically editable designs with capacity for steganographic embedding of numeric data. As a model application of the image-coding DNA technique, combined numerical and biochemical protocols are employed for ciphering given “passwords” and/or secret numbers using DNA sequences. The “passwords” of interest were decomposed into single letters and translated into font images coded on separate DNA chains. Each chain has coding regions, in which the images are encoded based on the novel run-length encoding rule, and non-coding regions designed for the biochemical editing and remodeling processes that reveal the hidden orientation of the letters composing the original “passwords.” The latter processes require molecular biological tools for digestion and ligation of the fragmented DNA molecules, targeting the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographic overwriting of numeric data of interest onto the image-coding DNA are also discussed. PMID:23750303
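For readers unfamiliar with run-length encoding, the sketch below shows the generic technique on one row of a black-and-white font image. It is not the paper's novel encoding rule, which the abstract does not specify, and the mapping of runs onto nucleotides is left out. The function names are illustrative.

```python
from itertools import groupby
from typing import List, Tuple


def run_length_encode(bits: str) -> List[Tuple[str, int]]:
    """Collapse each run of identical symbols into a (symbol, run length) pair."""
    return [(symbol, len(list(group))) for symbol, group in groupby(bits)]


def run_length_decode(runs: List[Tuple[str, int]]) -> str:
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in runs)


row = "0001100"  # one pixel row of a hypothetical font image
encoded = run_length_encode(row)
print(encoded)
print(run_length_decode(encoded) == row)
```

Run-length encoding pays off when images contain long uniform runs, as simple font bitmaps usually do.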
Spring-Connell, Alexander M.; Evich, Marina G.; Debelak, Harald; Seela, Frank; Germann, Markus W.
2016-01-01
A truly universal nucleobase enables a host of novel applications such as simplified templates for PCR primers, randomized sequencing and DNA-based devices. A universal base must pair indiscriminately with each of the canonical bases with little or preferably no destabilization of the overall duplex. In reality, many candidates either destabilize the duplex or do not base pair indiscriminately. The novel base 8-aza-7-deazaadenine (pyrazolo[3,4-d]pyrimidin-4-amine) N8-(2′-deoxyribonucleoside), a deoxyadenosine analog (UB), pairs with each of the natural DNA bases with little sequence preference. We have utilized NMR complemented with molecular dynamic calculations to characterize the structure and dynamics of a UB incorporated into a DNA duplex. The UB participates in base stacking with little to no perturbation of the local structure yet forms an unusual base pair that samples multiple conformations. These local dynamics result in the complete disappearance of a single UB proton resonance under native conditions. Accommodation of the UB is additionally stabilized via heightened backbone conformational sampling. NMR combined with various computational techniques has allowed for a comprehensive characterization of both structural and dynamic effects of the UB in a DNA duplex and underlines the UB as a strong candidate for universal base applications. PMID:27566150
Georgieva, Milena; Zagorchev, Plamen; Miloshev, George
2015-10-01
The Comet assay is an invaluable tool in DNA research. It is widely used to detect DNA damage as an indicator of exposure to genotoxic stress. A canonical set of parameters and specialized software programs exist for Comet assay data quantification and analysis. None of them so far has employed a computer-based algorithm to assess the shape of the comet as an indicator of the exact mechanism by which the studied genotoxins cut the DNA molecule. Here, we present 14 unique measurements of the comet image based on the comet morphology. Their mathematical derivation and statistical analysis allowed precise description of the shape of the comet image, which in turn discriminated the cause of genotoxic stress. This algorithm led to the development of the "CometShape" software, which allowed easy discrimination among different genotoxins depending on the type of DNA damage they induce. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Fluctuations in the DNA double helix
NASA Astrophysics Data System (ADS)
Peyrard, M.; López, S. C.; Angelov, D.
2007-08-01
DNA is not the static entity suggested by the famous double helix structure. It shows large fluctuational openings, in which the bases, which contain the genetic code, are temporarily open. Therefore it is an interesting system to study the effect of nonlinearity on the physical properties of a system. A simple model for DNA, at a mesoscopic scale, can be investigated by computer simulation, in the same spirit as the original work of Fermi, Pasta and Ulam. These calculations raise fundamental questions in statistical physics because they show a temporary breaking of equipartition of energy, regions with large amplitude fluctuations being able to coexist with regions where the fluctuations are very small, even when the model is studied in the canonical ensemble. This phenomenon can be related to nonlinear excitations in the model. The ability of the model to describe the actual properties of DNA is discussed by comparing theoretical and experimental results for the probability that base pairs open at a given temperature in specific DNA sequences. These studies give us indications on the proper description of the effect of the sequence in the mesoscopic model.
Nucleotide exchange and excision technology DNA shuffling and directed evolution.
Speck, Janina; Stebel, Sabine C; Arndt, Katja M; Müller, Kristian M
2011-01-01
Remarkable success in optimizing complex properties within DNA and proteins has been achieved by directed evolution. In contrast to various random mutagenesis methods and high-throughput selection methods, the number of available DNA shuffling procedures is limited, and protocols are often difficult to adjust. The strength of the nucleotide exchange and excision technology (NExT) DNA shuffling described here is the robust, efficient, and easily controllable DNA fragmentation step based on random incorporation of so-called 'exchange nucleotides' by PCR. The exchange nucleotides are removed enzymatically, followed by chemical cleavage of the DNA backbone. The oligonucleotide pool is reassembled into full-length genes by internal primer extension, and the recombined gene library is amplified by standard PCR. The technique has been demonstrated by shuffling a defined gene library of chloramphenicol acetyltransferase variants using uridine as the fragmentation-defining exchange nucleotide. Substituting 33% of the dTTP with dUTP in the incorporation PCR resulted in shuffled clones with an average parental fragment size of 86 bases and revealed a mutation rate of only 0.1%. Additionally, a computer program (NExTProg) has been developed that predicts the fragment size distribution depending on the relative amount of the exchange nucleotide.
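The link between the amount of exchange nucleotide and fragment size can be illustrated with a toy simulation. This is not NExTProg. It assumes, purely for illustration, that each T on one strand is independently replaced with a probability equal to the substituted fraction and that the strand is cut at every replaced position. Real incorporation depends on the polymerase, and the sketch is not calibrated to the figures reported in the abstract.

```python
import random
from typing import List


def cut_positions(seq: str, fraction: float, rng: random.Random) -> List[int]:
    """Indices of T positions replaced by the exchange nucleotide (assumed independent)."""
    return [i for i, base in enumerate(seq) if base == "T" and rng.random() < fraction]


def fragment_lengths(seq: str, cuts: List[int]) -> List[int]:
    """Lengths of the pieces left after removing the nucleotide at each cut index."""
    lengths, start = [], 0
    for cut in cuts:
        if cut > start:
            lengths.append(cut - start)
        start = cut + 1
    if start < len(seq):
        lengths.append(len(seq) - start)
    return lengths


def expected_cuts(seq: str, fraction: float) -> float:
    """Expected number of cleavage sites on this strand under the toy model."""
    return fraction * seq.count("T")


rng = random.Random(0)  # fixed seed so the run is repeatable
gene = "".join(rng.choice("ACGT") for _ in range(600))  # random stand-in sequence
lengths = fragment_lengths(gene, cut_positions(gene, 0.33, rng))
print(len(lengths), sum(lengths) / len(lengths), expected_cuts(gene, 0.33))
```

Raising the substituted fraction increases the expected number of cuts and shortens the fragments, which is the qualitative dependence the abstract describes.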
Computational design of co-assembling protein-DNA nanowires
NASA Astrophysics Data System (ADS)
Mou, Yun; Yu, Jiun-Yann; Wannier, Timothy M.; Guo, Chin-Lin; Mayo, Stephen L.
2015-09-01
Biomolecular self-assemblies are of great interest to nanotechnologists because of their functional versatility and their biocompatibility. Over the past decade, sophisticated single-component nanostructures composed exclusively of nucleic acids, peptides and proteins have been reported, and these nanostructures have been used in a wide range of applications, from drug delivery to molecular computing. Despite these successes, the development of hybrid co-assemblies of nucleic acids and proteins has remained elusive. Here we use computational protein design to create a protein-DNA co-assembling nanomaterial whose assembly is driven via non-covalent interactions. To achieve this, a homodimerization interface is engineered onto the Drosophila Engrailed homeodomain (ENH), allowing the dimerized protein complex to bind to two double-stranded DNA (dsDNA) molecules. By varying the arrangement of protein-binding sites on the dsDNA, an irregular bulk nanoparticle or a nanowire with single-molecule width can be spontaneously formed by mixing the protein and dsDNA building blocks. We characterize the protein-DNA nanowire using fluorescence microscopy, atomic force microscopy and X-ray crystallography, confirming that the nanowire is formed via the proposed mechanism. This work lays the foundation for the development of new classes of protein-DNA hybrid materials. Further applications can be explored by incorporating DNA origami, DNA aptamers and/or peptide epitopes into the protein-DNA framework presented here.
Exercises in Molecular Computing
2014-01-01
Conspectus The successes of electronic digital logic have transformed every aspect of human life over the last half-century. The word “computer” now signifies a ubiquitous electronic device, rather than a human occupation. Yet evidently humans, large assemblies of molecules, can compute, and it has been a thrilling challenge to develop smaller, simpler, synthetic assemblies of molecules that can do useful computation. When we say that molecules compute, what we usually mean is that such molecules respond to certain inputs, for example, the presence or absence of other molecules, in a precisely defined but potentially complex fashion. The simplest way for a chemist to think about computing molecules is as sensors that can integrate the presence or absence of multiple analytes into a change in a single reporting property. Here we review several forms of molecular computing developed in our laboratories. When we began our work, combinatorial approaches to using DNA for computing were used to search for solutions to constraint satisfaction problems. We chose to work instead on logic circuits, building bottom-up from units based on catalytic nucleic acids, focusing on DNA secondary structures in the design of individual circuit elements, and reserving the combinatorial opportunities of DNA for the representation of multiple signals propagating in a large circuit. Such circuit design directly corresponds to the intuition about sensors transforming the detection of analytes into reporting properties. While this approach was unusual at the time, it has been adopted since by other groups working on biomolecular computing with different nucleic acid chemistries. We created logic gates by modularly combining deoxyribozymes (DNA-based enzymes cleaving or combining other oligonucleotides), in the role of reporting elements, with stem–loops as input detection elements. 
For instance, a deoxyribozyme that normally exhibits an oligonucleotide substrate recognition region is modified such that a stem–loop closes onto the substrate recognition region, making it unavailable for the substrate and thus rendering the deoxyribozyme inactive. But a conformational change can then be induced by an input oligonucleotide, complementary to the loop, to open the stem, allow the substrate to bind, and allow its cleavage to proceed, which is eventually reported via fluorescence. In this Account, several designs of this form are reviewed, along with their application in the construction of large circuits that exhibited complex logical and temporal relationships between the inputs and the outputs. Intelligent (in the sense of being capable of nontrivial information processing) theranostic (therapy + diagnostic) applications have always been the ultimate motivation for developing computing (i.e., decision-making) circuits, and we review our experiments with logic-gate elements bound to cell surfaces that evaluate the proximal presence of multiple markers on lymphocytes. PMID:24873234
Lin, Xiaodong; Deng, Jiankang; Lyu, Yanlong; Qian, Pengcheng; Li, Yunfei
2018-01-01
The integration of multiple DNA logic gates on a universal platform to implement advanced logic functions is a critical challenge for DNA computing. Herein, a straightforward and powerful strategy in which a guanine-rich DNA sequence lights up a silver nanocluster and fluorophore was developed to construct a library of logic gates on a simple DNA-templated silver nanoclusters (DNA-AgNCs) platform. This library included basic logic gates, YES, AND, OR, INHIBIT, and XOR, which were further integrated into complex logic circuits to implement diverse advanced arithmetic/non-arithmetic functions including half-adder, half-subtractor, multiplexer, and demultiplexer. Under UV irradiation, all the logic functions could be instantly visualized, confirming an excellent repeatability. The logic operations were entirely based on DNA hybridization in an enzyme-free and label-free condition, avoiding waste accumulation and reducing cost. Interestingly, a DNA-AgNCs-based multiplexer was, for the first time, used as an intelligent biosensor to identify pathogenic genes, E. coli and S. aureus genes, with a high sensitivity. The investigation provides a prototype for the wireless integration of multiple devices on even the simplest single-strand DNA platform to perform diverse complex functions in a straightforward and cost-effective way. PMID:29675221
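How the basic gates compose into the arithmetic circuits named above can be shown with a purely Boolean sketch. It models only the logic, not the DNA-AgNC chemistry; the function names are illustrative, and inputs are assumed to be 0 (strand absent) or 1 (strand present).

```python
def xor_gate(a, b):
    """XOR: output is 1 when exactly one input is present."""
    return a ^ b

def and_gate(a, b):
    """AND: output is 1 only when both inputs are present."""
    return a & b

def inhibit_gate(a, b):
    """INHIBIT: output is 1 when b is present and a is absent."""
    return (1 - a) & b

def half_adder(a, b):
    """Return (sum, carry) for a + b using XOR and AND in parallel."""
    return xor_gate(a, b), and_gate(a, b)

def half_subtractor(a, b):
    """Return (difference, borrow) for a - b using XOR and INHIBIT."""
    return xor_gate(a, b), inhibit_gate(a, b)
```

In both circuits the two gates read the same pair of inputs and report on separate output channels, which is why a library of single gates is enough to build them.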
DMINDA: an integrated web server for DNA motif identification and analyses.
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-07-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Bertucci, Alessandro; Manicardi, Alex; Candiani, Alessandro; Giannetti, Sara; Cucinotta, Annamaria; Spoto, Giuseppe; Konstantaki, Maria; Pissadakis, Stavros; Selleri, Stefano; Corradini, Roberto
2015-01-15
Microstructured optical fibers containing microchannels and an inscribed Bragg grating were internally functionalized with a peptide nucleic acid (PNA) probe specific for a gene tract of the genetically modified Roundup Ready soy. These fibers were used as an optofluidic device for the detection of DNA by measuring the shift in the wavelength of the reflected IR light. Enhancement of optical read-out was obtained using streptavidin coated gold-nanoparticles interacting with the genomic DNA captured in the fiber channels (0%, 0.1%, 1% and 10% RR-Soy), enabling statistically significant, label-free, and amplification-free detection of target DNA in low concentrations, low percentages, and very low sample volumes. Computer simulations of the fiber optics based on the finite element method (FEM) were consistent with the formation of a layer of organic material with an average thickness of 39 nm for the highest percentage (10% RR soy) analysed. Copyright © 2014 Elsevier B.V. All rights reserved.
Investigating the dynamics of surface-immobilized DNA nanomachines
Dunn, Katherine E.; Trefzer, Martin A.; Johnson, Steven; Tyrrell, Andy M.
2016-01-01
Surface-immobilization of molecules can have a profound influence on their structure, function and dynamics. Toehold-mediated strand displacement is often used in solution to drive synthetic nanomachines made from DNA, but the effects of surface-immobilization on the mechanism and kinetics of this reaction have not yet been fully elucidated. Here we show that the kinetics of strand displacement in surface-immobilized nanomachines are significantly different to those of the solution phase reaction, and we attribute this to the effects of intermolecular interactions within the DNA layer. We demonstrate that the dynamics of strand displacement can be manipulated by changing strand length, concentration and G/C content. By inserting mismatched bases it is also possible to tune the rates of the constituent displacement processes (toehold-binding and branch migration) independently, and information can be encoded in the time-dependence of the overall reaction. Our findings will facilitate the rational design of surface-immobilized dynamic DNA nanomachines, including computing devices and track-based motors. PMID:27387252
Entropic fluctuations in DNA sequences
NASA Astrophysics Data System (ADS)
Thanos, Dimitrios; Li, Wentian; Provata, Astero
2018-03-01
The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE makes it possible to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
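A block-wise entropy profile of this kind can be sketched in a few lines. The version below is an assumption about the simplest possible variant: it uses single-base frequencies within consecutive, non-overlapping blocks, whereas the abstract does not specify which symbols (single bases or longer words) the authors count. The names `block_entropy` and `local_shannon_entropy` are illustrative.

```python
import math
from collections import Counter
from typing import List

def block_entropy(block: str) -> float:
    """Shannon entropy, in bits, of the symbol frequencies in one block."""
    counts = Counter(block)
    total = len(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def local_shannon_entropy(seq: str, block_size: int) -> List[float]:
    """Entropy of each consecutive, non-overlapping block of the sequence.

    A trailing partial block (shorter than block_size) is ignored.
    """
    return [
        block_entropy(seq[i:i + block_size])
        for i in range(0, len(seq) - block_size + 1, block_size)
    ]
```

A run of near-identical values in the returned list corresponds to the low fluctuation that the abstract associates with large tandem-repeat regions.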
Chang, Weng-Long
2012-03-01
Assume that n is a positive integer. If there is an integer M such that M² ≡ C (mod n), i.e., the congruence has a solution, then C is said to be a quadratic congruence (mod n). If the congruence does not have a solution, then C is said to be a quadratic noncongruence (mod n). The task of solving this problem is central to many important applications, the most obvious being cryptography. In this article, we describe a DNA-based algorithm for solving quadratic congruence and factoring integers. In addition to this novel contribution, we also show the utility of our encoding scheme, and of the algorithm's submodules. We demonstrate how a variety of arithmetic, shift and comparison operations, namely bitwise and full addition, subtraction, left shift and comparison, can be performed using strands of DNA.
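The definition above can be made concrete with a conventional, exhaustive search over all residues. This is a minimal sketch on an ordinary computer, not the DNA-based algorithm of the article; the function name is illustrative.

```python
from math import gcd

def square_roots_mod(c: int, n: int) -> list:
    """Return every M in [0, n) with M*M ≡ c (mod n), by brute force."""
    return [m for m in range(n) if (m * m - c) % n == 0]

def has_quadratic_solution(c: int, n: int) -> bool:
    """True when the congruence M*M ≡ c (mod n) has at least one solution."""
    return bool(square_roots_mod(c, n))
```

The link to factoring shows up when two roots are not negatives of each other modulo n: for n = 15 and C = 4 the roots include 2 and 7, and `gcd(7 - 2, 15)` is 5, a nontrivial factor of 15.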
Spreadsheet-based program for alignment of overlapping DNA sequences.
Anbazhagan, R; Gabrielson, E
1999-06-01
Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.
XLS (c9orf142) is a new component of mammalian DNA double-stranded break repair
Craxton, A; Somers, J; Munnur, D; Jukes-Jones, R; Cain, K; Malewicz, M
2015-01-01
Repair of double-stranded DNA breaks (DSBs) in mammalian cells primarily occurs by the non-homologous end-joining (NHEJ) pathway, which requires seven core proteins (Ku70/Ku86, DNA-PKcs (DNA-dependent protein kinase catalytic subunit), Artemis, XRCC4-like factor (XLF), XRCC4 and DNA ligase IV). Here we show using combined affinity purification and mass spectrometry that DNA-PKcs co-purifies with all known core NHEJ factors. Furthermore, we have identified a novel evolutionary conserved protein associated with DNA-PKcs—c9orf142. Computer-based modelling of c9orf142 predicted a structure very similar to XRCC4, hence we have named c9orf142—XLS (XRCC4-like small protein). Depletion of c9orf142/XLS in cells impaired DSB repair consistent with a defect in NHEJ. Furthermore, c9orf142/XLS interacted with other core NHEJ factors. These results demonstrate the existence of a new component of the NHEJ DNA repair pathway in mammalian cells. PMID:25941166
New t-gap insertion-deletion-like metrics for DNA hybridization thermodynamic modeling.
D'yachkov, Arkadii G; Macula, Anthony J; Pogozelski, Wendy K; Renz, Thomas E; Rykov, Vyacheslav V; Torney, David C
2006-05-01
We discuss the concept of t-gap block isomorphic subsequences and use it to describe new abstract string metrics that are similar to the Levenshtein insertion-deletion metric. Some of the metrics that we define can be used to model a thermodynamic distance function on single-stranded DNA sequences. Our model captures a key aspect of the nearest neighbor thermodynamic model for hybridized DNA duplexes. One version of our metric gives the maximum number of stacked pairs of hydrogen bonded nucleotide base pairs that can be present in any secondary structure in a hybridized DNA duplex without pseudoknots. Thermodynamic distance functions are important components in the construction of DNA codes, and DNA codes are important components in biomolecular computing, nanotechnology, and other biotechnical applications that employ DNA hybridization assays. We show how our new distances can be calculated by using a dynamic programming method, and we derive a Varshamov-Gilbert-like lower bound on the size of some of codes using these distance functions as constraints. We also discuss software implementation of our DNA code design methods.
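For orientation, the classical Levenshtein insertion-deletion metric that the new metrics are compared to can be computed by dynamic programming over the longest common subsequence (LCS). The sketch below implements only that classical baseline, not the t-gap block metrics of the paper; the function names are illustrative.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best alignment of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of one string or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions and deletions turning a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string rather than with the product of both lengths.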
Gonzalez, E; Lino, J; Deriabina, A; Herrera, J N F; Poltev, V I
2013-01-01
To elucidate details of the DNA-water interactions we performed calculations and a systematic search for minima of the interaction energy of systems consisting of one DNA base and one or two water molecules. The results of calculations using two force fields of molecular mechanics (MM) and the correlated ab initio method MP2/6-31G(d, p) of quantum mechanics (QM) have been compared with one another and with experimental data. The calculations demonstrated a qualitative agreement between geometry characteristics of most of the local energy minima obtained via different methods. The deepest minima revealed by MM and QM methods correspond to a water molecule position between two neighboring hydrophilic centers of the base and to the formation by the water molecule of hydrogen bonds with them. Nevertheless, the relative depth of some minima and peculiarities of mutual water-base positions in these minima depend on the method used. The analysis revealed insignificance of some differences in the results of calculations performed via different methods and the importance of other ones for the description of DNA hydration. The calculations via MM methods enable us to reproduce quantitatively all the experimental data on the enthalpies of complex formation of a single water molecule with the set of mono-, di-, and trimethylated bases, as well as on water molecule locations near base hydrophilic atoms in the crystals of DNA duplex fragments, while some of these data cannot be rationalized by QM calculations.
Singh, Vinod Kumar; Krishnamachari, Annangarachari
2016-09-01
Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted as ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.
Polymorphic design of DNA origami structures through mechanical control of modular components.
Lee, Chanseok; Lee, Jae Young; Kim, Do-Nyun
2017-12-12
Scaffolded DNA origami enables the bottom-up fabrication of diverse DNA nanostructures by designing hundreds of staple strands, comprised of complementary sequences to the specific binding locations of a scaffold strand. Despite its exceptionally high design flexibility, poor reusability of staples has been one of the major hurdles to fabricate assorted DNA constructs in an effective way. Here we provide a rational module-based design approach to create distinct bent shapes with controllable geometries and flexibilities from a single, reference set of staples. By revising the staple connectivity within the desired module, we can control the location, stiffness, and included angle of hinges precisely, enabling the construction of dozens of single- or multiple-hinge structures with the replacement of staple strands up to 12.8% only. Our design approach, combined with computational shape prediction and analysis, can provide a versatile and cost-effective procedure in the design of DNA origami shapes with stiffness-tunable units.
Comparing DNA damage-processing pathways by computer analysis of chromosome painting data.
Levy, Dan; Vazquez, Mariel; Cornforth, Michael; Loucas, Bradford; Sachs, Rainer K; Arsuaga, Javier
2004-01-01
Chromosome aberrations are large-scale illegitimate rearrangements of the genome. They are indicative of DNA damage and informative about damage processing pathways. Despite extensive investigations over many years, the mechanisms underlying aberration formation remain controversial. New experimental assays such as multiplex fluorescent in situ hybridization (mFISH) allow combinatorial "painting" of chromosomes and are promising for elucidating aberration formation mechanisms. Recently observed mFISH aberration patterns are so complex that computer and graph-theoretical methods are needed for their full analysis. An important part of the analysis is decomposing a chromosome rearrangement process into "cycles." A cycle of order n, characterized formally by the cyclic graph with 2n vertices, indicates that n chromatin breaks take part in a single irreducible reaction. We here describe algorithms for computing cycle structures from experimentally observed or computer-simulated mFISH aberration patterns. We show that analyzing cycles quantitatively can distinguish between different aberration formation mechanisms. In particular, we show that homology-based mechanisms do not generate the large number of complex aberrations, involving higher-order cycles, observed in irradiated human lymphocytes.
Intrinsically bent DNA in replication origins and gene promoters.
Gimenes, F; Takeda, K I; Fiorini, A; Gouveia, F S; Fernandez, M A
2008-06-24
Intrinsically bent DNA is an alternative conformation of the DNA molecule caused by the presence of dA/dT tracts, 2 to 6 bp long, in a helical turn phase DNA or with multiple intervals of 10 to 11 bp. Other than flexibility, intrinsic bending sites induce DNA curvature in particular chromosome regions such as replication origins and promoters. Intrinsically bent DNA sites are important in initiating DNA replication, and are sometimes found near to regions associated with the nuclear matrix. Many methods have been developed to localize bent sites, for example, circular permutation, computational analysis, and atomic force microscopy. This review discusses intrinsically bent DNA sites associated with replication origins and gene promoter regions in prokaryote and eukaryote cells. We also describe methods for identifying bent DNA sites for circular permutation and computational analysis.
Musset, Lise; Hubert, Véronique; Le Bras, Jacques
2014-01-01
The usefulness of atovaquone-proguanil (AP) as an antimalarial treatment is compromised by the emergence of atovaquone resistance during therapy. However, the origin of the parasite mitochondrial DNA (mtDNA) mutation conferring atovaquone resistance remains elusive. Here, we report a patient-based stochastic model that tracks the intrahost emergence of mutations in the multicopy mtDNA during the first erythrocytic parasite cycles leading to the malaria febrile episode. The effect of mtDNA copy number, mutation rate, mutation cost, and total parasite load on the mutant parasite load per patient was evaluated. Computer simulations showed that almost any infected patient carried, after four to seven erythrocytic cycles, de novo mutant parasites at low frequency, with varied frequencies of parasites carrying varied numbers of mutant mtDNA copies. A large interpatient variability in the size of this mutant reservoir was found; this variability was due to the different parameters tested but also to the relaxed replication and partitioning of mtDNA copies during mitosis. We also report seven clinical cases in which AP-resistant infections were treated by AP. These provided evidence that parasiticidal drug concentrations against AP-resistant parasites were transiently obtained within days after treatment initiation. Altogether, these results suggest that each patient carries new mtDNA mutant parasites that emerge before treatment but are killed by high starting drug concentrations. However, because the size of this mutant reservoir is highly variable from patient to patient, we propose that some patients fail to eliminate all of the mutant parasites, repeatedly producing de novo AP treatment failures. PMID:24867967
Efficient alignment-free DNA barcode analytics
Kuksa, Pavel; Pavlovic, Vladimir
2009-01-01
Background: In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results: New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion: Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
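The spectrum idea, representing a sequence by counts of its short fragments and comparing those counts without any alignment, can be sketched as follows. The fragment length `k` and the use of cosine similarity are assumptions made for illustration; the abstract does not state which similarity measure or classifier the authors use.

```python
import math
from collections import Counter

def spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping length-k fragment (k-mer) in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(s1: Counter, s2: Counter) -> float:
    """Cosine of the angle between two k-mer count vectors (0.0 if either is empty)."""
    dot = sum(s1[w] * s2[w] for w in s1.keys() & s2.keys())
    norm1 = math.sqrt(sum(v * v for v in s1.values()))
    norm2 = math.sqrt(sum(v * v for v in s2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

def nearest_species(query: str, references: dict, k: int = 4) -> str:
    """Return the label whose reference barcode is most similar to the query."""
    q = spectrum(query, k)
    return max(references, key=lambda name: cosine_similarity(q, spectrum(references[name], k)))
```

Because each sequence is reduced to a bag of fragments, comparison cost depends on the number of distinct k-mers rather than on an alignment of the full sequences.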
Hu, Yue-Qing; Fung, Wing K
2003-08-01
The effect of a structured population on the likelihood ratio of a DNA mixture has been studied by the current authors and others. In practice, contributors of a DNA mixture may belong to different ethnic/racial origins, a situation especially common in multi-racial countries such as the USA and Singapore. We have developed computer software, available on the web, for evaluating DNA mixtures in multi-structured populations. The software can deal with various DNA mixture problems that cannot be handled by the methods given in a recent article of Fung and Hu.
Hohenstein, Edward G; Parrish, Robert M; Sherrill, C David; Turney, Justin M; Schaefer, Henry F
2011-11-07
Symmetry-adapted perturbation theory (SAPT) provides a means of probing the fundamental nature of intermolecular interactions. Low-orders of SAPT (here, SAPT0) are especially attractive since they provide qualitative (sometimes quantitative) results while remaining tractable for large systems. The application of density fitting and Laplace transformation techniques to SAPT0 can significantly reduce the expense associated with these computations and make even larger systems accessible. We present new factorizations of the SAPT0 equations with density-fitted two-electron integrals and the first application of Laplace transformations of energy denominators to SAPT. The improved scalability of the DF-SAPT0 implementation allows it to be applied to systems with more than 200 atoms and 2800 basis functions. The Laplace-transformed energy denominators are compared to analogous partial Cholesky decompositions of the energy denominator tensor. Application of our new DF-SAPT0 program to the intercalation of DNA by proflavine has allowed us to determine the nature of the proflavine-DNA interaction. Overall, the proflavine-DNA interaction contains important contributions from both electrostatics and dispersion. The energetics of the intercalator interaction are dominated by the stacking interactions (two-thirds of the total), but contain important contributions from the intercalator-backbone interactions. It is hypothesized that the geometry of the complex will be determined by the interactions of the intercalator with the backbone, because by shifting toward one side of the backbone, the intercalator can form two long hydrogen-bonding type interactions. The long-range interactions between the intercalator and the next-nearest base pairs appear to be negligible, justifying the use of truncated DNA models in computational studies of intercalation interaction energies.
Benesova, L; Belsanova, B; Suchanek, S; Kopeckova, M; Minarikova, P; Lipska, L; Levy, M; Visokai, V; Zavoral, M; Minarik, M
2013-02-15
Prognosis of solid cancers is generally more favorable if the disease is treated early and efficiently. A key to long cancer survival is in radical surgical therapy directed at the primary tumor followed by early detection of possible progression, with swift application of subsequent therapeutic intervention reducing the risk of disease generalization. The conventional follow-up care is based on regular observation of tumor markers in combination with computed tomography/endoscopic ultrasound/magnetic resonance/positron emission tomography imaging to monitor potential tumor progression. A recent development in methodologies allowing screening for a presence of cell-free DNA (cfDNA) brings a new viable tool in early detection and management of major cancers. It is believed that cfDNA is released from tumors primarily due to necrotization, whereas the origin of nontumorous cfDNA is mostly apoptotic. The process of cfDNA detection starts with proper collection and treatment of blood and isolation and storage of blood plasma. The next important steps include cfDNA extraction from plasma and its detection and/or quantification. To distinguish tumor cfDNA from nontumorous cfDNA, specific somatic DNA mutations, previously localized in the primary tumor tissue, are identified in the extracted cfDNA. Apart from conventional mutation detection approaches, several dedicated techniques have been presented to detect low levels of cfDNA in an excess of nontumorous (nonmutated) DNA, including real-time polymerase chain reaction (PCR), "BEAMing" (beads, emulsion, amplification, and magnetics), and denaturing capillary electrophoresis. Techniques to facilitate the mutant detection, such as mutant-enriched PCR and COLD-PCR (coamplification at lower denaturation temperature PCR), are also applicable. Finally, a number of newly developed miniaturized approaches, such as single-molecule sequencing, are promising for the future. Copyright © 2012 Elsevier Inc. All rights reserved.
Imaging and sizing of single DNA molecules on a mobile phone.
Wei, Qingshan; Luo, Wei; Chiang, Samuel; Kappel, Tara; Mejia, Crystal; Tseng, Derek; Chan, Raymond Yan Lok; Yan, Eddie; Qi, Hangfei; Shabbir, Faizan; Ozkan, Haydar; Feng, Steve; Ozcan, Aydogan
2014-12-23
DNA imaging techniques using optical microscopy have found numerous applications in biology, chemistry and physics and are based on relatively expensive, bulky and complicated set-ups that limit their use to advanced laboratory settings. Here we demonstrate imaging and length quantification of single molecule DNA strands using a compact, lightweight and cost-effective fluorescence microscope installed on a mobile phone. In addition to an optomechanical attachment that creates a high contrast dark-field imaging setup using an external lens, thin-film interference filters, a miniature dovetail stage and a laser-diode for oblique-angle excitation, we also created a computational framework and a mobile phone application connected to a server back-end for measurement of the lengths of individual DNA molecules that are labeled and stretched using disposable chips. Using this mobile phone platform, we imaged single DNA molecules of various lengths to demonstrate a sizing accuracy of <1 kilobase-pairs (kbp) for 10 kbp and longer DNA samples imaged over a field-of-view of ∼2 mm².
Pauthenier, Cyrille; Faulon, Jean-Loup
2014-07-01
PrecisePrimer is a web-based primer design software made to assist experimentalists in any repetitive primer design task such as preparing, cloning and shuffling DNA libraries. Unlike other popular primer design tools, it is conceived to generate primer libraries with popular PCR polymerase buffers proposed as pre-set options. PrecisePrimer is also meant to design primers in batches, such as for DNA libraries creation of DNA shuffling experiments and to have the simplest interface possible. It integrates the most up-to-date melting temperature algorithms validated with experimental data, and cross validated with other computational tools. We generated a library of primers for the extraction and cloning of 61 genes from yeast DNA genomic extract using default parameters. All primer pairs efficiently amplified their target without any optimization of the PCR conditions. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Conformational elasticity can facilitate TALE-DNA recognition
Lei, Hongxing; Sun, Jiya; Baldwin, Enoch P.; Segal, David J.; Duan, Yong
2015-01-01
Sequence-programmable transcription activator-like effector (TALE) proteins have emerged as a highly efficient tool for genome engineering. Recent crystal structures depict a transition between an open unbound solenoid and more compact DNA-bound solenoid formed by the 34 amino acid repeats. How TALEs switch conformation between these two forms without substantial energetic compensation, and how the repeat-variable di-residues (RVDs) discriminate between the cognate base and other bases still remain unclear. Computational analysis on these two aspects of TALE-DNA interaction mechanism has been conducted in order to achieve a better understanding of the energetics. High elasticity was observed in the molecular dynamics simulations of DNA-free TALE structure that started from the bound conformation where it sampled a wide range of conformations including the experimentally determined apo- and bound- conformations. This elastic feature was also observed in the simulations starting from the apo form which suggests low free energy barrier between the two conformations and small compensation required upon binding. To analyze binding specificity, we performed free energy calculations of various combinations of RVDs and bases using Poisson-Boltzmann/surface area (PBSA) and other approaches. The PBSA calculations indicated that the native RVD-base structures had lower binding free energy than mismatched structures for most of the RVDs examined. Our theoretical analyses provided new insight on the dynamics and energetics of TALE-DNA binding mechanism. PMID:24629191
Conformational elasticity can facilitate TALE-DNA recognition.
Lei, Hongxing; Sun, Jiya; Baldwin, Enoch P; Segal, David J; Duan, Yong
2014-01-01
Sequence-programmable transcription activator-like effector (TALE) proteins have emerged as a highly efficient tool for genome engineering. Recent crystal structures depict a transition between an open unbound solenoid and more compact DNA-bound solenoid formed by the 34 amino acid repeats. How TALEs switch conformation between these two forms without substantial energetic compensation, and how the repeat-variable di-residues (RVDs) discriminate between the cognate base and other bases still remain unclear. Computational analysis on these two aspects of TALE-DNA interaction mechanism has been conducted in order to achieve a better understanding of the energetics. High elasticity was observed in the molecular dynamics simulations of DNA-free TALE structure that started from the bound conformation where it sampled a wide range of conformations including the experimentally determined apo and bound conformations. This elastic feature was also observed in the simulations starting from the apo form which suggests low free energy barrier between the two conformations and small compensation required upon binding. To analyze binding specificity, we performed free energy calculations of various combinations of RVDs and bases using Poisson-Boltzmann surface area (PBSA) and other approaches. The PBSA calculations indicated that the native RVD-base structures had lower binding free energy than mismatched structures for most of the RVDs examined. Our theoretical analyses provided new insight on the dynamics and energetics of TALE-DNA binding mechanism. © 2014 Elsevier Inc. All rights reserved.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
DOE Office of Scientific and Technical Information (OSTI.GOV)
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Abstractions for DNA circuit design.
Lakin, Matthew R; Youssef, Simon; Cardelli, Luca; Phillips, Andrew
2012-03-07
DNA strand displacement techniques have been used to implement a broad range of information processing devices, from logic gates, to chemical reaction networks, to architectures for universal computation. Strand displacement techniques enable computational devices to be implemented in DNA without the need for additional components, allowing computation to be programmed solely in terms of nucleotide sequences. A major challenge in the design of strand displacement devices has been to enable rapid analysis of high-level designs while also supporting detailed simulations that include known forms of interference. Another challenge has been to design devices capable of sustaining precise reaction kinetics over long periods, without relying on complex experimental equipment to continually replenish depleted species over time. In this paper, we present a programming language for designing DNA strand displacement devices, which supports progressively increasing levels of molecular detail. The language allows device designs to be programmed using a common syntax and then analysed at varying levels of detail, with or without interference, without needing to modify the program. This allows a trade-off to be made between the level of molecular detail and the computational cost of analysis. We use the language to design a buffered architecture for DNA devices, capable of maintaining precise reaction kinetics for a potentially unbounded period. We test the effectiveness of buffered gates to support long-running computation by designing a DNA strand displacement system capable of sustained oscillations.
Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.
Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M
2018-05-01
Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.
Chan, Sheng-Chieh; Chang, Kai-Ping; Fang, Yu-Hua Dean; Tsang, Ngan-Ming; Ng, Shu-Hang; Hsu, Cheng-Lung; Liao, Chun-Ta; Yen, Tzu-Chen
2017-01-01
Plasma Epstein-Barr virus (EBV) DNA concentrations predict prognosis in patients with nasopharyngeal carcinoma (NPC). Recent evidence also indicates that intratumor heterogeneity on F-18 fluorodeoxyglucose positron emission tomography (18F-FDG PET) scans is predictive of treatment outcomes in different solid malignancies. Here, we sought to investigate the prognostic value of heterogeneity parameters in patients with primary NPC. Retrospective cohort study. We examined 101 patients with primary NPC who underwent pretreatment 18F-FDG PET/computed tomography. Circulating levels of EBV DNA were measured in all participants. The following PET heterogeneity parameters were collected: histogram-based heterogeneity parameters, second-order texture features (uniformity, contrast, entropy, homogeneity, dissimilarity, inverse difference moment), and higher-order (coarseness, contrast, busyness, complexity, strength) texture features. The median follow-up time was 5.14 years. Total lesion glycolysis (TLG), tumor heterogeneity measured by histogram-based parameter skewness, and the majority of second-order or higher-order texture features were significantly associated with overall survival (OS) and/or recurrence-free survival (RFS). In multivariate analysis, age (P = .005), EBV DNA load (P = .0002), and uniformity (P = .001) independently predicted OS. Only skewness retained the independent prognostic significance for RFS. Tumor stage, standardized uptake value, or TLG did not show an independent association with survival endpoints. The combination of uniformity, EBV DNA load, and age resulted in a more reliable prognostic stratification (P < .001). Tumor heterogeneity is superior to traditional PET parameters for predicting outcomes in primary NPC. The combination of uniformity with EBV DNA load can improve prognostic stratification in this clinical entity. 4 Laryngoscope, 127:E22-E28, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Butchosa, C; Simon, S; Blancafort, L; Voityuk, A
2012-07-12
Because hole transfer from nucleobases to amino acid residues in DNA-protein complexes can prevent oxidative damage of DNA in living cells, computational modeling of the process is of high interest. We performed MS-CASPT2 calculations of several model structures of π-stacked guanine and indole and derived electron-transfer (ET) parameters for these systems using the generalized Mulliken-Hush (GMH) method. We show that the two-state model commonly applied to treat thermal ET between adjacent donor and acceptor is of limited use for the considered systems because of the small gap between the ground and first excited states in the indole radical cation. The ET parameters obtained within the two-state GMH scheme can deviate significantly from the corresponding matrix elements of the two-state effective Hamiltonian based on the GMH treatment of three adiabatic states. The computed values of diabatic energies and electronic couplings provide benchmarks to assess the performance of less sophisticated computational methods.
GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads
Manconi, Andrea; Orro, Alessandro; Manca, Emanuele; Armano, Giuliano; Milanesi, Luciano
2014-01-01
Cytosine DNA methylation is an epigenetic mark implicated in several biological processes. Bisulfite treatment of DNA is acknowledged as the gold standard technique to study methylation. This technique introduces changes in the genomic DNA by converting cytosines to uracils while 5-methylcytosines remain nonreactive. During PCR amplification 5-methylcytosines are amplified as cytosine, whereas uracils and thymines as thymine. To detect the methylation levels, reads treated with the bisulfite must be aligned against a reference genome. Mapping these reads to a reference genome represents a significant computational challenge mainly due to the increased search space and the loss of information introduced by the treatment. To deal with this computational challenge we devised GPU-BSM, a tool based on modern Graphics Processing Units. Graphics Processing Units are hardware accelerators that are increasingly being used successfully to accelerate general-purpose scientific applications. GPU-BSM is a tool able to map bisulfite-treated reads from whole genome bisulfite sequencing and reduced representation bisulfite sequencing, and to estimate methylation levels, with the goal of detecting methylation. Due to the massive parallelization obtained by exploiting graphics cards, GPU-BSM aligns bisulfite-treated reads faster than other cutting-edge solutions, while outperforming most of them in terms of unique mapped reads. PMID:24842718
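The conversion chemistry described in this abstract can be illustrated with a minimal sketch. It is not GPU-BSM's implementation: the function names, the toy reference, and the choice of methylated position are assumptions made for illustration, and only the forward strand of an already aligned, gap-free read is handled.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment followed by PCR on the forward strand:
    unmethylated C is read as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_calls(reference, aligned_read):
    """For every reference cytosine covered by the read, report True if the
    read still shows C (methylated) and False if it shows T (converted)."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, aligned_read)):
        if ref_base == "C" and read_base in "CT":
            calls[i] = read_base == "C"
    return calls


reference = "ACGTCCGA"                       # toy reference (assumption)
read = bisulfite_convert(reference, {1})     # only the C at index 1 is methylated
calls = methylation_calls(reference, read)
level = sum(calls.values()) / len(calls)     # fraction of methylated cytosines
```

Because a T in the read may correspond to either a T or a C in the reference, an aligner has to tolerate C-to-T mismatches, which is the enlarged search space the abstract refers to.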
DNA Sequence-Dependent Ionic Currents in Ultra-Small Solid-State Nanopores
Comer, Jeffrey
2016-01-01
Measurements of ionic currents through nanopores partially blocked by DNA have emerged as a powerful method for characterization of the DNA nucleotide sequence. Although the effect of the nucleotide sequence on the nanopore blockade current has been experimentally demonstrated, prediction and interpretation of such measurements remain a formidable challenge. Using atomic resolution computational approaches, here we show how the sequence, molecular conformation, and pore geometry affect the blockade ionic current in model solid-state nanopores. We demonstrate that the blockade current from a DNA molecule is determined by the chemical identities and conformations of at least three consecutive nucleotides. We find the blockade currents produced by the nucleotide triplets to vary considerably with their nucleotide sequence despite having nearly identical molecular conformations. Encouragingly, we find blockade current differences as large as 25% for single-base substitutions in ultra small (1.6 nm × 1.1 nm cross section; 2 nm length) solid-state nanopores. Despite the complex dependence of the blockade current on the sequence and conformation of the DNA triplets, we find that, under many conditions, the number of thymine bases is positively correlated with the current, whereas the number of purine bases and the presence of both purine and pyrimidines in the triplet are negatively correlated with the current. Based on these observations, we construct a simple theoretical model that relates the ion current to the base content of a solid-state nanopore. Furthermore, we show that compact conformations of DNA in narrow pores provide the greatest signal-to-noise ratio for single base detection, whereas reduction of the nanopore length increases the ionic current noise. Thus, the sequence dependence of nanopore blockade current can be theoretically rationalized, although the predictions will likely need to be customized for each nanopore type. PMID:27103233
Recurrence time statistics: versatile tools for genomic DNA sequence analysis.
Cao, Yinhe; Tung, Wen-Wen; Gao, J B
2004-01-01
With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
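The abstract does not define the recurrence time statistic precisely. A minimal sketch under one plausible reading, in which the recurrence time of a k-mer is the distance back to its previous occurrence, is given below. The function name and the toy sequence are assumptions, not the authors' code.

```python
from collections import Counter


def recurrence_times(seq, k):
    """Histogram of distances between consecutive occurrences of each k-mer.

    Simplified first-return definition (an assumption): for every window,
    record the distance to the previous occurrence of the same k-mer.
    """
    last_seen = {}
    histogram = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            histogram[i - last_seen[word]] += 1
        last_seen[word] = i
    return histogram


# A perfect period-3 repeat puts all recurrences at distance 3.
hist = recurrence_times("ACGACGACGACG", 3)
```

On real genomic sequence, peaks in such a histogram would point at tandem repeats or at the period-3 signal of coding regions. The single pass with a dictionary keeps the cost linear in sequence length.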
Transformation of personal computers and mobile phones into genetic diagnostic systems.
Walker, Faye M; Ahmad, Kareem M; Eisenstein, Michael; Soh, H Tom
2014-09-16
Molecular diagnostics based on the polymerase chain reaction (PCR) offer rapid and sensitive means for detecting infectious disease, but prohibitive costs have impeded their use in resource-limited settings where such diseases are endemic. In this work, we report an innovative method for transforming a desktop computer and a mobile camera phone--devices that have become readily accessible in developing countries--into a highly sensitive DNA detection system. This transformation was achieved by converting a desktop computer into a de facto thermal cycler with software that controls the temperature of the central processing unit (CPU), allowing for highly efficient PCR. Next, we reconfigured the mobile phone into a fluorescence imager by adding a low-cost filter, which enabled us to quantitatively measure the resulting PCR amplicons. Our system is highly sensitive, achieving quantitative detection of as little as 9.6 attograms of target DNA, and we show that its performance is comparable to advanced laboratory instruments at approximately 1/500th of the cost. Finally, in order to demonstrate clinical utility, we have used our platform for the successful detection of genomic DNA from the parasite that causes Chagas disease, Trypanosoma cruzi, directly in whole, unprocessed human blood at concentrations 4-fold below the clinical titer of the parasite.
Transformation of Personal Computers and Mobile Phones into Genetic Diagnostic Systems
2014-01-01
Molecular diagnostics based on the polymerase chain reaction (PCR) offer rapid and sensitive means for detecting infectious disease, but prohibitive costs have impeded their use in resource-limited settings where such diseases are endemic. In this work, we report an innovative method for transforming a desktop computer and a mobile camera phone—devices that have become readily accessible in developing countries—into a highly sensitive DNA detection system. This transformation was achieved by converting a desktop computer into a de facto thermal cycler with software that controls the temperature of the central processing unit (CPU), allowing for highly efficient PCR. Next, we reconfigured the mobile phone into a fluorescence imager by adding a low-cost filter, which enabled us to quantitatively measure the resulting PCR amplicons. Our system is highly sensitive, achieving quantitative detection of as little as 9.6 attograms of target DNA, and we show that its performance is comparable to advanced laboratory instruments at approximately 1/500th of the cost. Finally, in order to demonstrate clinical utility, we have used our platform for the successful detection of genomic DNA from the parasite that causes Chagas disease, Trypanosoma cruzi, directly in whole, unprocessed human blood at concentrations 4-fold below the clinical titer of the parasite. PMID:25223929
Programmable and autonomous computing machine made of biomolecules
Benenson, Yaakov; Paz-Elizur, Tamar; Adar, Rivka; Keinan, Ehud; Livneh, Zvi; Shapiro, Ehud
2013-01-01
Devices that convert information from one form into another according to a definite procedure are known as automata. One such hypothetical device is the universal Turing machine [1], which stimulated work leading to the development of modern computers. The Turing machine and its special cases [2], including finite automata [3], operate by scanning a data tape, whose striking analogy to information-encoding biopolymers inspired several designs for molecular DNA computers [4–8]. Laboratory-scale computing using DNA and human-assisted protocols has been demonstrated [9–15], but the realization of computing devices operating autonomously on the molecular scale remains rare [16–20]. Here we describe a programmable finite automaton comprising DNA and DNA-manipulating enzymes that solves computational problems autonomously. The automaton’s hardware consists of a restriction nuclease and ligase, the software and input are encoded by double-stranded DNA, and programming amounts to choosing appropriate software molecules. Upon mixing solutions containing these components, the automaton processes the input molecule via a cascade of restriction, hybridization and ligation cycles, producing a detectable output molecule that encodes the automaton’s final state, and thus the computational result. In our implementation 10^12 automata sharing the same software run independently and in parallel on inputs (which could, in principle, be distinct) in 120 μl solution at room temperature at a combined rate of 10^9 transitions per second with a transition fidelity greater than 99.8%, consuming less than 10^−10 W. PMID:11719800
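The logic of a finite automaton scanning an input tape can be sketched in software. The abstract does not specify the transition table, so the two-state, two-symbol machine below (accepting inputs with an even number of `b` symbols) is an assumed example. In the molecular device each transition rule would correspond to a software molecule rather than a dictionary entry.

```python
def run_automaton(transitions, start, accepting, tape):
    """Scan the tape one symbol at a time and return (final_state, accepted)."""
    state = start
    for symbol in tape:
        state = transitions[(state, symbol)]
    return state, state in accepting


# Hypothetical program: S0 means "even number of b's seen so far".
EVEN_B = {
    ("S0", "a"): "S0", ("S0", "b"): "S1",
    ("S1", "a"): "S1", ("S1", "b"): "S0",
}

final_state, accepted = run_automaton(EVEN_B, "S0", {"S0"}, "abba")
```

Reprogramming amounts to swapping the transition table, which mirrors the abstract's statement that programming consists of choosing appropriate software molecules.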
Shahabadi, Nahid; Pourfoulad, Mehdi; Moghadam, Neda Hosseinpour
2017-01-02
The DNA-binding properties of an antiviral drug, valganciclovir (valcyte), were studied using emission, absorption, circular dichroism, viscosity, differential pulse voltammetry, fluorescence techniques, and computational studies. The drug bound to calf thymus DNA (ct-DNA) in a groove-binding mode. The binding constant calculated from UV-vis data, K_a, is comparable to those of groove-binding drugs. Competitive fluorimetric studies with Hoechst 33258 showed that valcyte could displace the DNA-bound Hoechst 33258. The drug could not displace intercalated methylene blue from the DNA double helix. Furthermore, the induced detectable changes in the CD spectrum of ct-DNA as well as changes in its viscosity confirm the groove-binding mode. In addition, an integrated molecular docking was employed to further investigate the binding interactions between valcyte and calf thymus DNA.
Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.
Seiser, Eric L; Innocenti, Federico
2014-01-01
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
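The three reported metrics can all be derived from a binary confusion matrix. The sketch below uses the standard definitions. The counts are hypothetical and are not the study's confusion matrix. They were chosen only to show that, on a balanced set, an accuracy near 67% corresponds to an MCC near 0.34.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


# Hypothetical balanced example: 100 binding and 100 non-binding nucleotides.
accuracy, f_measure, mcc = binary_metrics(tp=67, fp=33, tn=67, fn=33)
```

MCC ranges from -1 to 1 with 0 for chance-level prediction, so it gives a less flattering but more informative picture than accuracy when the classes are hard to separate.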
Mono- and Di-Alkylation Processes of DNA Bases by Nitrogen Mustard Mechlorethamine.
Larrañaga, Olatz; de Cózar, Abel; Cossío, Fernando P
2017-12-06
The reactivity of nitrogen mustard mechlorethamine (mec) with purine bases towards formation of mono- (G-mec and A-mec) and dialkylated (AA-mec, GG-mec and AG-mec) adducts has been studied using density functional theory (DFT). To gain a complete overview of DNA-alkylation processes, direct chloride substitution and formation through activated aziridinium species were considered as possible reaction paths for adduct formation. Our results confirm that DNA alkylation by mec occurs via aziridine intermediates instead of direct substitution. Consideration of explicit water molecules in conjunction with polarizable continuum model (PCM) was shown as an adequate computational method for a proper representation of the system. Moreover, Runge-Kutta numerical kinetic simulations including the possible bisadducts have been performed. These simulations predicted a product ratio of 83:17 of GG-mec and AG-mec diadducts, respectively. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
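A Runge-Kutta kinetic simulation of competing pathways can be sketched as follows. This is a toy model, not the authors' kinetic scheme: it assumes a single mono-adduct pool that converts irreversibly to GG-mec or AG-mec by first-order steps, and the rate constants are placeholders chosen to reproduce an 83:17 ratio rather than values derived from the DFT barriers.

```python
def rk4_step(f, y, t, h):
    """Advance the state vector y by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


K_GG, K_AG = 0.83, 0.17  # placeholder rate constants (assumption), arbitrary units


def rates(t, y):
    """d/dt of [mono-adduct, GG-mec, AG-mec] for two parallel first-order steps."""
    mono, gg, ag = y
    return [-(K_GG + K_AG) * mono, K_GG * mono, K_AG * mono]


y, t, h = [1.0, 0.0, 0.0], 0.0, 0.01
for _ in range(2000):            # integrate to t = 20, where the mono-adduct is exhausted
    y = rk4_step(rates, y, t, h)
    t += h

mono, gg, ag = y
gg_fraction = gg / (gg + ag)
```

For parallel irreversible first-order steps the product ratio equals the ratio of rate constants, so the integration here mainly serves as a check. A scheme with reversible or sequential steps would need the numerical solution.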
A DNA-Based Semantic Fusion Model for Remote Sensing Data
Sun, Heng; Weng, Jian; Yu, Guangchuang; Massawe, Richard H.
2013-01-01
Semantic technology plays a key role in various domains, from conversation understanding to algorithm analysis. As the most efficient semantic tool, ontology can represent, process and manage the widespread knowledge. Nowadays, many researchers use ontology to collect and organize data's semantic information in order to maximize research productivity. In this paper, we firstly describe our work on the development of a remote sensing data ontology, with a primary focus on semantic fusion-driven research for big data. Our ontology is made up of 1,264 concepts and 2,030 semantic relationships. However, the growth of big data is straining the capacities of current semantic fusion and reasoning practices. Considering the massive parallelism of DNA strands, we propose a novel DNA-based semantic fusion model. In this model, a parallel strategy is developed to encode the semantic information in DNA for a large volume of remote sensing data. The semantic information is read in a parallel and bit-wise manner and an individual bit is converted to a base. By doing so, a considerable amount of conversion time can be saved, i.e., the cluster-based multi-processes program can reduce the conversion time from 81,536 seconds to 4,937 seconds for 4.34 GB source data files. Moreover, the size of result file recording DNA sequences is 54.51 GB for parallel C program compared with 57.89 GB for sequential Perl. This shows that our parallel method can also reduce the DNA synthesis cost. In addition, data types are encoded in our model, which is a basis for building type system in our future DNA computer. Finally, we describe theoretically an algorithm for DNA-based semantic fusion. This algorithm enables the process of integration of the knowledge from disparate remote sensing data sources into a consistent, accurate, and complete representation. This process depends solely on ligation reaction and screening operations instead of the ontology. PMID:24116207
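The bit-wise conversion described above, in which each individual bit becomes one base, can be sketched as follows. The abstract does not state which base represents which bit, so the mapping used here (0 to A, 1 to C) and the function names are assumptions for illustration only.

```python
# Hypothetical bit-to-base mapping; the paper's actual mapping is not given.
BIT_TO_BASE = {"0": "A", "1": "C"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert each bit of the input (most significant bit first) to a base."""
    return "".join(BIT_TO_BASE[bit]
                   for byte in data
                   for bit in format(byte, "08b"))


def decode(strand: str) -> bytes:
    """Invert encode(): turn bases back into bits, then bits into bytes."""
    bits = "".join(BASE_TO_BIT[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"A")  # byte 0x41 is 01000001 in binary
```

Because every byte is encoded independently of its neighbours, a large file can be split into chunks and converted by separate processes, which is the property a cluster-based parallel program would rely on.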
Magro, Massimiliano; Martinello, Tiziana; Bonaiuto, Emanuela; Gomiero, Chiara; Baratella, Davide; Zoppellaro, Giorgio; Cozza, Giorgio; Patruno, Marco; Zboril, Radek; Vianello, Fabio
2017-11-01
Unlike common coated iron oxide nanoparticles, novel naked surface active maghemite nanoparticles (SAMNs) can covalently bind DNA. Plasmid (pDNA) harboring the coding gene for GFP was directly chemisorbed onto SAMNs, leading to a novel DNA nanovector (SAMN@pDNA). The spontaneous internalization of SAMN@pDNA into cells was compared with an extensively studied fluorescent SAMN derivative (SAMN@RITC). Moreover, the transfection efficiency of SAMN@pDNA was evaluated and explained by a computational model. SAMN@pDNA was prepared and characterized by spectroscopic and computational methods, and molecular dynamics simulation. The size and hydrodynamic properties of SAMN@pDNA and SAMN@RITC were studied by transmission electron microscopy, light scattering and zeta-potential. The two nanomaterials were tested by confocal scanning microscopy on equine peripheral blood-derived mesenchymal stem cells (ePB-MSCs) and GFP expression by SAMN@pDNA was determined. Nanomaterials characterized by similar hydrodynamic properties were successfully internalized and stored into mesenchymal stem cells. Transfection by SAMN@pDNA occurred and GFP expression was higher than with the lipofectamine procedure, even in the absence of an external magnetic field. A computational model clarified that transfection efficiency can be ascribed to DNA availability inside cells. Direct covalent binding of DNA on naked magnetic nanoparticles led to an extremely robust gene delivery tool. Hydrodynamic and chemical-physical properties of SAMN@pDNA were responsible for the successful uptake by cells and for the efficiency of GFP gene transfection. SAMNs are characterized by colloidal stability, excellent cell uptake, persistence in the host cells, and low toxicity, and are proposed as novel intelligent DNA nanovectors for efficient cell transfection. Copyright © 2017 Elsevier B.V. All rights reserved.
Weiss, Gunter; Schlegel, Anne; Kottwitz, Denise; König, Thomas; Tetzner, Reimo
2017-01-01
Low-dose computed tomography (LDCT) is used for screening for lung cancer (LC) in high-risk patients in the United States. The definition of high risk and the impact of frequent false-positive results of low-dose computed tomography remain a challenge. DNA methylation biomarkers are valuable noninvasive diagnostic tools for cancer detection. This study reports on the evaluation of methylation markers in plasma DNA for LC detection and discrimination of malignant from nonmalignant lung disease. Circulating DNA was extracted from 3.5-mL plasma samples, treated with bisulfite using a commercially available kit, purified, and assayed by real-time polymerase chain reaction for assessment of DNA methylation of short stature homeobox 2 gene (SHOX2), prostaglandin E receptor 4 gene (PTGER4), and forkhead box L2 gene (FOXL2). In three independent case-control studies these assays were evaluated and optimized. The resultant assay, a triplex polymerase chain reaction combining SHOX2, PTGER4, and the reference gene actin, beta gene (ACTB), was validated using plasma from patients with and without malignant disease. A panel of SHOX2 and PTGER4 provided promising results in three independent case-control studies examining a total of 330 plasma specimens (area under the receiver operating characteristic curve = 91%-98%). A validation study with 172 patient samples demonstrated significant discriminatory performance in distinguishing patients with LC from subjects without malignancy (area under the curve = 0.88). At a fixed specificity of 90%, sensitivity for LC was 67%; at a fixed sensitivity of 90%, specificity was 73%. Measurement of SHOX2 and PTGER4 methylation in plasma DNA allowed detection of LC and differentiation of nonmalignant diseases. Development of a diagnostic test based on this panel may provide clinical utility in combination with current imaging techniques to improve LC risk stratification. Copyright © 2016 International Association for the Study of Lung Cancer. Published by Elsevier Inc. All rights reserved.
2009-01-01
Background: This study reports progress in assembling a DNA barcode reference library for Ephemeroptera, Plecoptera, and Trichoptera ("EPTs") from a Canadian subarctic site, which is the focus of a comprehensive biodiversity inventory using DNA barcoding. These three groups of aquatic insects exhibit a moderate level of species diversity, making them ideal for testing the feasibility of DNA barcoding for routine biotic surveys. We explore the correlation between the morphological species delineations, DNA barcode-based haplotype clusters delimited by a sequence threshold (2%), and a threshold-free approach to biodiversity quantification: phylogenetic diversity. Results: A DNA barcode reference library is built for 112 EPT species for the focal region, consisting of 2277 COI sequences. Close correspondence was found between EPT morphospecies and haplotype clusters as designated using a standard threshold value. Similarly, the shapes of taxon accumulation curves based upon haplotype clusters were very similar to those generated using phylogenetic diversity accumulation curves, but were much more computationally efficient. Conclusion: The results of this study will facilitate other lines of research on northern EPTs and also bode well for rapidly conducting initial biodiversity assessments in unknown EPT faunas. PMID:20003245
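One simple way to delimit haplotype clusters with a 2% sequence threshold is single-linkage grouping on pairwise divergence. The sketch below assumes pre-aligned sequences of equal length, uncorrected p-distance, and a "less than or equal to" cutoff; the study may have used a different distance model or clustering rule, so this is an illustration rather than a reproduction of its method.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_clusters(seqs, threshold=0.02):
    """Single-linkage clusters: link any pair within the threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if p_distance(seqs[i], seqs[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy 100-site "barcodes": the first two differ by 1%, the third by about 10%.
toy = ["A" * 100, "C" + "A" * 99, "C" * 10 + "A" * 90]
groups = threshold_clusters(toy)
```

The all-pairs loop is quadratic in the number of sequences, which is acceptable for a few thousand barcodes but would need indexing or pre-filtering for much larger libraries.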
Conformations of stereoisomeric base adducts to 4-hydroxyequilenin.
Ding, Shuang; Shapiro, Robert; Geacintov, Nicholas E; Broyde, Suse
2003-06-01
Exposure to estrogen through estrogen replacement therapy increases the risk of women developing cancer in hormone sensitive tissues. Premarin (Wyeth), which has been the most frequent choice for estrogen replacement therapy in the United States, contains the equine estrogens equilin and equilenin as major components. 4-Hydroxyequilenin (4-OHEN) is a phase I metabolite of both of these substances. This catechol estrogen autoxidizes to potent cytotoxic quinoids that can react with dG, dA, and dC to form unusual stereoisomeric cyclic adducts (Bolton, J. L., et al. (1998) Chem. Res. Toxicol. 11, 1113-1127). Like other bulky DNA adducts, these lesions may exhibit different susceptibilities to DNA repair and mutagenic potential, if not repaired in a structure-dependent manner. To ultimately gain insights into structure-function relationships, we computed conformations of stereoisomeric guanine, adenine, and cytosine base adducts using density functional theory. We find near mirror image conformations in stereoisomer adduct pairs for each modified base, suggesting opposite orientations with respect to the 5' --> 3' direction of the modified strand when the stereoisomer pairs are incorporated into duplex DNA. Such opposite orientations could cause stereoisomer pairs of lesions to respond differently to DNA replication and repair enzymes.
Coalescence computations for large samples drawn from populations of time-varying sizes
Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek
2017-01-01
We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404
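For orientation, the constant-population-size baseline that time-varying models generalize has a closed form: under the standard Kingman coalescent, the time during which the sample has k lineages is exponentially distributed with rate k(k-1)/2 in coalescent time units. The sketch below computes the resulting expectations exactly; it does not implement the time-scale changes or integral transforms of the paper, and the function names are illustrative.

```python
from fractions import Fraction


def expected_coalescence_times(n):
    """E[T_k] = 2 / (k (k - 1)) for k = 2..n, constant population size."""
    return {k: Fraction(2, k * (k - 1)) for k in range(2, n + 1)}


def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n samples."""
    return sum(expected_coalescence_times(n).values())


tmrca_10 = expected_tmrca(10)  # equals 2 * (1 - 1/10) by a telescoping sum
```

Using exact fractions avoids the cancellation errors that floating-point sums can accumulate for large n, an issue that becomes more relevant once alternating sums appear in the expressions for large samples.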
Wen, Yanhua; Wei, Yanjun; Zhang, Shumei; Li, Song; Liu, Hongbo; Wang, Fang; Zhao, Yue; Zhang, Dongwei; Zhang, Yan
2017-05-01
Tumour heterogeneity describes the coexistence of divergent tumour cell clones within tumours, which is often caused by underlying epigenetic changes. DNA methylation is commonly regarded as a significant regulator that differs across cells and tissues. In this study, we comprehensively reviewed research progress on estimating tumour heterogeneity. Bioinformatics-based analysis of DNA methylation has revealed the evolutionary relationships between breast cancer cell lines and tissues. Further analysis of the DNA methylation profiles in 33 breast cancer-related cell lines identified cell line-specific methylation patterns. Next, we reviewed the computational methods for inferring clonal evolution of tumours from different perspectives and then proposed a deconvolution strategy for modelling the dynamics of subclonal cell populations in breast cancer tissues based on DNA methylation. Further analysis of simulated cancer tissues and real cell lines revealed that this approach exhibits satisfactory performance and relative stability in estimating the composition and proportions of cellular subpopulations. The application of this strategy to individuals with breast cancer from The Cancer Genome Atlas identified different cellular subpopulations with distinct molecular phenotypes. Moreover, the current and potential future applications of this deconvolution strategy to clinical breast cancer research are discussed, with emphasis placed on the DNA methylation-based recognition of intra-tumour heterogeneity. Wide application of these heterogeneity-estimation methods to further clinical cohorts will improve our understanding of neoplastic progression and the design of therapeutic interventions for treating breast cancer and other malignancies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Programmable DNA-Mediated Multitasking Processor.
Shu, Jian-Jun; Wang, Qi-Wen; Yong, Kian-Yan; Shao, Fangwei; Lee, Kee Jin
2015-04-30
Because of DNA's appealing features as a material, including minuscule size, defined structural repeat and rigidity, programmable DNA-mediated processing is a promising computing paradigm that employs DNA as an information-storing and processing substrate to tackle computational problems. The massive parallelism of DNA hybridization has the potential to improve multitasking capabilities and yield a large speed-up over conventional electronic processors with stepwise signal cascades. As an example of multitasking capability, we present an in vitro programmable DNA-mediated optimal route planning processor as a functional unit embedded in contemporary navigation systems. The programmable DNA-mediated processor has several advantages over existing silicon-mediated methods, such as massive data storage and simultaneous processing with far less material than conventional silicon devices.
The Role of Cytosine Methylation on Charge Transport through a DNA Strand
DOE Office of Scientific and Technical Information (OSTI.GOV)
Qi, Jianqing; Govind, Niranjan; Anantram, M. P.
Cytosine methylation has been found to play a crucial role in various biological processes, including a number of human diseases. The detection of this small modification remains challenging. In this work, we computationally explore the possibility of detecting methylated DNA strands through direct electrical conductance measurements. Using density functional theory and the Landauer-Büttiker method, we study the electronic properties and charge transport through an eight base-pair methylated DNA strand and its native counterpart. Specifically, we compare the results generated with the widely used B3LYP exchange-correlation (XC) functional and a CAM-B3LYP based tuned range-separated hybrid density functional. We first analyze the effect of cytosine methylation on the tight-binding parameters of two DNA strands and then model the transmission of the electrons and conductance through the strands both with and without decoherence. We find that with both functionals, the main difference of the tight-binding parameters between the native DNA and the methylated DNA lies in the on-site energies of (methylated) cytosine bases. The intra- and interstrand hopping integrals between two nearest neighboring guanine base and (methylated) cytosine base also change with the addition of the methyl groups. Our calculations show that in the phase-coherent limit, the transmission of the methylated strand is close to that of the native strand when the energy is near the highest occupied molecular orbital (HOMO) level and larger than that of the native strand by a factor of 5 in the bandgap. The trend in transmission also holds in the presence of decoherence with both functionals. We also study the effect of contact coupling by choosing coupling strengths ranging from the weak to the strong coupling limit. Our results suggest that the effect of the two different functionals is to alter the on-site energies of the DNA bases at the HOMO level, while the transport properties do not depend much on the choice of functional.
Etienne, Thibaud; Very, Thibaut; Perpète, Eric A; Monari, Antonio; Assfeld, Xavier
2013-05-02
We present a time-dependent density functional theory computation of the absorption spectra of one β-carboline system: the harmane molecule in its neutral and cationic forms. The spectra are computed in aqueous solution. The interaction of cationic harmane with DNA is also studied. In particular, the use of hybrid quantum mechanics/molecular mechanics methods is discussed, together with its coupling to a molecular dynamics strategy to take into account dynamic effects of the environment and the vibrational degrees of freedom of the chromophore. Different levels of treatment of the environment are addressed starting from purely mechanical embedding to electrostatic and polarizable embedding. We show that a static description of the spectrum based on equilibrium geometry only is unable to give a correct agreement with experimental results, and dynamic effects need to be taken into account. The presence of two stable noncovalent interaction modes between harmane and DNA is also presented, as well as the associated absorption spectrum of harmane cation.
Vander Lugt correlation of DNA sequence data
NASA Astrophysics Data System (ADS)
Christens-Barry, William A.; Hawk, James F.; Martin, James C.
1990-12-01
DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid are performed.
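Matched filtering for sequence search can be imitated digitally by treating each of the four bases as its own binary channel, cross-correlating sequence and target per channel, and summing the channels; an exact match then produces a correlation peak equal to the target length. The sketch below is a serial software analogue under that assumed encoding and does not model the optical Vander Lugt system itself.

```python
BASES = "ACGT"


def correlate(sequence, target):
    """Sum of per-base channel cross-correlations at every offset."""
    m = len(target)
    scores = []
    for offset in range(len(sequence) - m + 1):
        score = 0
        for base in BASES:
            # Each channel is 1 where that base occurs and 0 elsewhere.
            score += sum(1 for i in range(m)
                         if sequence[offset + i] == base and target[i] == base)
        scores.append(score)
    return scores


def find_matches(sequence, target):
    """Offsets where the correlation reaches its maximum possible value."""
    return [i for i, s in enumerate(correlate(sequence, target))
            if s == len(target)]


hits = find_matches("GATTACAGATTACA", "TTAC")
```

Offsets with a high but not maximal score correspond to near matches, so lowering the detection cutoff turns the same correlation into an approximate search.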
BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.
Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie
2017-07-01
Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancers for humans or other species. Thus, the development of computational methods based on limited experimentally validated enhancers and deciphering the transcriptional regulatory code encoded in the enhancer sequences is urgent. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. Our BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. Our BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen. Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Redesigning the specificity of protein-DNA interactions with Rosetta.
Thyme, Summer; Baker, David
2014-01-01
Building protein tools that can selectively bind or cleave specific DNA sequences requires efficient technologies for modifying protein-DNA interactions. Computational design is one method for accomplishing this goal. In this chapter, we present the current state of protein-DNA interface design with the Rosetta macromolecular modeling program. The LAGLIDADG endonuclease family of DNA-cleaving enzymes, under study as potential gene therapy reagents, has been the main testing ground for these in silico protocols. At this time, the computational methods are most useful for designing endonuclease variants that can accommodate small numbers of target site substitutions. Attempts to engineer for more extensive interface changes will likely benefit from an approach that uses the computational design results in conjunction with a high-throughput directed evolution or screening procedure. The family of enzymes presents an engineering challenge because their interfaces are highly integrated and there is significant coordination between the binding and catalysis events. Future developments in the computational algorithms depend on experimental feedback to improve understanding and modeling of these complex enzymatic features. This chapter presents both the basic method of design that has been successfully used to modulate specificity and more advanced procedures that incorporate DNA flexibility and other properties that are likely necessary for reliable modeling of more extensive target site changes.
Function-Based Algorithms for Biological Sequences
ERIC Educational Resources Information Center
Mohanty, Pragyan Sheela P.
2015-01-01
Two problems at two different abstraction levels of computational biology are studied. At the molecular level, efficient pattern matching algorithms in DNA sequences are presented. For gene order data, an efficient data structure is presented capable of storing all gene re-orderings in a systematic manner. A common characteristic of presented…
Eldar, Amir; Rozenberg, Haim; Diskin-Posner, Yael; Rohs, Remo; Shakked, Zippora
2013-01-01
A p53 hot-spot mutation found frequently in human cancer is the replacement of R273 by histidine or cysteine residues resulting in p53 loss of function as a tumor suppressor. These mutants can be reactivated by the incorporation of second-site suppressor mutations. Here, we present high-resolution crystal structures of the p53 core domains of the cancer-related proteins, the rescued proteins and their complexes with DNA. The structures show that inactivation of p53 results from the incapacity of the mutated residues to form stabilizing interactions with the DNA backbone, and that reactivation is achieved through alternative interactions formed by the suppressor mutations. Detailed structural and computational analysis demonstrates that the rescued p53 complexes are not fully restored in terms of DNA structure and its interface with p53. Contrary to our previously studied wild-type (wt) p53-DNA complexes showing non-canonical Hoogsteen A/T base pairs of the DNA helix that lead to local minor-groove narrowing and enhanced electrostatic interactions with p53, the current structures display Watson–Crick base pairs associated with direct or water-mediated hydrogen bonds with p53 at the minor groove. These findings highlight the pivotal role played by R273 residues in supporting the unique geometry of the DNA target and its sequence-specific complex with p53. PMID:23863845
Ma, Xin; Guo, Jing; Sun, Xiao
2016-01-01
DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
NASA Astrophysics Data System (ADS)
Voityuk, Alexander A.
2008-03-01
The electron hole transfer (HT) properties of DNA are substantially affected by thermal fluctuations of the π-stack structure. Depending on the mutual position of neighboring nucleobases, the electronic coupling V may change by several orders of magnitude. In the present paper, we report the results of systematic QM/molecular dynamics (MD) calculations of the electronic couplings and on-site energies for hole transfer. Based on 15 ns MD trajectories for several DNA oligomers, we calculate the average squared couplings ⟨V²⟩ and the energies of base-pair triplets XG⁺Y and XA⁺Y, where X, Y = G, A, T, and C. For each of the 32 systems, 15 000 conformations separated by 1 ps are considered. The three-state generalized Mulliken-Hush method is used to derive electronic couplings for HT between neighboring base pairs. The adiabatic energies and dipole moment matrix elements are computed within the INDO/S method. We compare the rms values of V with the couplings estimated for the idealized B-DNA structure and show that in several important cases the couplings calculated for the idealized B-DNA structure are considerably underestimated. The rms values for intrastrand couplings G-G, A-A, G-A, and A-G are found to be similar, ~0.07 eV, while the interstrand couplings are quite different. The energies of hole states G⁺ and A⁺ in the stack depend on the nature of the neighboring pairs. The XG⁺Y triplets are more stable than XA⁺Y by 0.5 eV. The thermal fluctuations of the DNA structure facilitate the HT process from guanine to adenine. The tabulated couplings and on-site energies can be used as reference parameters in theoretical and computational studies of HT processes in DNA.
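The rms coupling referred to above is the square root of the mean squared coupling over the sampled conformations. A minimal sketch of that averaging step follows; the snapshot values are invented for illustration and are not results from the study.

```python
import math


def rms_coupling(couplings):
    """Root-mean-square coupling, sqrt(<V^2>), over sampled snapshots."""
    if not couplings:
        raise ValueError("need at least one snapshot")
    return math.sqrt(sum(v * v for v in couplings) / len(couplings))


def mean_coupling(couplings):
    """Plain average <V>; sign changes between snapshots can cancel here."""
    return sum(couplings) / len(couplings)


# Hypothetical per-snapshot couplings in eV (illustration only).
snapshots_ev = [0.10, -0.02, 0.05, -0.08, 0.01]
v_rms = rms_coupling(snapshots_ev)
v_mean = mean_coupling(snapshots_ev)
```

Because the coupling changes sign as the bases move, the plain mean can be close to zero even when individual snapshots couple strongly, which is one reason squared couplings are averaged instead.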
An accurate algorithm for the detection of DNA fragments from dilution pool sequencing experiments.
Bansal, Vikas
2018-01-01
The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention. We formulate the problem of detecting DNA fragments from dilution pool sequencing experiments as a genome segmentation problem and develop an algorithm that uses dynamic programming to optimize a likelihood function derived from a generative model for the sequence reads. This algorithm uses an iterative approach to automatically infer the mean background read depth and the number of fragments in each pool. Using simulated data, we demonstrate that our method, FragmentCut, has 25-30% greater sensitivity compared with an HMM based method for fragment detection and can also detect overlapping fragments. On a whole-genome human fosmid pool dataset, the haplotypes assembled using the fragments identified by FragmentCut had greater N50 length, 16.2% lower switch error rate and 35.8% lower mismatch error rate compared with two existing methods. We further demonstrate the greater accuracy of our method using two additional dilution pool datasets. FragmentCut is available from https://bansal-lab.github.io/software/FragmentCut. Contact: vibansal@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
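The N50 length used above to compare haplotype assemblies is a length-weighted median: the length L such that blocks of length at least L cover half of the total. The sketch below uses the common definition over a list of block lengths; the exact convention in the paper (for example, how block span is measured) is not stated in the abstract, so treat the details as assumptions.

```python
def n50(lengths):
    """Smallest length L such that blocks of length >= L cover half the total."""
    if not lengths:
        raise ValueError("need at least one block length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical haplotype block lengths in kilobases (illustration only).
block_n50 = n50([2, 3, 4, 5, 6])
```

A higher N50 indicates that more of the assembled sequence sits in long blocks, but it says nothing about correctness, which is why it is reported together with switch and mismatch error rates.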
Dahmcke, Christina M; Steven, Kenneth E; Larsen, Louise K; Poulsen, Asger L; Abdul-Al, Ahmad; Dahl, Christina; Guldberg, Per
2016-12-01
Retrospective studies have provided proof of principle that bladder cancer can be detected by testing for the presence of tumor DNA in urine. We have conducted a prospective blinded study to determine whether a urine-based DNA test can replace flexible cystoscopy in the initial assessment of gross hematuria. A total of 475 consecutive patients underwent standard urological examination including flexible cystoscopy and computed tomography urography, and provided urine samples immediately before (n=461) and after (n=444) cystoscopy. Urine cells were collected using a filtration device and tested for eight DNA mutation and methylation biomarkers. Clinical evaluation identified 99 (20.8%) patients with urothelial bladder tumors. With this result as a reference and based on the analysis of all urine samples, the DNA test had a sensitivity of 97.0%, a specificity of 76.9%, a positive predictive value of 52.5%, and a negative predictive value of 99.0%. In three patients with a positive urine-DNA test without clinical evidence of cancer, a tumor was detected at repeat cystoscopy within 16 mo. Our results suggest that urine-DNA testing can be used to identify a large subgroup of patients with gross hematuria in whom cystoscopy is not required. We tested the possibility of using a urine-based DNA test to check for bladder cancer in patients with visible blood in the urine. Our results show that the test efficiently detects bladder cancer and therefore may be used to greatly reduce the number of patients who would need to undergo cystoscopy. Copyright © 2016 European Association of Urology. Published by Elsevier B.V. All rights reserved.
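The reported predictive values can be related to sensitivity, specificity, and tumor prevalence through Bayes' rule. The sketch below plugs in the sensitivity (97.0%), specificity (76.9%), and patient-level prevalence (20.8%) quoted above. The paper's own values were computed over all urine samples rather than per patient, so only approximate agreement with the reported 52.5% and 99.0% should be expected; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures quoted in the abstract; prevalence is 99 of 475 patients.
ppv, npv = predictive_values(sensitivity=0.970, specificity=0.769,
                             prevalence=0.208)
```

The calculation also shows why the negative predictive value is the clinically relevant figure here: with high sensitivity and moderate prevalence, a negative test leaves very little residual probability of a tumor, even though many positive tests are false alarms.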
NASA Technical Reports Server (NTRS)
Kretsinger, R. H.; Nakayama, S.
1993-01-01
In the previous three reports in this series we demonstrated that the EF-hand family of proteins evolved by a complex pattern of gene duplication, transposition, and splicing. The dendrograms based on exon sequences are nearly identical to those based on protein sequences for troponin C, the essential light chain myosin, the regulatory light chain, and calpain. This validates both the computational methods and the dendrograms for these subfamilies. The proposal of congruence for calmodulin, troponin C, essential light chain, and regulatory light chain was confirmed. There are, however, significant differences in the calmodulin dendrograms computed from DNA and from protein sequences. In this study we find that introns are distributed throughout the EF-hand domain and the interdomain regions. Further, dendrograms based on intron type and distribution bear little resemblance to those based on protein or on DNA sequences. We conclude that introns are inserted, and probably deleted, with relatively high frequency. Further, in the EF-hand family exons do not correspond to structural domains and exon shuffling played little if any role in the evolution of this widely distributed homolog family. Calmodulin has had a turbulent evolution. Its dendrograms based on protein sequence, exon sequence, 3'-tail sequence, intron sequences, and intron positions all show significant differences.
NASA Astrophysics Data System (ADS)
Smith, Jarrod Anson
2D homonuclear 1H NMR methods and restrained molecular dynamics (rMD) calculations have been applied to determining the three-dimensional structures of DNA and minor groove-binding ligand-DNA complexes in solution. The structure of the DNA decamer sequence d(GCGTTAACGC)2 has been solved both with a distance-based rMD protocol and an NOE relaxation matrix backcalculation-based protocol in order to probe the relative merits of the different refinement methods. In addition, three minor groove binding ligand-DNA complexes have been examined. The solution structure of the oligosaccharide moiety of the antitumor DNA scission agent calicheamicin γ1I has been determined in complex with a decamer duplex containing its high affinity 5'-TCCT-3' binding sequence. The structure of the complex reinforces the belief that the oligosaccharide moiety is responsible for the sequence selective minor-groove binding activity of the agent, and critical intermolecular contacts are revealed. The solution structures of both the (+) and (-) enantiomers of the minor groove binding DNA alkylating agent duocarmycin SA have been determined in covalent complex with the undecamer DNA duplex d(GACTAATTGTC).d(GACAATTAGTC). The results support the proposal that the alkylation activity of the duocarmycin antitumor antibiotics is catalyzed by a binding-induced conformational change in the ligand which activates the cyclopropyl group for reaction with the DNA. Comparisons between the structures of the two enantiomers covalently bound to the same DNA sequence at the same 5'-AATTA-3' site have provided insight into the binding orientation and site selectivity, as well as the relative rates of reactivity of these two agents.
Hummer, G; García, A E; Soumpasis, D M
1995-01-01
A computationally efficient method to describe the organization of water around solvated biomolecules is presented. It is based on a statistical mechanical expression for the water-density distribution in terms of particle correlation functions. The method is applied to analyze the hydration of small nucleic acid molecules in the crystal environment, for which high-resolution x-ray crystal structures have been reported. Results for RNA [r(ApU).r(ApU)] and DNA [d(CpG).d(CpG) in Z form and with parallel strand orientation] and for DNA-drug complexes [d(CpG).d(CpG) with the drug proflavine intercalated] are described. A detailed comparison of theoretical and experimental data shows positional agreement for the experimentally observed water sites. The presented method can be used for refinement of the water structure in x-ray crystallography, hydration analysis of nuclear magnetic resonance structures, and theoretical modeling of biological macromolecules such as molecular docking studies. The speed of the computations allows hydration analyses of molecules of almost arbitrary size (tRNA, protein-nucleic acid complexes, etc.) in the crystal environment and in aqueous solution. PMID:7542034
Validation of DNA-based identification software by computation of pedigree likelihood ratios.
Slooten, K
2011-08-01
Disaster victim identification (DVI) can be aided by DNA-evidence, by comparing the DNA-profiles of unidentified individuals with those of surviving relatives. The DNA-evidence is used optimally when such a comparison is done by calculating the appropriate likelihood ratios. Though conceptually simple, the calculations can be quite involved, especially with large pedigrees, precise mutation models, etc. In this article we describe a series of test cases designed to check whether software designed to calculate such likelihood ratios computes them correctly. The cases include both simple and more complicated pedigrees, including inbred ones. We show how to calculate the likelihood ratio numerically and algebraically, including a general mutation model and the possibility of allelic dropout. In Appendix A we show how to derive such algebraic expressions mathematically. We have set up these cases to validate new software, called Bonaparte, which performs pedigree likelihood ratio calculations in a DVI context. Bonaparte has been developed by SNN Nijmegen (The Netherlands) for the Netherlands Forensic Institute (NFI). It is available free of charge for non-commercial purposes (see www.dnadvi.nl for details). Commercial licenses can also be obtained. The software uses Bayesian networks and the junction tree algorithm to perform its calculations. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
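As a rough illustration of the kind of likelihood ratio discussed above, the sketch below handles the simplest possible pedigree: one parent and one child at a single locus. It is a deliberately reduced example, not Bonaparte's method: it assumes Hardy-Weinberg proportions, no mutation and no allelic dropout, and does not use a Bayesian network. The function name and allele frequencies are hypothetical.

```python
def parent_child_lr(parent, child, freq):
    """Likelihood ratio for 'parent-child' versus 'unrelated' at one locus.

    parent, child: genotypes as 2-tuples of allele labels.
    freq: population allele frequencies (assumed known).
    """
    a, b = child
    # Numerator: the parent transmits each allele with probability 1/2 and
    # the child's other allele is drawn from the population.
    numerator = 0.0
    for transmitted in parent:
        if a == b:
            numerator += 0.5 * (freq[a] if transmitted == a else 0.0)
        else:
            numerator += 0.5 * (
                (freq[b] if transmitted == a else 0.0)
                + (freq[a] if transmitted == b else 0.0)
            )
    # Denominator: child genotype probability in the population (HWE).
    denominator = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
    return numerator / denominator

freq = {"A": 0.1, "B": 0.2, "C": 0.7}  # illustrative frequencies only
print(parent_child_lr(("A", "B"), ("A", "A"), freq))
```

For multi-locus profiles with independent loci, the per-locus ratios would be multiplied; mutation and dropout models change the numerator and are exactly what makes the full calculation involved.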
Simulations Meet Experiment to Reveal New Insights into DNA Intrinsic Mechanics
Ben Imeddourene, Akli; Elbahnsi, Ahmad; Guéroult, Marc; Oguey, Christophe; Foloppe, Nicolas; Hartmann, Brigitte
2015-01-01
The accurate prediction of the structure and dynamics of DNA remains a major challenge in computational biology due to the dearth of precise experimental information on DNA free in solution and limitations in the DNA force-fields underpinning the simulations. A new generation of force-fields has been developed to better represent the sequence-dependent B-DNA intrinsic mechanics, in particular with respect to the BI ↔ BII backbone equilibrium, which is essential to understand the B-DNA properties. Here, the performance of MD simulations with the newly updated force-fields Parmbsc0εζOLI and CHARMM36 was tested against a large ensemble of recent NMR data collected on four DNA dodecamers involved in nucleosome positioning. We find impressive progress towards a coherent, realistic representation of B-DNA in solution, despite residual shortcomings. This improved representation allows new and deeper interpretation of the experimental observables, including regarding the behavior of facing phosphate groups in complementary dinucleotides, and their modulation by the sequence. It also provides the opportunity to extensively revisit and refine the coupling between backbone states and inter base pair parameters, which emerges as a common theme across all the complementary dinucleotides. In sum, the global agreement between simulations and experiment reveals new aspects of intrinsic DNA mechanics, a key component of DNA-protein recognition. PMID:26657165
Poltev, V I; Anisimov, V M; Sanchez, C; Deriabina, A; Gonzalez, E; Garcia, D; Rivas, F; Polteva, N A
2016-01-01
It is generally accepted that the important characteristic features of the Watson-Crick duplex originate from the molecular structure of its subunits. However, it remains to be elucidated which properties of each subunit are responsible for the significant characteristic features of the DNA structure. The computations of desoxydinucleoside monophosphates complexes with Na-ions using density functional theory revealed a pivotal role of DNA conformational properties of single-chain minimal fragments in the development of unique features of the Watson-Crick duplex. We found that directionality of the sugar-phosphate backbone and the preferable ranges of its torsion angles, combined with the difference between purines and pyrimidines in ring bases, define the dependence of three-dimensional structure of the Watson-Crick duplex on nucleotide base sequence. In this work, we extended these density functional theory computations to the minimal fragments of DNA duplex, complementary desoxydinucleoside monophosphates complexes with Na-ions. Using several computational methods and various functionals, we performed a search for energy minima of BI-conformation for complementary desoxydinucleoside monophosphates complexes with different nucleoside sequences. Two sequences are optimized using the ab initio method at the MP2/6-31++G** level of theory. The analysis of torsion angles, sugar ring puckering and mutual base positions of optimized structures demonstrates that the conformational characteristic features of complementary desoxydinucleoside monophosphates complexes with Na-ions remain within BI ranges and become closer to the corresponding characteristic features of the Watson-Crick duplex crystals.
Qualitatively, the main characteristic features of each studied complementary desoxydinucleoside monophosphates complex remain invariant when different computational methods are used, although the quantitative values of some conformational parameters could vary lying within the limits typical for the corresponding family. We observe that popular functionals in density functional theory calculations lead to the overestimated distances between base pairs, while MP2 computations and the newer complex functionals produce the structures that have too close atom-atom contacts. A detailed study of some complementary desoxydinucleoside monophosphate complexes with Na-ions highlights the existence of several energy minima corresponding to BI-conformations, in other words, the complexity of the relief pattern of the potential energy surface of complementary desoxydinucleoside monophosphate complexes. This accounts for variability of conformational parameters of duplex fragments with the same base sequence. Popular molecular mechanics force fields AMBER and CHARMM reproduce most of the conformational characteristics of desoxydinucleoside monophosphates and their complementary complexes with Na-ions but fail to reproduce some details of the dependence of the Watson-Crick duplex conformation on the nucleotide sequence.
Raisali, Gholamreza; Mirzakhanian, Lalageh; Masoudi, Seyed Farhad; Semsarha, Farid
2013-01-01
In this work the number of DNA single-strand breaks (SSB) and double-strand breaks (DSB) due to direct and indirect effects of Auger electrons from incorporated (123)I and (125)I has been calculated using the Geant4-DNA toolkit. We have performed and compared the calculations for several cases, namely (125)I versus (123)I, different source positions, and direct versus indirect breaks, to study the capability of Geant4-DNA in calculating DNA damage yields. Two different simple geometries of a 41 base pair segment of B-DNA have been simulated. (123)I was placed in (123)IdUrd, and three different locations were considered for (125)I. The results showed that the simpler geometry is sufficient for direct break calculations while indirect damage yield is more sensitive to the helical shape of DNA. For (123)I Auger electrons, the average number of DSB due to the direct hits is almost twice the DSB due to the indirect hits. Furthermore, a comparison between the average number of SSB or DSB caused by Auger electrons of (125)I and (123)I in (125)IdUrd and (123)IdUrd shows that (125)I is 1.5 times more effective than (123)I per decay. The results are in reasonable agreement with previous experimental and theoretical results, which shows the applicability of the Geant4-DNA toolkit to nanodosimetry calculations. The toolkit benefits from open-source accessibility, and the DNA models used in this work save computational time. Also, the results showed that the simpler geometry is suitable for direct break calculations, while for the indirect damage yield, the more precise model is preferred.
Systolic array IC for genetic computation
NASA Technical Reports Server (NTRS)
Anderson, D.
1991-01-01
Measuring similarities between large sequences of genetic information is a formidable task requiring enormous amounts of computer time. Geneticists claim that nearly two months of CRAY-2 time are required to run a single comparison of the known database against the new bases that will be found this year, and more than a CRAY-2 year for next year's genetic discoveries, and so on. The DNA IC, designed at HP-ICBD in cooperation with the California Institute of Technology and the Jet Propulsion Laboratory, is being implemented in order to move the task of genetic comparison onto workstations and personal computers, while vastly improving performance. The chip is a systolic (pumped) array comprised of 16 processors, control logic, and global RAM, totaling 400,000 FETS. At 12 MHz, each chip performs 2.7 billion 16 bit operations per second. Using 35 of these chips in series on one PC board (performing nearly 100 billion operations per second), a sequence of 560 bases can be compared against the eventual total genome of 3 billion bases, in minutes--on a personal computer. While the designed purpose of the DNA chip is for genetic research, other disciplines requiring similarity measurements between strings of 7 bit encoded data could make use of this chip as well. Cryptography and speech recognition are two examples. A mix of full custom design and standard cells, in CMOS34, were used to achieve these goals. Innovative test methods were developed to enhance controllability and observability in the array. This paper describes these techniques as well as the chip's functionality. This chip was designed in the 1989-90 timeframe.
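The throughput figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers from the abstract, plus two labelled assumptions that the abstract does not state: that each processor holds one query base, and that the array consumes one database base per clock cycle.

```python
CHIPS_PER_BOARD = 35
PROCESSORS_PER_CHIP = 16
CLOCK_HZ = 12e6
OPS_PER_CHIP = 2.7e9      # 16-bit operations per second, as quoted
GENOME_BASES = 3e9

# Assumption: one query base per processor, so the board holds 560 bases.
query_length = CHIPS_PER_BOARD * PROCESSORS_PER_CHIP

# Aggregate rate for the board ("nearly 100 billion operations per second").
board_ops = CHIPS_PER_BOARD * OPS_PER_CHIP

# Assumption: one database base enters the systolic array per clock cycle.
stream_seconds = GENOME_BASES / CLOCK_HZ

print(query_length, f"{board_ops:.3g} ops/s", f"{stream_seconds / 60:.1f} min")
```

Under these assumptions the 560-base query length and the "in minutes" claim for a 3-billion-base genome are mutually consistent with the stated clock rate and chip count.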
A Novel Fast and Secure Approach for Voice Encryption Based on DNA Computing
NASA Astrophysics Data System (ADS)
Kakaei Kate, Hamidreza; Razmara, Jafar; Isazadeh, Ayaz
2018-06-01
Today, in the world of information communication, voice information is of particular importance. One way to protect voice data from attacks is voice encryption. Encryption algorithms use various techniques such as hashing, chaos-based methods, mixing, and many others. In this paper, an algorithm is proposed for voice encryption based on three different schemes to increase the flexibility and strength of the algorithm. The proposed algorithm uses an innovative encoding scheme, the DNA encryption technique and a permutation function to provide a secure and fast solution for voice encryption. The algorithm is evaluated based on various measures including signal to noise ratio, peak signal to noise ratio, correlation coefficient, signal similarity and signal frequency content. The results demonstrate the applicability of the proposed method to secure and fast encryption of voice files.
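The abstract does not specify its encoding scheme, so the following is only a generic sketch of the usual first step in DNA-based ciphers: mapping each pair of bits to one nucleotide. The mapping order (`A`=00, `C`=01, `G`=10, `T`=11) and the function names are assumptions, and this step alone provides no security; it is an encoding, not an encryption.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11

def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )

def dna_to_bytes(strand: str) -> bytes:
    """Inverse of bytes_to_dna; expects a length that is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)

sample = bytes([0x1B, 0xFF, 0x00])  # stand-in for a few audio sample bytes
strand = bytes_to_dna(sample)
print(strand, dna_to_bytes(strand) == sample)
```

A full scheme of the kind described would follow this encoding with keyed operations on the nucleotide string and a permutation step.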
Zhao, Peiwen; Bu, Yuxiang
2016-01-14
In this work, we computationally design radical nucleobases which possess improved electronic properties, especially diradical properties through introducing a cyclopentadiene radical. We predict that the detailed electromagnetic features of base assemblies are based on the orientation of the extra five-membered cyclopentadiene ring. Broken symmetry DFT calculations take into account the relevant structures and properties. Our results reveal that both the radicalized DNA bases and the base pairs formed when they combine with their counterparts remain stable and display larger spin delocalization. The mode of embedding the cyclopentadiene free radical in the structures has some influence on the degree of π-conjugation, which results in various diradical characteristics. Single-layered radical base pairs all have an open-shell singlet ground state, but the energy difference between singlet and triplet is not significant. For two-layered radical base pairs, the situation is more complex. All of them have an open-shell state as their ground state, including an open-shell singlet state and an open-shell triplet state. That is, the majority of radical base pairs possess anti-ferromagnetic or ferromagnetic characteristics. We present here a more in-depth discussion and analyses to study the magnetic characteristics of radical bases and base pairs. As an important factor, two-layered radical base pairs also have been carefully analyzed. We hope that all the measurements and results presented here will stimulate further detailed insights into the related mechanisms in modified DNA bases and the design of better ring-expanded DNA magnetic materials.
Genome wide approaches to identify protein-DNA interactions.
Ma, Tao; Ye, Zhenqing; Wang, Liguo
2018-05-29
Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experimental procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to a single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms, so each will inherently include its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore requires careful consideration. Moreover, most programs only have a command line interface, so their installation and usage require basic computational expertise in Unix/Linux. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Balakrishnan, C; Subha, L; Neelakantan, M A; Mariappan, S S
2015-11-05
A propargyl arms containing Schiff base (L) was synthesized by the condensation of 1-[2-hydroxy-4-(prop-2-yn-1-yloxy)phenyl]ethanone with trans-1,2-diaminocyclohexane. The structure of L was characterized by IR, (1)H NMR, (13)C NMR and UV-Vis spectroscopy and by single crystal X-ray diffraction analysis. The UV-Visible spectral behavior of L in different solvents exhibits positive solvatochromism. Density functional calculation of the L in gas phase was performed by using DFT (B3LYP) method with 6-31G basis set. The computed vibrational frequencies and NMR signals of L were compared with the experimental data. Tautomeric stability study inferred that the enolimine is more stable than the ketoamine form. The charge delocalization has been analyzed using natural bond orbital (NBO) analysis. Electronic absorption and emission spectral studies were used to study the binding of L with CT-DNA. The molecular docking was done to identify the interaction of L with A-DNA and B-DNA. Copyright © 2015 Elsevier B.V. All rights reserved.
Mukhina, T M; Nikolaienko, T Yu
2015-01-01
Recent studies on Escherichia coli bacteria cultivation, in which DNA thymine was replaced with 5-chlorouracil, have refreshed the problem of understanding the changes to the physical properties of DNA monomers that result from chemical modifications. These studies have shown that the replacement did not affect the normal activities and division of the bacteria, but significantly reduced their life span. In this paper, a comparative computational analysis was carried out of a set of 687 possible conformers of the natural monomeric DNA unit (2'-deoxyribonucleotide thymidine monophosphate) and 660 conformers of 5-chloro-2'-deoxyuridine monophosphate, a similar molecule in which the natural nitrogenous base thymine is substituted with 5-chlorouracil. Structures of stable conformers of the modified deoxyribonucleotide have been obtained, and the physical factors that determine their variation from the conformers of the unmodified molecule have been analyzed. A comparative analysis of the elastic properties of conformers of the investigated molecules and the non-covalent interactions present in them was conducted. The results can be used for planning experiments on the synthesis of artificial DNA suitable for incorporation into living organisms.
Quantitative analysis and prediction of G-quadruplex forming sequences in double-stranded DNA
Kim, Minji; Kreig, Alex; Lee, Chun-Ying; Rube, H. Tomas; Calvert, Jacob; Song, Jun S.; Myong, Sua
2016-01-01
G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome. We present a systematic and quantitative analysis of GQ folding propensity on a large set of 438 GQ forming sequences in double-stranded DNA by integrating fluorescence measurement, single-molecule imaging and computational modeling. We find that short minimum loop length and the thymine base are two main factors that lead to high GQ folding propensity. Linear and Gaussian process regression models further validate that the GQ folding potential can be predicted with high accuracy based on the loop length distribution and the nucleotide content of the loop sequences. Our study provides important new parameters that can inform the evaluation and classification of putative GQ sequences in the human genome. PMID:27095201
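The two predictors highlighted above, minimum loop length and thymine content of the loops, can be extracted from a sequence with a short script. The sketch below assumes a commonly used putative-quadruplex motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); that motif, the function name and the feature names are assumptions for illustration and are not taken from the study, whose regression models are not reproduced here.

```python
import re

# Assumed motif: four G-runs (>= 3 G) separated by three loops of 1-7 nt.
# Lazy quantifiers make the search prefer the shortest possible loops.
PQS = re.compile(
    r"G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}"
)

def loop_features(seq):
    """Return loop sequences and two simple features, or None if no motif."""
    match = PQS.search(seq.upper())
    if match is None:
        return None
    loops = match.groups()
    loop_bases = "".join(loops)
    return {
        "loops": loops,
        "min_loop_length": min(len(loop) for loop in loops),
        "thymine_fraction": loop_bases.count("T") / len(loop_bases),
    }

print(loop_features("AAGGGTGGGTTGGGTAGGGAA"))
```

Features of this kind could serve as inputs to a linear or Gaussian process regression of folding propensity, given measured training data.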
A structural-alphabet-based strategy for finding structural motifs across protein families
Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay
2010-01-01
Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
Scaling up digital circuit computation with DNA strand displacement cascades.
Qian, Lulu; Winfree, Erik
2011-06-03
To construct sophisticated biochemical circuits from scratch, one needs to understand how simple the building blocks can be and how robustly such circuits can scale up. Using a simple DNA reaction mechanism based on a reversible strand displacement process, we experimentally demonstrated several digital logic circuits, culminating in a four-bit square-root circuit that comprises 130 DNA strands. These multilayer circuits include thresholding and catalysis within every logical operation to perform digital signal restoration, which enables fast and reliable function in large circuits with roughly constant switching time and linear signal propagation delays. The design naturally incorporates other crucial elements for large-scale circuitry, such as general debugging tools, parallel circuit preparation, and an abstraction hierarchy supported by an automated circuit compiler.
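To make the target function of the four-bit square-root circuit concrete, the sketch below computes the integer square root of a 4-bit input using only AND, OR and NOT. The Boolean expressions were derived here from the truth table and are an assumption for illustration; they are not the gate layout of the published DNA circuit, and the function names are hypothetical.

```python
import math

def sqrt4(x3, x2, x1, x0):
    """Floor of the square root of the 4-bit number x3 x2 x1 x0, as (y1, y0)."""
    # High output bit: the root is 2 or 3 exactly when the input is >= 4.
    y1 = x3 or x2
    # Low output bit: the root is odd for inputs 1-3 and 9-15.
    y0 = (not x3 and not x2 and (x1 or x0)) or (x3 and (x2 or x1 or x0))
    return int(y1), int(y0)

def check_all():
    """Compare the gate-level result with math.isqrt for all 16 inputs."""
    for n in range(16):
        bits = [bool((n >> k) & 1) for k in (3, 2, 1, 0)]
        y1, y0 = sqrt4(*bits)
        if 2 * y1 + y0 != math.isqrt(n):
            return False
    return True

print(check_all())
```

In a strand displacement implementation each logical operation would additionally need the thresholding and catalysis described above to restore signal levels between layers.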
NASA Astrophysics Data System (ADS)
Rajina, S. R.; Sudhi, Geethu; Austin, P.; Praveen, S. G.; Xavier, T. S.; Kenny, Peter T. M.; Binoy, J.
2018-05-01
The interaction of a drug with DNA and BSA plays a major role in studying anticancer activity and drug transport properties, and can be effectively investigated using vibrational spectroscopy, UV-visible spectroscopy and fluorescence spectroscopy. The present work reports the structural features of N-(6-ferrocenyl-2-naphthoyl)-gamma-amino butyric acid methyl ester (FNGABME) based on FTIR and FT-Raman spectroscopy. Absorption and fluorescence spectroscopic methods were used to study the efficiency of the interaction of FNGABME with BSA and DNA, and molecular docking was performed computationally to validate the results, which show that the title compound may exhibit inhibitory activity against cancer cells.
Guo, Yan; Cai, Qiuyin; Samuels, David C; Ye, Fei; Long, Jirong; Li, Chung-I; Winther, Jeanette F; Tawn, E Janet; Stovall, Marilyn; Lähteenmäki, Päivi; Malila, Nea; Levy, Shawn; Shaffer, Christian; Shyr, Yu; Shu, Xiao-Ou; Boice, John D
2012-05-15
The human mitochondrial genome has an exclusively maternal mode of inheritance. Mitochondrial DNA (mtDNA) is particularly vulnerable to environmental insults due in part to an underdeveloped DNA repair system, limited to base excision and homologous recombination repair. Radiation exposure to the ovaries may cause mtDNA mutations in oocytes, which may in turn be transmitted to offspring. We hypothesized that the children of female cancer survivors who received radiation therapy may have an increased rate of mtDNA heteroplasmy mutations, which conceivably could increase their risk of developing cancer and other diseases. We evaluated 44 DNA blood samples from 17 Danish and 1 Finnish families (18 mothers and 26 children). All mothers had been treated for cancer as children and radiation doses to their ovaries were determined based on medical records and computational models. DNA samples were sequenced for the entire mitochondrial genome using the Illumina GAII system. Mother's age at sample collection was positively correlated with mtDNA heteroplasmy mutations. There was evidence of heteroplasmy inheritance in that 9 of the 18 families had at least one child who inherited at least one heteroplasmy site from his or her mother. No significant difference in single nucleotide polymorphisms between mother and offspring, however, was observed. Radiation therapy dose to ovaries also was not significantly associated with the heteroplasmy mutation rate among mothers and children. No evidence was found that radiotherapy for pediatric cancer is associated with the mitochondrial genome mutation rate in female cancer survivors and their children. Copyright © 2012 Elsevier B.V. All rights reserved.
ParticleCall: A particle filter for base calling in next-generation sequencing systems
2012-01-01
Background: Next-generation sequencing systems are capable of rapid and cost-effective DNA sequencing, thus enabling routine sequencing tasks and taking us one step closer to personalized medicine. Accuracy and lengths of their reads, however, are yet to surpass those provided by the conventional Sanger sequencing method. This motivates the search for computationally efficient algorithms capable of reliable and accurate detection of the order of nucleotides in short DNA fragments from the acquired data. Results: In this paper, we consider Illumina’s sequencing-by-synthesis platform which relies on reversible terminator chemistry and describe the acquired signal by reformulating its mathematical model as a Hidden Markov Model. Relying on this model and sequential Monte Carlo methods, we develop a parameter estimation and base calling scheme called ParticleCall. ParticleCall is tested on a data set obtained by sequencing phiX174 bacteriophage using Illumina’s Genome Analyzer II. The results show that the developed base calling scheme is significantly more computationally efficient than the best performing unsupervised method currently available, while achieving the same accuracy. Conclusions: The proposed ParticleCall provides more accurate calls than Illumina’s base calling algorithm, Bustard. At the same time, ParticleCall is significantly more computationally efficient than other recent schemes with similar performance, rendering it more feasible for high-throughput sequencing data analysis. Improvement of base calling accuracy will have immediate beneficial effects on the performance of downstream applications such as SNP and genotype calling. ParticleCall is freely available at https://sourceforge.net/projects/particlecall. PMID:22776067
DNA Compass: a secure, client-side site for navigating personal genetic information
Curnin, Charles; Gordon, Assaf; Erlich, Yaniv
2017-01-01
Motivation: Millions of individuals have access to raw genomic data using direct-to-consumer companies. The advent of large-scale sequencing projects, such as the Precision Medicine Initiative, will further increase the number of individuals with access to their own genomic information. However, querying genomic data requires a computer terminal and computational skill to analyze the data, an impediment for the general public. Results: DNA Compass is a website designed to empower the public by enabling simple navigation of personal genomic data. Users can query the status of their genomic variants for over 1658 markers or tens of millions of documented single nucleotide polymorphisms (SNPs). DNA Compass presents the relevant genotypes of the user side-by-side with explanatory scientific resources. The genotype data never leaves the user’s computer, a feature that provides improved security and performance. More than 12 000 unique users, mainly from the general genetic genealogy community, have already used DNA Compass, demonstrating its utility. Availability and Implementation: DNA Compass is freely available on https://compass.dna.land. Contact: yaniv@cs.columbia.edu PMID:28334237
CMG-biotools, a free workbench for basic comparative microbial genomics.
Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David
2013-01-01
Today, there are more than a hundred times as many sequenced prokaryotic genomes as were available in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analyses without much training.
Lauerman, Lloyd H
2004-12-01
Since the discovery of the polymerase chain reaction (PCR) 20 years ago, an avalanche of scientific publications has reported major developments and changes in specialized equipment, reagents, sample preparation, computer programs and techniques, generated through business, government and university research. The requirement for genetic sequences for primer selection and validation has been greatly facilitated by the development of new sequencing techniques, machines and computer programs. Genetic libraries, such as GenBank, EMBL and DDBJ, continue to accumulate a wealth of genetic sequence information for the development and validation of molecular-based diagnostic procedures concerning human and veterinary disease agents. The mechanization of various aspects of the PCR assay, such as robotics, microfluidics and nanotechnology, has made possible the rapid advancement of new procedures. Real-time PCR, DNA microarray and DNA chips utilize these newer techniques in conjunction with computers and computer programs. Instruments for hand-held PCR assays are being developed. The PCR and reverse transcription-PCR (RT-PCR) assays have greatly accelerated the speed and accuracy of diagnoses of human and animal disease, especially of the infectious agents that are difficult to isolate or demonstrate. The PCR has made it possible to genetically characterize a microbial isolate inexpensively and rapidly for identification, typing and epidemiological comparison.
A DNA-Inspired Encryption Methodology for Secure, Mobile Ad Hoc Networks
NASA Technical Reports Server (NTRS)
Shaw, Harry
2012-01-01
Users are pushing for greater physical mobility with their network and Internet access. Mobile ad hoc networks (MANET) can provide an efficient mobile network architecture, but security is a key concern. A figure summarizes differences in the state of network security for MANET and fixed networks. MANETs require the ability to distinguish trusted peers, and tolerate the ingress/egress of nodes on an unscheduled basis. Because the networks by their very nature are mobile and self-organizing, use of a Public Key Infrastructure (PKI), X.509 certificates, RSA, and nonce exchanges becomes problematic if the ideal of MANET is to be achieved. Molecular biology models such as DNA evolution can provide a basis for a proprietary security architecture that achieves high degrees of diffusion and confusion, and resistance to cryptanalysis. A proprietary encryption mechanism was developed that uses the principles of DNA replication and steganography (hidden word cryptography) for confidentiality and authentication. The foundation of the approach includes organization of coded words and messages using base pairs organized into genes, an expandable genome consisting of DNA-based chromosome keys, and a DNA-based message encoding, replication, and evolution and fitness. In evolutionary computing, a fitness algorithm determines whether candidate solutions, in this case encrypted messages, are sufficiently encrypted to be transmitted. The technology provides a mechanism for confidential electronic traffic over a MANET without a PKI for authenticating users.
Prediction of TF target sites based on atomistic models of protein-DNA complexes
Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno
2008-01-01
Background The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190
Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters
Wozniak, Christopher E.; Hughes, Kelly T.
2008-01-01
Summary Computational searches for DNA binding sites often utilize consensus sequences. These search models make assumptions that the frequency of a base pair in an alignment relates to the base pair’s importance in binding and presume that base pairs contribute independently to the overall interaction with the DNA binding protein. These two assumptions have generally been found to be accurate for DNA binding sites. However, these assumptions are often not satisfied for promoters, which are involved in additional steps in transcription initiation after RNA polymerase has bound to the DNA. To test these assumptions for the flagellar regulatory hierarchy, class 2 and class 3 flagellar promoters were randomly mutagenized in Salmonella. Important positions were then saturated for mutagenesis and compared to scores calculated from the consensus sequence. Double mutants were constructed to determine how mutations combined for each promoter type. Mutations in the binding site for FlhD4C2, the activator of class 2 promoters, better satisfied the assumptions for the binding model than did mutations in the class 3 promoter, which is recognized by the σ28 transcription factor. These in vivo results indicate that the activator sites within flagellar promoters can be modeled using simple assumptions but that the DNA sequences recognized by the flagellar sigma factor require more complex models. PMID:18486950
Wormlike Chain Theory and Bending of Short DNA
NASA Astrophysics Data System (ADS)
Mazur, Alexey K.
2007-05-01
The probability distributions for bending angles in double helical DNA obtained in all-atom molecular dynamics simulations are compared with theoretical predictions. The computed distributions remarkably agree with the wormlike chain theory and qualitatively differ from predictions of the subelastic chain model. The computed data exhibit only small anomalies in the apparent flexibility of short DNA and cannot account for the recently reported AFM data. It is possible that the current atomistic DNA models miss some essential mechanisms of DNA bending on intermediate length scales. Analysis of bent DNA structures reveals, however, that the bending motion is structurally heterogeneous and directionally anisotropic on the length scales where the experimental anomalies were detected. These effects are essential for interpretation of the experimental data and they also can be responsible for the apparent discrepancy.
A Reversible DNA Logic Gate Platform Operated by One- and Two-Photon Excitations.
Tam, Dick Yan; Dai, Ziwen; Chan, Miu Shan; Liu, Ling Sum; Cheung, Man Ching; Bolze, Frederic; Tin, Chung; Lo, Pik Kwan
2016-01-04
We demonstrate the use of two different wavelength ranges of excitation light as inputs to remotely trigger the responses of the self-assembled DNA devices (D-OR). As an important feature of this device, the dependence of the readout fluorescent signals on the two external inputs, UV excitation for 1 min and/or near infrared irradiation (NIR) at 800 nm fs laser pulses, can mimic the function of signal communication in OR logic gates. The devices could easily be reset to their initial state. Furthermore, these DNA devices exhibit efficient cellular uptake, low cytotoxicity, and high bio-stability in different cell lines. They are considered as the first example of a photo-responsive DNA logic gate system, as well as a biocompatible, multi-wavelength excited system in response to UV and NIR. This is an important step to explore the concept of photo-responsive DNA-based systems as versatile tools in DNA computing, display devices, optical communication, and biology. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Simulation studies of DNA at the nanoscale: Interactions with proteins, polycations, and surfaces
NASA Astrophysics Data System (ADS)
Elder, Robert M.
Understanding the nanoscale interactions of DNA, a multifunctional biopolymer with sequence-dependent properties, with other biological and synthetic substrates and molecules is essential to advancing these technologies. This doctoral thesis research is aimed at understanding the thermodynamics and molecular-level structure when DNA interacts with proteins, polycations, and functionalized surfaces. First, we investigate the ability of a DNA damage recognition protein (HMGB1a) to bind to anti-cancer drug-induced DNA damage, seeking to explain how HMGB1a differentiates between the drugs in vivo. Using atomistic molecular dynamics simulations, we show that the structure of the drug-DNA molecule exhibits drug- and base sequence-dependence that explains some of the experimentally observed differential recognition of the drugs in various sequence contexts. Then, we show how steric hindrance from the drug decreases the deformability of the drug-DNA molecule, which decreases recognition by the protein, a concept that can be applied to rational drug design. Second, we study how polycation architecture and chemistry affect polycation-DNA binding so as to design optimal polycations for high efficiency gene (DNA) delivery. Using a multiscale computational approach involving atomistic and coarse-grained simulations, we examine how rearranging polylysine from a linear to a grafted architecture, and several aspects of the grafted architecture, affect polycation-DNA binding and the structure of polycation-DNA complexes. Next, going beyond lysine we examine how oligopeptide chemistry and sequence in the grafted architecture affects polycation-DNA binding and find that strategic placement of hydrophobic peptides might be used to tailor binding strength. Third, we study the adsorption and conformations of single-stranded DNA (an amphiphilic biopolymer) on model hydrophilic and hydrophobic surfaces. 
Short ssDNA oligomers adsorb to both surfaces with similar strength, with the strength of adsorption to the hydrophobic surface depending on the composition of the DNA strands, i.e. purine or pyrimidine bases. Additionally, DNA-surface and DNA-water interactions near the surfaces govern the adsorption. For longer ssDNA oligomers, the effects of surface chemistry and temperature on ssDNA conformations are rather small, but either the hydrophilic surface or increased temperature favor slightly more compact conformations due to energetic and entropic effects, respectively.
Bhattacharjee, Kaushik; Banerjee, Subhro; Joshi, Santa Ram
2012-01-01
Isolation and characterization of actinomycetes from soil samples from an altitudinal gradient of North-East India were investigated for computational RNomics based phylogeny. A total of 52 diverse isolates of Streptomyces were isolated from the soil samples on four different media, and from these 6 isolates were selected on the basis of cultural characteristics, microscopic and biochemical studies. Sequencing of 16S rDNA of the selected isolates identified them as belonging to six different species of Streptomyces. Molecular morphometric and physico-kinetic analyses of 16S rRNA sequences were performed to predict the diversity of the genus. The computational RNomics study revealed the significance of the structural RNA based phylogenetic analysis in a relatively diverse group of Streptomyces. PMID:22829729
4P: fast computing of population genetics statistics from large DNA polymorphism panels
Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio
2015-01-01
Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations. PMID:25628874
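The joint frequency spectrum mentioned in this abstract can be illustrated with a minimal sketch. This is not 4P's implementation or input format; the function name and the data layout (one list of 0/1 alleles per site, 1 marking the derived allele) are assumptions made for illustration only.

```python
def joint_frequency_spectrum(pop_a, pop_b):
    """Count sites by derived-allele count in two populations.

    pop_a, pop_b: lists of sites (hypothetical layout); each site is a list
    of 0/1 alleles, one per sampled chromosome, with 1 = derived allele.
    Returns a matrix jfs where jfs[i][j] is the number of sites carrying
    i derived alleles in population A and j in population B.
    """
    n_a, n_b = len(pop_a[0]), len(pop_b[0])
    jfs = [[0] * (n_b + 1) for _ in range(n_a + 1)]
    for site_a, site_b in zip(pop_a, pop_b):
        jfs[sum(site_a)][sum(site_b)] += 1
    return jfs


# Three sites, 2 chromosomes sampled in A and 3 in B.
pop_a = [[0, 1], [1, 1], [0, 0]]
pop_b = [[1, 0, 0], [1, 1, 1], [0, 0, 0]]
jfs = joint_frequency_spectrum(pop_a, pop_b)
```

Each site is processed independently, which is why this kind of statistic lends itself to the parallel computation the abstract describes.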
Probabilistic simple sticker systems
NASA Astrophysics Data System (ADS)
Selvarajoo, Mathuri; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod
2017-04-01
A model for DNA computing using the recombination behavior of DNA molecules, known as a sticker system, was introduced by L. Kari, G. Paun, G. Rozenberg, A. Salomaa, and S. Yu in the paper "DNA computing, sticker systems and universality", Acta Informatica, vol. 35, pp. 401-420, 1998. A sticker system uses the Watson-Crick complementary feature of DNA molecules: starting from the incomplete double stranded sequences, sticking operations are applied iteratively until a complete double stranded sequence is obtained. It is known that sticker systems with finite sets of axioms and sticker rules generate only regular languages. Hence, different types of restrictions have been considered to increase the computational power of sticker systems. Recently, a variant of restricted sticker systems, called probabilistic sticker systems, has been introduced [4]. In this variant, the probabilities are initially associated with the axioms, and the probability of a generated string is computed by multiplying the probabilities of all occurrences of the initial strings in the computation of the string. Strings for the language are selected according to some probabilistic requirements. In this paper, we study fundamental properties of probabilistic simple sticker systems. We prove that the probabilistic enhancement increases the computational power of simple sticker systems.
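The probability rule stated in this abstract (multiply the probabilities of every axiom occurrence used in a derivation) can be sketched as follows. The axiom names, the probability values, and the threshold-style selection are hypothetical; the abstract only says strings are selected "according to some probabilistic requirements", so the cut-off below is one assumed form of such a requirement.

```python
from fractions import Fraction
from math import prod


def string_probability(axiom_probs, axioms_used):
    """Multiply the probabilities of all axiom occurrences in a derivation."""
    return prod(axiom_probs[name] for name in axioms_used)


def select_strings(derivations, axiom_probs, threshold):
    """Keep strings whose probability exceeds a threshold (assumed criterion)."""
    return [
        string
        for string, axioms_used in derivations
        if string_probability(axiom_probs, axioms_used) > threshold
    ]


# Hypothetical axioms with exact rational probabilities.
axiom_probs = {"a1": Fraction(1, 2), "a2": Fraction(1, 4)}
p = string_probability(axiom_probs, ["a1", "a2", "a1"])  # 1/2 * 1/4 * 1/2
kept = select_strings(
    [("ab", ["a1"]), ("abab", ["a1", "a2", "a1"])],
    axiom_probs,
    Fraction(1, 8),
)
```

Exact fractions are used so that threshold comparisons are not affected by floating-point rounding.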
Use of prior odds for missing persons identifications.
Budowle, Bruce; Ge, Jianye; Chakraborty, Ranajit; Gill-King, Harrell
2011-06-27
Identification of missing persons from mass disasters is based on evaluation of a number of variables and observations regarding the combination of features derived from these variables. DNA typing now is playing a more prominent role in the identification of human remains, and particularly so for highly decomposed and fragmented remains. The strength of genetic associations, by either direct or kinship analyses, is often quantified by calculating a likelihood ratio. The likelihood ratio can be multiplied by prior odds based on nongenetic evidence to calculate the posterior odds, that is, by applying Bayes' Theorem, to arrive at a probability of identity. For the identification of human remains, the path creating the set and intersection of variables that contribute to the prior odds needs to be appreciated and well defined. Other than considering the total number of missing persons, the forensic DNA community has been silent on specifying the elements of prior odds computations. The variables include the number of missing individuals, eyewitness accounts, anthropological features, demographics and other identifying characteristics. The assumptions, supporting data and reasoning that are used to establish a prior probability that will be combined with the genetic data need to be considered and justified. Otherwise, data may be unintentionally or intentionally manipulated to achieve a probability of identity that cannot be supported and can thus misrepresent the uncertainty with associations. The forensic DNA community needs to develop guidelines for objectively computing prior odds.
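The odds form of Bayes' Theorem described in this abstract can be written out as a short sketch. The uniform prior over the number of missing persons is only the simplest case the abstract mentions and is an assumption here; the function names and the example numbers are illustrative, not taken from the paper.

```python
from fractions import Fraction


def uniform_prior_odds(n_missing):
    """Prior odds when each of n_missing persons is equally likely a priori.

    Prior probability 1/n corresponds to odds of 1 : (n - 1).
    """
    return Fraction(1, n_missing - 1)


def posterior_probability(likelihood_ratio, prior_odds):
    """Posterior odds = likelihood ratio x prior odds; then odds -> probability."""
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Illustrative numbers only: LR of one million, 1001 missing persons.
prob = posterior_probability(Fraction(10**6), uniform_prior_odds(1001))
```

With these inputs the prior odds are 1/1000, the posterior odds are 1000, and the probability of identity is 1000/1001. Changing the prior odds by a factor of ten changes the posterior odds by the same factor, which is why the abstract stresses that the prior must be justified.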
Muver, a computational framework for accurately calling accumulated mutations.
Burkholder, Adam B; Lujan, Scott A; Lavender, Christopher A; Grimm, Sara A; Kunkel, Thomas A; Fargo, David C
2018-05-09
Identification of mutations from next-generation sequencing data typically requires a balance between sensitivity and accuracy. This is particularly true of DNA insertions and deletions (indels), that can impart significant phenotypic consequences on cells but are harder to call than substitution mutations from whole genome mutation accumulation experiments. To overcome these difficulties, we present muver, a computational framework that integrates established bioinformatics tools with novel analytical methods to generate mutation calls with the extremely low false positive rates and high sensitivity required for accurate mutation rate determination and comparison. Muver uses statistical comparison of ancestral and descendant allelic frequencies to identify variant loci and assigns genotypes with models that include per-sample assessments of sequencing errors by mutation type and repeat context. Muver identifies maximally parsimonious mutation pathways that connect these genotypes, differentiating potential allelic conversion events and delineating ambiguities in mutation location, type, and size. Benchmarking with a human gold standard father-son pair demonstrates muver's sensitivity and low false positive rates. In DNA mismatch repair (MMR) deficient Saccharomyces cerevisiae, muver detects multi-base deletions in homopolymers longer than the replicative polymerase footprint at rates greater than predicted for sequential single-base deletions, implying a novel multi-repeat-unit slippage mechanism. Benchmarking results demonstrate the high accuracy and sensitivity achieved with muver, particularly for indels, relative to available tools. Applied to an MMR-deficient Saccharomyces cerevisiae system, muver mutation calls facilitate mechanistic insights into DNA replication fidelity.
DNA MemoChip: Long-Term and High Capacity Information Storage and Select Retrieval
Wang, Fuzhou; Kream, Richard M.
2018-01-01
Over the course of history, human beings have never stopped seeking effective methods for information storage. From rocks to paper, and through the past several decades of using computer disks, USB sticks, and on to the thin silicon “chips” and “cloud” storage of today, it would seem that we have reached an era of efficiency for managing innumerable and ever-expanding data. Astonishingly, when tracing this technological path, one realizes that our ancient methods of informational storage far outlast paper (10,000 vs. 1,000 years, respectively), let alone the computer-based memory devices that only last, on average, 5 to 25 years. During this time of fast-paced information generation, it becomes increasingly difficult for current storage methods to retain such massive amounts of data, and to maintain appropriate speeds with which to retrieve it, especially when in demand by a large number of users. Others have proposed that DNA-based information storage provides a way forward for information retention as a result of its temporal stability. It is now evident that DNA represents a potentially economical and sustainable mechanism for storing information, as demonstrated by its decoding from a 700,000 year-old horse genome. The fact that the human genome is present in a cell, containing also the varied mitochondrial genome, indicates DNA’s great potential for large data storage in a ‘smaller’ space. PMID:29481548
Brunk, Elizabeth; Ashari, Negar; Athri, Prashanth; Campomanes, Pablo; de Carvalho, F Franco; Curchod, Basile F E; Diamantis, Polydefkis; Doemer, Manuel; Garrec, Julian; Laktionov, Andrey; Micciarelli, Marco; Neri, Marilisa; Palermo, Giulia; Penfold, Thomas J; Vanni, Stefano; Tavernelli, Ivano; Rothlisberger, Ursula
2011-01-01
The Laboratory of Computational Chemistry and Biochemistry is active in the development and application of first-principles based simulations of complex chemical and biochemical phenomena. Here, we review some of our recent efforts in extending these methods to larger systems, longer time scales and increased accuracies. Their versatility is illustrated with a diverse range of applications, ranging from the determination of the gas phase structure of the cyclic decapeptide gramicidin S, to the study of G protein coupled receptors, the interaction of transition metal based anti-cancer agents with protein targets, the mechanism of action of DNA repair enzymes, the role of metal ions in neurodegenerative diseases and the computational design of dye-sensitized solar cells. Many of these projects are done in collaboration with experimental groups from the Institute of Chemical Sciences and Engineering (ISIC) at the EPFL.
Johnston, Iain G; Burgstaller, Joerg P; Havlicek, Vitezslav; Kolbe, Thomas; Rülicke, Thomas; Brem, Gottfried; Poulton, Jo; Jones, Nick S
2015-01-01
Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian computation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, efficacy of clinical sampling strategies, and effects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. DOI: http://dx.doi.org/10.7554/eLife.07464.001 PMID:26035426
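Binomial partitioning of mtDNAs at cell division, one of the two mechanisms favoured in this abstract, can be illustrated with a small simulation. This is a sketch and not the authors' model: random turnover is omitted, and the deterministic doubling of copy number between divisions is an assumption added only to keep the population from shrinking.

```python
import random


def partition(mutant, wildtype, rng):
    """Binomial partitioning: each mtDNA copy goes to daughter 1 with probability 1/2."""
    m1 = sum(rng.random() < 0.5 for _ in range(mutant))
    w1 = sum(rng.random() < 0.5 for _ in range(wildtype))
    return (m1, w1), (mutant - m1, wildtype - w1)


def heteroplasmy(mutant, wildtype):
    """Fraction of mutant copies, or None if the cell has no mtDNA left."""
    total = mutant + wildtype
    return mutant / total if total else None


def follow_lineage(mutant, wildtype, divisions, seed=0):
    """Track heteroplasmy along one daughter lineage."""
    rng = random.Random(seed)
    history = [heteroplasmy(mutant, wildtype)]
    for _ in range(divisions):
        (mutant, wildtype), _sister = partition(mutant, wildtype, rng)
        history.append(heteroplasmy(mutant, wildtype))
        # Assumption: copy number doubles before the next division.
        mutant, wildtype = 2 * mutant, 2 * wildtype
    return history


history = follow_lineage(50, 50, divisions=5)
```

Running many lineages with different seeds shows the spread in heteroplasmy that partitioning alone produces; smaller copy numbers give a wider spread.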
Liang, JingXin; Nguyen, Quynh L.; Matsika, Spiridoula
2016-01-01
Fluorescent analogues of the natural DNA bases are useful in the study of nucleic acids’ structure and dynamics. 2-Aminopurine (2AP) is a widely used analogue with environmentally sensitive fluorescence behavior. The quantum yield of 2AP has been found to be significantly decreased when engaged in π-stacking interactions with the native bases. We present a theoretical study on fluorescence quenching mechanisms in dimers of 2AP π-stacked with adenine or guanine as in natural DNA. Relaxation pathways on the potential energy surfaces of the first excited states have been computed and reveal the importance of exciplexes and conical intersections in the fluorescence quenching process. PMID:23625036
High-resolution mapping of bifurcations in nonlinear biochemical circuits
NASA Astrophysics Data System (ADS)
Genot, A. J.; Baccouche, A.; Sieskind, R.; Aubert-Kato, N.; Bredeche, N.; Bartolo, J. F.; Taly, V.; Fujii, T.; Rondelez, Y.
2016-08-01
Analog molecular circuits can exploit the nonlinear nature of biochemical reaction networks to compute low-precision outputs with fewer resources than digital circuits. This analog computation is similar to that employed by gene-regulation networks. Although digital systems have a tractable link between structure and function, the nonlinear and continuous nature of analog circuits yields an intricate functional landscape, which makes their design counter-intuitive, their characterization laborious and their analysis delicate. Here, using droplet-based microfluidics, we map with high resolution and dimensionality the bifurcation diagrams of two synthetic, out-of-equilibrium and nonlinear programs: a bistable DNA switch and a predator-prey DNA oscillator. The diagrams delineate where function is optimal, dynamics bifurcates and models fail. Inverse problem solving on these large-scale data sets indicates interference from enzymatic coupling. Additionally, data mining exposes the presence of rare, stochastically bursting oscillators near deterministic bifurcations.
Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.
Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R
1984-01-11
We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product KBk2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of +/-4.1 in KBk2 over a range of 10(4) in the same parameter. The quantitative evaluation was linked to an automated (Apple II) procedure for searching and evaluating possible promoters in DNA sequence files.
DNA as Sensors and Imaging Agents for Metal Ions
Xiang, Yu
2014-01-01
Increasing interests in detecting metal ions in many chemical and biomedical fields have created demands for developing sensors and imaging agents for metal ions with high sensitivity and selectivity. This review covers recent progress in DNA-based sensors and imaging agents for metal ions. Through both combinatorial selection and rational design, a number of metal ion-dependent DNAzymes and metal ion-binding DNA structures that can selectively recognize specific metal ions have been obtained. By attaching these DNA molecules with signal reporters such as fluorophores, chromophores, electrochemical tags, and Raman tags, a number of DNA-based sensors for both diamagnetic and paramagnetic metal ions have been developed for fluorescent, colorimetric, electrochemical, and surface Raman detections. These sensors are highly sensitive (with detection limit down to 11 ppt) and selective (with selectivity up to millions-fold) toward specific metal ions. In addition, through further development to simplify the operation, such as the use of “dipstick tests”, portable fluorometers, computer-readable discs, and widely available glucose meters, these sensors have been applied for on-site and real-time environmental monitoring and point-of-care medical diagnostics. The use of these sensors for in situ cellular imaging has also been reported. The generality of the combinatorial selection to obtain DNAzymes for almost any metal ion in any oxidation state, and the ease of modification of the DNA with different signal reporters make DNA an emerging and promising class of molecules for metal ion sensing and imaging in many fields of applications. PMID:24359450
A discriminatory function for prediction of protein-DNA interactions based on alpha shape modeling.
Zhou, Weiqiang; Yan, Hong
2010-10-15
Protein-DNA interaction has significant importance in many biological processes. However, the underlying principle of the molecular recognition process is still largely unknown. As more high-resolution 3D structures of protein-DNA complexes are becoming available, the surface characteristics of the complex become an important research topic. In our work, we apply an alpha shape model to represent the surface structure of the protein-DNA complex and develop an interface-atom curvature-dependent conditional probability discriminatory function for the prediction of protein-DNA interaction. The interface-atom curvature-dependent formalism captures atomic interaction details better than the atomic distance-based method. The proposed method provides good performance in discriminating the native structures from the docking decoy sets, and outperforms the distance-dependent formalism in terms of the z-score. Computer experiment results show that the curvature-dependent formalism with the optimal parameters can achieve a native z-score of -8.17 in discriminating the native structure from the highest surface-complementarity scored decoy set and a native z-score of -7.38 in discriminating the native structure from the lowest RMSD decoy set. The interface-atom curvature-dependent formalism can also be used to predict the apo version of DNA-binding proteins. These results suggest that the interface-atom curvature-dependent formalism has a good prediction capability for protein-DNA interactions. The code and data sets are available for download at http://www.hy8.com/bioinformatics.htm. Contact: kenandzhou@hotmail.com.
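The native z-score used here as a performance measure is commonly computed as the distance of the native structure's score from the decoy mean, in units of the decoy standard deviation. The sketch below follows that common definition; whether the authors used the population or the sample standard deviation is not stated in the abstract, so the choice of `pstdev` is an assumption, and the scores are made up.

```python
from statistics import mean, pstdev


def native_z_score(native_score, decoy_scores):
    """(native - mean of decoys) / standard deviation of decoys.

    A strongly negative value means the native structure scores well below
    (better than) the decoy set. Population standard deviation is assumed.
    """
    return (native_score - mean(decoy_scores)) / pstdev(decoy_scores)


# Made-up scores: decoys centred on 0 with spread 2, native at -10.
z = native_z_score(-10.0, [-2.0, 2.0])
```

Here the decoy mean is 0 and the standard deviation is 2, so the native z-score is -5.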
The role of cytosine methylation on charge transport through a DNA strand
DOE Office of Scientific and Technical Information (OSTI.GOV)
Qi, Jianqing, E-mail: jqqi@uw.edu; Anantram, M. P., E-mail: anantmp@uw.edu; Govind, Niranjan, E-mail: niri.govind@pnnl.gov
Cytosine methylation has been found to play a crucial role in various biological processes, including a number of human diseases. The detection of this small modification remains challenging. In this work, we computationally explore the possibility of detecting methylated DNA strands through direct electrical conductance measurements. Using density functional theory and the Landauer-Büttiker method, we study the electronic properties and charge transport through an eight base-pair methylated DNA strand and its native counterpart. We first analyze the effect of cytosine methylation on the tight-binding parameters of two DNA strands and then model the transmission of the electrons and conductance through the strands both with and without decoherence. We find that the main difference of the tight-binding parameters between the native DNA and the methylated DNA lies in the on-site energies of (methylated) cytosine bases. The intra- and inter-strand hopping integrals between two nearest neighboring guanine base and (methylated) cytosine base also change with the addition of the methyl groups. Our calculations show that in the phase-coherent limit, the transmission of the methylated strand is close to the native strand when the energy is nearby the highest occupied molecular orbital level and larger than the native strand by 5 times in the bandgap. The trend in transmission also holds in the presence of the decoherence with the same rate. The lower conductance for the methylated strand in the experiment is suggested to be caused by the more stable structure due to the introduction of the methyl groups. We also study the role of the exchange-correlation functional and the effect of contact coupling by choosing coupling strengths ranging from weak to strong coupling limit.
The role of cytosine methylation on charge transport through a DNA strand
NASA Astrophysics Data System (ADS)
Qi, Jianqing; Govind, Niranjan; Anantram, M. P.
2015-09-01
Cytosine methylation has been found to play a crucial role in various biological processes, including a number of human diseases. The detection of this small modification remains challenging. In this work, we computationally explore the possibility of detecting methylated DNA strands through direct electrical conductance measurements. Using density functional theory and the Landauer-Büttiker method, we study the electronic properties and charge transport through an eight base-pair methylated DNA strand and its native counterpart. We first analyze the effect of cytosine methylation on the tight-binding parameters of two DNA strands and then model the transmission of the electrons and conductance through the strands both with and without decoherence. We find that the main difference of the tight-binding parameters between the native DNA and the methylated DNA lies in the on-site energies of (methylated) cytosine bases. The intra- and inter-strand hopping integrals between two nearest neighboring guanine base and (methylated) cytosine base also change with the addition of the methyl groups. Our calculations show that in the phase-coherent limit, the transmission of the methylated strand is close to the native strand when the energy is nearby the highest occupied molecular orbital level and larger than the native strand by 5 times in the bandgap. The trend in transmission also holds in the presence of the decoherence with the same rate. The lower conductance for the methylated strand in the experiment is suggested to be caused by the more stable structure due to the introduction of the methyl groups. We also study the role of the exchange-correlation functional and the effect of contact coupling by choosing coupling strengths ranging from weak to strong coupling limit.
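The phase-coherent part of the calculation described above can be illustrated with a small sketch. It assumes a single-orbital, nearest-neighbour tight-binding chain coupled to wide-band contacts, so that the transmission reduces to T(E) = Γ_L Γ_R |G_1N(E)|². The function name, the on-site energies, the hopping value and the coupling strengths are all made up for illustration; they are not the DFT-derived parameters of the paper, and decoherence is not modelled.

```python
import numpy as np

def transmission(energy, onsite, hopping, gamma_left, gamma_right):
    """Phase-coherent transmission through a 1D tight-binding chain.

    Contacts are treated in the wide-band limit: each adds a constant
    self-energy -i*gamma/2 to the first or last site (an assumption).
    """
    onsite = np.asarray(onsite, dtype=float)
    n = onsite.size
    h = np.diag(onsite).astype(complex)
    for i in range(n - 1):
        h[i, i + 1] = h[i + 1, i] = hopping
    sigma = np.zeros((n, n), dtype=complex)
    sigma[0, 0] += -0.5j * gamma_left
    sigma[-1, -1] += -0.5j * gamma_right
    # Retarded Green's function G = (E - H - Sigma)^-1
    green = np.linalg.inv(energy * np.eye(n) - h - sigma)
    # Tr[Gamma_L G Gamma_R G^dagger] for contacts on single end sites
    return float(gamma_left * gamma_right * abs(green[0, -1]) ** 2)

# Hypothetical eight-site chains (energies in eV): the "modified" chain
# shifts the on-site energy of two sites, loosely mimicking a change in
# the on-site energies of modified bases.
native = [0.0] * 8
modified = [0.0, 0.0, 0.15, 0.0, 0.0, 0.15, 0.0, 0.0]
t_native = transmission(0.05, native, 0.1, 0.05, 0.05)
t_modified = transmission(0.05, modified, 0.1, 0.05, 0.05)
```

In the zero-temperature, linear-response limit the conductance of such a single channel would be (2e²/h)·T evaluated at the Fermi energy; scanning `energy` over a grid gives the full transmission curve.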
2010-01-01
Background The robust storage, updating and utilization of information are necessary for the maintenance and perpetuation of dynamic systems. These systems can exist as constructs of metal-oxide semiconductors and silicon, as in a digital computer, or in the "wetware" of organic compounds, proteins and nucleic acids that make up biological organisms. We propose that there are essential functional properties of centralized information-processing systems; for digital computers these properties reside in the computer's hard drive, and for eukaryotic cells they are manifest in the DNA and associated structures. Methods Presented herein is a descriptive framework that compares DNA and its associated proteins and sub-nuclear structure with the structure and function of the computer hard drive. We identify four essential properties of information for a centralized storage and processing system: (1) orthogonal uniqueness, (2) low level formatting, (3) high level formatting and (4) translation of stored to usable form. The corresponding aspects of the DNA complex and a computer hard drive are categorized using this classification. This is intended to demonstrate a functional equivalence between the components of the two systems, and thus the systems themselves. Results Both the DNA complex and the computer hard drive contain components that fulfill the essential properties of a centralized information storage and processing system. The functional equivalence of these components provides insight into both the design process of engineered systems and the evolved solutions addressing similar system requirements. However, there are points where the comparison breaks down, particularly when there are externally imposed information-organizing structures on the computer hard drive. A specific example of this is the imposition of the File Allocation Table (FAT) during high level formatting of the computer hard drive and the subsequent loading of an operating system (OS). 
Biological systems do not have an external source for a map of their stored information or for an operational instruction set; rather, they must contain an organizational template conserved within their intra-nuclear architecture that "manipulates" the laws of chemistry and physics into a highly robust instruction set. We propose that the epigenetic structure of the intra-nuclear environment and the non-coding RNA may play the roles of a Biological File Allocation Table (BFAT) and biological operating system (Bio-OS) in eukaryotic cells. Conclusions The comparison of functional and structural characteristics of the DNA complex and the computer hard drive leads to a new descriptive paradigm that identifies the DNA as a dynamic storage system of biological information. This system is embodied in an autonomous operating system that inductively follows organizational structures, data hierarchy and executable operations that are well understood in the computer science industry. Characterizing the "DNA hard drive" in this fashion can lead to insights arising from discrepancies in the descriptive framework, particularly with respect to positing the role of epigenetic processes in an information-processing context. Further expansions arising from this comparison include the view of cells as parallel computing machines and a new approach towards characterizing cellular control systems. PMID:20092652
D'Onofrio, David J; An, Gary
2010-01-21
The robust storage, updating and utilization of information are necessary for the maintenance and perpetuation of dynamic systems. These systems can exist as constructs of metal-oxide semiconductors and silicon, as in a digital computer, or in the "wetware" of organic compounds, proteins and nucleic acids that make up biological organisms. We propose that there are essential functional properties of centralized information-processing systems; for digital computers these properties reside in the computer's hard drive, and for eukaryotic cells they are manifest in the DNA and associated structures. Presented herein is a descriptive framework that compares DNA and its associated proteins and sub-nuclear structure with the structure and function of the computer hard drive. We identify four essential properties of information for a centralized storage and processing system: (1) orthogonal uniqueness, (2) low level formatting, (3) high level formatting and (4) translation of stored to usable form. The corresponding aspects of the DNA complex and a computer hard drive are categorized using this classification. This is intended to demonstrate a functional equivalence between the components of the two systems, and thus the systems themselves. Both the DNA complex and the computer hard drive contain components that fulfill the essential properties of a centralized information storage and processing system. The functional equivalence of these components provides insight into both the design process of engineered systems and the evolved solutions addressing similar system requirements. However, there are points where the comparison breaks down, particularly when there are externally imposed information-organizing structures on the computer hard drive. A specific example of this is the imposition of the File Allocation Table (FAT) during high level formatting of the computer hard drive and the subsequent loading of an operating system (OS). 
Biological systems do not have an external source for a map of their stored information or for an operational instruction set; rather, they must contain an organizational template conserved within their intra-nuclear architecture that "manipulates" the laws of chemistry and physics into a highly robust instruction set. We propose that the epigenetic structure of the intra-nuclear environment and the non-coding RNA may play the roles of a Biological File Allocation Table (BFAT) and biological operating system (Bio-OS) in eukaryotic cells. The comparison of functional and structural characteristics of the DNA complex and the computer hard drive leads to a new descriptive paradigm that identifies the DNA as a dynamic storage system of biological information. This system is embodied in an autonomous operating system that inductively follows organizational structures, data hierarchy and executable operations that are well understood in the computer science industry. Characterizing the "DNA hard drive" in this fashion can lead to insights arising from discrepancies in the descriptive framework, particularly with respect to positing the role of epigenetic processes in an information-processing context. Further expansions arising from this comparison include the view of cells as parallel computing machines and a new approach towards characterizing cellular control systems.
Statistical Physics Approaches to RNA Editing
NASA Astrophysics Data System (ADS)
Bundschuh, Ralf
2012-02-01
The central dogma of molecular biology states that DNA is transcribed base by base into RNA, which is in turn translated into proteins. However, some organisms edit their RNA before translation by inserting, deleting, or substituting individual bases or short stretches of bases. In many instances the mechanisms by which an organism recognizes the positions at which to edit, or by which it performs the actual editing, are unknown. One model system that stands out for its very high editing rate, on average one out of every 25 bases, is the Myxomycetes, a class of slime molds. In this talk we will show how computational methods and concepts from statistical physics can be used to analyze DNA and protein sequence data to predict editing sites in these slime molds and to guide experiments that identified previously unknown types of editing as well as the complete set of editing events in the slime mold Physarum polycephalum.
Modeling Hybridization Kinetics of Gene Probes in a DNA Biochip Using FEMLAB
Munir, Ahsan; Waseem, Hassan; Williams, Maggie R.; Stedtfeld, Robert D.; Gulari, Erdogan; Tiedje, James M.; Hashsham, Syed A.
2017-01-01
Microfluidic DNA biochips capable of detecting specific DNA sequences are useful in medical diagnostics, drug discovery, food safety monitoring and agriculture. They are used as miniaturized platforms for analysis of nucleic acids-based biomarkers. Binding kinetics between immobilized single stranded DNA on the surface and its complementary strand present in the sample are of interest. To achieve optimal sensitivity with minimum sample size and rapid hybridization, ability to predict the kinetics of hybridization based on the thermodynamic characteristics of the probe is crucial. In this study, a computer aided numerical model for the design and optimization of a flow-through biochip was developed using a finite element technique packaged software tool (FEMLAB; package included in COMSOL Multiphysics) to simulate the transport of DNA through a microfluidic chamber to the reaction surface. The model accounts for fluid flow, convection and diffusion in the channel and on the reaction surface. Concentration, association rate constant, dissociation rate constant, recirculation flow rate, and temperature were key parameters affecting the rate of hybridization. The model predicted the kinetic profile and signal intensities of eighteen 20-mer probes targeting vancomycin resistance genes (VRGs). Predicted signal intensities and hybridization kinetics strongly correlated with experimental data in the biochip (R² = 0.8131). PMID:28555058
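As a rough illustration of how the association and dissociation rate constants and the target concentration shape a hybridization curve, the sketch below uses simple Langmuir surface kinetics, dθ/dt = k_a·C·(1 − θ) − k_d·θ, for the fraction θ of hybridized probes. This is a well-mixed simplification that ignores the fluid flow, convection and diffusion handled by the finite element model in the abstract; the function names and all parameter values are hypothetical.

```python
import math

def hybridized_fraction(t, k_a, k_d, conc):
    """Closed-form Langmuir coverage at time t, starting from an empty surface."""
    rate = k_a * conc + k_d
    return (k_a * conc / rate) * (1.0 - math.exp(-rate * t))

def integrate_euler(t_end, k_a, k_d, conc, steps=100_000):
    """Forward-Euler integration of d(theta)/dt = k_a*C*(1-theta) - k_d*theta."""
    dt = t_end / steps
    theta = 0.0
    for _ in range(steps):
        theta += dt * (k_a * conc * (1.0 - theta) - k_d * theta)
    return theta

# Hypothetical values: k_a in 1/(M*s), k_d in 1/s, concentration in M.
K_A, K_D, CONC = 1.0e5, 1.0e-4, 1.0e-9
T_END = 4 * 3600.0  # four hours, in seconds

theta_exact = hybridized_fraction(T_END, K_A, K_D, CONC)
theta_euler = integrate_euler(T_END, K_A, K_D, CONC)
```

With these numbers the equilibrium coverage is k_a·C / (k_a·C + k_d) = 0.5, and the observed rate constant k_a·C + k_d sets how quickly that plateau is approached; a transport-limited chip would approach it more slowly than this idealized curve.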
Modeling Hybridization Kinetics of Gene Probes in a DNA Biochip Using FEMLAB.
Munir, Ahsan; Waseem, Hassan; Williams, Maggie R; Stedtfeld, Robert D; Gulari, Erdogan; Tiedje, James M; Hashsham, Syed A
2017-05-29
Microfluidic DNA biochips capable of detecting specific DNA sequences are useful in medical diagnostics, drug discovery, food safety monitoring and agriculture. They are used as miniaturized platforms for analysis of nucleic acids-based biomarkers. Binding kinetics between immobilized single stranded DNA on the surface and its complementary strand present in the sample are of interest. To achieve optimal sensitivity with minimum sample size and rapid hybridization, ability to predict the kinetics of hybridization based on the thermodynamic characteristics of the probe is crucial. In this study, a computer aided numerical model for the design and optimization of a flow-through biochip was developed using a finite element technique packaged software tool (FEMLAB; package included in COMSOL Multiphysics) to simulate the transport of DNA through a microfluidic chamber to the reaction surface. The model accounts for fluid flow, convection and diffusion in the channel and on the reaction surface. Concentration, association rate constant, dissociation rate constant, recirculation flow rate, and temperature were key parameters affecting the rate of hybridization. The model predicted the kinetic profile and signal intensities of eighteen 20-mer probes targeting vancomycin resistance genes (VRGs). Predicted signal intensities and hybridization kinetics strongly correlated with experimental data in the biochip (R² = 0.8131).
Gan, Lin; Camacho-Alanis, Fernanda; Ros, Alexandra
2015-12-15
DNA nanoassemblies, such as DNA origamis, hold promise in biosensing, drug delivery, nanoelectronic circuits, and biological computing, which require suitable methods for migration and precision positioning. Insulator-based dielectrophoresis (iDEP) has been demonstrated as a powerful migration and trapping tool for μm- and nm-sized colloids as well as DNA origamis. However, little is known about the polarizability of origami species, which is responsible for their dielectrophoretic migration. Here, we report the experimentally determined polarizabilities of the six-helix bundle origami (6HxB) and triangle origami by measuring the migration times through a potential landscape exhibiting dielectrophoretic barriers. The resulting migration times correlate to the depth of the dielectrophoretic potential barrier and the escape characteristics of the origami according to an adapted Kramers' rate model, allowing their polarizabilities to be determined. We found that the 6HxB polarizability is larger than that of the triangle origami, which correlates with the variations in charge density of both origamis. Further, we discuss the orientation of both origami species in the dielectrophoretic trap and the influence of diffusion during the escape process. Our study provides detailed insight into the factors contributing to the migration through dielectrophoretic potential landscapes, which can be exploited for applications with DNA and other nanoassemblies based on dielectrophoresis.
Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.
2013-01-01
Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Architecture with GIDEON, A Program for Design in Structural DNA Nanotechnology
Birac, Jeffrey J.; Sherman, William B.; Kopatsch, Jens; Constantinou, Pamela E.; Seeman, Nadrian C.
2012-01-01
We present geometry based design strategies for DNA nanostructures. The strategies have been implemented with GIDEON – a Graphical Integrated Development Environment for OligoNucleotides. GIDEON has a highly flexible graphical user interface that facilitates the development of simple yet precise models, and the evaluation of strains therein. Models are built on a simple model of undistorted B-DNA double-helical domains. Simple point and click manipulations of the model allow the minimization of strain in the phosphate-backbone linkages between these domains and the identification of any steric clashes that might occur as a result. Detailed analysis of 3D triangles yields clear predictions of the strains associated with triangles of different sizes. We have carried out experiments that confirm that 3D triangles form well only when their geometrical strain is less than 4% deviation from the estimated relaxed structure. Thus geometry-based techniques alone, without energetic considerations, can be used to explain general trends in DNA structure formation. We have used GIDEON to build detailed models of double crossover and triple crossover molecules, evaluating the non-planarity associated with base tilt and junction mis-alignments. Computer modeling using a graphical user interface overcomes the limited precision of physical models for larger systems, and the limited interaction rate associated with earlier, command-line driven software. PMID:16630733
NASA Astrophysics Data System (ADS)
Al-Otaibi, Jamelah S.; Teesdale Spittle, Paul; El Gogary, Tarek M.
2017-01-01
Anthraquinones form the basis of several anticancer drugs. Anthraquinone anticancer drugs exert their cytotoxic activity through interaction with DNA and inhibition of topoisomerase II activity. Anthraquinones (AQ4 and AQ4H) were synthesized and studied along with 1,4-DAAQ by computational and experimental tools. The purpose of this study is to shed more light on the mechanism of interaction between anthraquinone DNA-affinic agents and different types of DNA, which should yield information useful for drug design and development. Molecular structures were optimized using DFT B3LYP/6-31+G(d). Depending on intramolecular hydrogen bonding interactions, two conformers of AQ4 were detected and computed to be 25.667 kcal/mol apart. Molecular reactivity of the anthraquinone compounds was explored using global and condensed descriptors (electrophilicity and Fukui functions). Molecular docking studies for the inhibition of CDK2 and DNA binding were carried out to explore the anticancer potency of these drugs. NMR and UV-VIS electronic absorption spectra of anthraquinones/DNA were investigated at physiological pH. The interactions of the three anthraquinones (AQ4, AQ4H and 1,4-DAAQ) with three DNAs (calf thymus DNA, Poly[dA].Poly[dT] and Poly[dG].Poly[dC]) were studied. The NMR study shows a qualitative pattern of drug/DNA interaction in terms of band shift and broadening. UV-VIS electronic absorption spectra were employed to measure the affinity constants of drug/DNA binding using Scatchard analysis.
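The Scatchard step mentioned at the end of the abstract can be sketched as a straight-line fit. In the standard Scatchard form, r / C_f = K·(n − r), where r is the amount of bound drug per DNA unit, C_f is the free drug concentration, K is the affinity constant and n is the number of binding sites per DNA unit. The abstract does not give the data or the exact variant used, so the function name and the synthetic, noise-free values below are assumptions for illustration only.

```python
import numpy as np

def scatchard_fit(r, free_conc):
    """Fit r/C_f = K*(n - r) by linear regression of r/C_f against r.

    Returns (K, n): slope = -K and intercept = K*n.
    """
    r = np.asarray(r, dtype=float)
    y = r / np.asarray(free_conc, dtype=float)
    slope, intercept = np.polyfit(r, y, 1)
    k_fit = -slope
    return k_fit, intercept / k_fit

# Synthetic data generated from hypothetical "true" values.
K_TRUE, N_TRUE = 2.0e5, 0.2            # K in 1/M, n in sites per DNA unit
r_values = np.linspace(0.02, 0.18, 9)  # bound drug per DNA unit
free = r_values / (K_TRUE * (N_TRUE - r_values))  # implied free concentration, M

k_est, n_est = scatchard_fit(r_values, free)
```

On real spectroscopic data the points scatter around the line and often curve at low r, so the fitted K and n should be read together with the residuals rather than as exact values.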
Map-invariant spectral analysis for the identification of DNA periodicities
2012-01-01
Many signal processing based methods for finding hidden periodicities in DNA sequences have primarily focused on assigning numerical values to the symbolic DNA sequence and then applying spectral analysis tools such as the short-time discrete Fourier transform (ST-DFT) to locate these repeats. The key results pertaining to this approach are however obtained using a very specific symbolic to numerical map, namely the so-called Voss representation. An important research problem is to therefore quantify the sensitivity of these results to the choice of the symbolic to numerical map. In this article, a novel algebraic approach to the periodicity detection problem is presented and provides a natural framework for studying the role of the symbolic to numerical map in finding these repeats. More specifically, we derive a new matrix-based expression of the DNA spectrum that comprises most of the widely used mappings in the literature as special cases, shows that the DNA spectrum is in fact invariable under all these mappings, and generates a necessary and sufficient condition for the invariance of the DNA spectrum to the symbolic to numerical map. Furthermore, the new algebraic framework decomposes the periodicity detection problem into several fundamental building blocks that are totally independent of each other. Sophisticated digital filters and/or alternate fast data transforms such as the discrete cosine and sine transforms can therefore be always incorporated in the periodicity detection scheme regardless of the choice of the symbolic to numerical map. Although the newly proposed framework is matrix based, identification of these periodicities can be achieved at a low computational cost. PMID:23067324
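A minimal sketch of the baseline approach the abstract starts from: map the symbolic sequence to four binary indicator sequences (the Voss representation), take a discrete Fourier transform of each, and sum the squared magnitudes. This uses a plain DFT over the whole sequence rather than the short-time DFT, and it is not the matrix-based framework the article proposes; the function names are illustrative.

```python
import numpy as np

def voss_spectrum(seq):
    """Sum of squared DFT magnitudes over the four Voss indicator sequences."""
    seq = seq.upper()
    total = np.zeros(len(seq))
    for base in "ACGT":
        indicator = np.array([1.0 if s == base else 0.0 for s in seq])
        total += np.abs(np.fft.fft(indicator)) ** 2
    return total

def period_power(seq, period):
    """Spectral power at frequency index N/period (N must be a multiple of period)."""
    n = len(seq)
    if n % period != 0:
        raise ValueError("sequence length must be a multiple of the period")
    return float(voss_spectrum(seq)[n // period])

# A perfectly period-3 toy sequence concentrates its power at index N/3.
toy = "ATG" * 30
power_at_3 = period_power(toy, 3)
power_elsewhere = float(voss_spectrum(toy)[1])
```

Replacing the 0/1 indicators with another symbolic-to-numerical map changes only the `indicator` line, which makes this a convenient place to probe empirically how sensitive a detected peak is to the choice of map.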
Coarse-grained molecular dynamics simulations for giant protein-DNA complexes
NASA Astrophysics Data System (ADS)
Takada, Shoji
Biomolecules are highly hierarchic and intrinsically flexible. Thus, computational modeling calls for multi-scale methodologies. We have been developing a coarse-grained biomolecular model where on-average 10-20 atoms are grouped into one coarse-grained (CG) particle. Interactions among CG particles are tuned based on atomistic interactions and the fluctuation matching algorithm. CG molecular dynamics methods enable us to simulate much longer time scale motions of much larger molecular systems than fully atomistic models. After broad sampling of structures with CG models, we can easily reconstruct atomistic models, from which one can continue conventional molecular dynamics simulations if desired. Here, we describe our CG modeling methodology for protein-DNA complexes, together with various biological applications, such as the DNA duplication initiation complex, model chromatins, and transcription factor dynamics on chromatin-like environment.
Applications of statistical physics and information theory to the analysis of DNA sequences
NASA Astrophysics Data System (ADS)
Grosse, Ivo
2000-10-01
DNA carries the genetic information of most living organisms, and the goal of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question of whether there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search for such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
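The mutual information function referred to above can be estimated directly from pair counts: I(k) = Σ p_ab(k) · log2[ p_ab(k) / (p_a · p_b) ], summed over all base pairs (a, b) found k positions apart. The sketch below computes this estimate and applies it to a toy sequence in which every third position has a biased composition. The toy generator is only an assumed stand-in for codon structure; it is not the pseudo-exon model of the thesis.

```python
import math
import random
from collections import Counter

def mutual_information(seq, k):
    """Mutual information (in bits) between bases separated by k positions."""
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        mi += p_ab * math.log2(count * n / (left[a] * right[b]))
    return mi

# Toy "coding-like" sequence: positions 0, 3, 6, ... favour A,
# all other positions are uniform over the four bases.
rng = random.Random(0)
BASES = "ACGT"
toy = "".join(
    rng.choices(BASES, weights=[7, 1, 1, 1])[0] if i % 3 == 0 else rng.choice(BASES)
    for i in range(30000)
)
mi_by_distance = {k: mutual_information(toy, k) for k in range(1, 7)}
```

For this toy sequence the values at k = 3 and k = 6 stand above those at neighbouring distances, which is the kind of period-three oscillation the abstract describes; on short sequences the estimate carries a positive finite-sample bias that should be taken into account.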
Arias, María Elena; Sánchez-Villalba, Esther; Delgado, Andrea; Felmer, Ricardo
2017-02-01
Sperm-mediated gene transfer (SMGT) is based on the capacity of sperm to bind exogenous DNA and transfer it into the oocyte during fertilization. In bovines, the progress of this technology has been slow due to the poor reproducibility and efficiency of the production of transgenic embryos. The aim of the present study was to evaluate the effects of different sperm transfection systems on the quality and functional parameters of sperm. Additionally, the ability of sperm to bind and incorporate exogenous DNA was assessed. These analyses were carried out by flow cytometry and confocal fluorescence microscopy, and motility parameters were also evaluated by computer-assisted sperm analysis (CASA). Transfection was carried out using complexes of plasmid DNA with Lipofectamine, SuperFect and TurboFect for 0.5, 1, 2 or 4 h. The results showed that all of the transfection treatments promoted sperm binding and incorporation of exogenous DNA, similar to sperm incorporation of DNA alone, without affecting the viability. Nevertheless, the treatments and incubation times significantly affected the motility parameters, although no effect on the integrity of DNA or the levels of reactive oxygen species (ROS) was observed. Additionally, we observed that transfection using SuperFect and TurboFect negatively affected the acrosome integrity, and TurboFect affected the mitochondrial membrane potential of sperm. In conclusion, we demonstrated binding and incorporation of exogenous DNA by sperm after transfection and confirmed the capacity of sperm to spontaneously incorporate exogenous DNA. These findings will allow the establishment of the most appropriate method [intracytoplasmic sperm injection (ICSI) or in vitro fertilization (IVF)] of generating transgenic embryos via SMGT based on the fertilization capacity of transfected sperm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGrath, B.C.; Dunn, J.J.; France, L.L.
1995-12-31
Lyme borreliosis, caused by the spirochete Borrelia burgdorferi, is the most common vector-borne disease in North America and Western Europe. Because they are the targets of the major delayed immune response in humans, the major outer surface lipoproteins OspA and OspB are of much interest. These proteins have been shown to exhibit three distinct phylogenetic genotypes based on their DNA sequences. This paper describes the cloning of genomic DNA for each variant and its amplification by PCR. DNA sequence data were used to derive computer-driven phylogenetic analyses and deduced amino acid sequences. Overproduction of variant OspAs was carried out in E. coli using a T7-based expression system. Circular dichroism and fluorescence studies were carried out on the recombinant B31 OspA, yielding evidence supporting a B31 protein containing 11% alpha-helix, 34% antiparallel beta-sheet, and 12% parallel beta-sheet.
Molecular modelling study of changes induced by netropsin binding to nucleosome core particles.
Pérez, J J; Portugal, J
1990-01-01
It is well known that certain sequence-dependent modulations in structure appear to determine the rotational positioning of DNA on the nucleosome core particle. That preference is rather weak and could be modified by ligands such as netropsin, a minor-groove binding antibiotic. We have undertaken a molecular modelling approach to calculate the relative energy of interaction between a DNA molecule and the protein core particle. The histone particle is considered as a distribution of positive charges on the protein surface that interacts with the DNA molecule. The molecular electrostatic potentials for the DNA, simulated as a discontinuous cylinder, were calculated using the values for all the base pairs. Using these parameters, we calculated the relative energy of interaction and the most stable rotational setting of DNA. The binding of four molecules of netropsin to this model showed that a new energy minimum is obtained when the DNA turns toward the protein surface by about 180 degrees, so that a new energetically favoured structure appears in which the netropsin binding sites face the histone surface. The effect of netropsin could be explained in terms of an induced change in the phasing of DNA on the core particle. The induced rotation is considered to optimize non-bonded contacts between the netropsin molecules and the DNA backbone. PMID:2165249
Performing SELEX experiments in silico
NASA Astrophysics Data System (ADS)
Wondergem, J. A. J.; Schiessel, H.; Tompitak, M.
2017-11-01
Due to the sequence-dependent nature of the elasticity of DNA, many protein-DNA complexes and other systems in which DNA molecules must be deformed have preferences for the type of DNA sequence they interact with. SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments and similar sequence selection experiments have been used extensively to examine the (indirect readout) sequence preferences of, e.g., nucleosomes (protein spools around which DNA is wound for compactification) and DNA rings. We show how recently developed computational and theoretical tools can be used to emulate such experiments in silico. Opening up this possibility comes with several benefits. First, it allows us a better understanding of our models and systems, specifically about the roles played by the simulation temperature and the selection pressure on the sequences. Second, it allows us to compare the predictions made by the model of choice with experimental results. We find agreement on important features between predictions of the rigid base-pair model and experimental results for DNA rings and interesting differences that point out open questions in the field. Finally, our simulations allow application of the SELEX methodology to systems that are experimentally difficult to realize because they come with high energetic costs and are therefore unlikely to form spontaneously, such as very short or overwound DNA rings.
DNA context represents transcription regulation of the gene in mouse embryonic stem cells
NASA Astrophysics Data System (ADS)
Ha, Misook; Hong, Soondo
2016-04-01
Understanding gene regulatory information in DNA remains a significant challenge in biomedical research. This study presents a computational approach to infer gene regulatory programs from primary DNA sequences. Using DNA around transcription start sites as attributes, our model predicts gene regulation in the gene. We find that H3K27ac around TSS is an informative descriptor of the transcription program in mouse embryonic stem cells. We build a computational model inferring the cell-type-specific H3K27ac signatures in the DNA around TSS. A comparison of embryonic stem cell and liver cell-specific H3K27ac signatures in DNA shows that the H3K27ac signatures in DNA around TSS efficiently distinguish the cell-type specific H3K27ac peaks and the gene regulation. The arrangement of the H3K27ac signatures inferred from the DNA represents the transcription regulation of the gene in mESC. We show that the DNA around transcription start sites is associated with the gene regulatory program by specific interaction with H3K27ac.
DNA context represents transcription regulation of the gene in mouse embryonic stem cells.
Ha, Misook; Hong, Soondo
2016-04-14
Understanding gene regulatory information in DNA remains a significant challenge in biomedical research. This study presents a computational approach to infer gene regulatory programs from primary DNA sequences. Using DNA around transcription start sites as attributes, our model predicts gene regulation in the gene. We find that H3K27ac around TSS is an informative descriptor of the transcription program in mouse embryonic stem cells. We build a computational model inferring the cell-type-specific H3K27ac signatures in the DNA around TSS. A comparison of embryonic stem cell and liver cell-specific H3K27ac signatures in DNA shows that the H3K27ac signatures in DNA around TSS efficiently distinguish the cell-type specific H3K27ac peaks and the gene regulation. The arrangement of the H3K27ac signatures inferred from the DNA represents the transcription regulation of the gene in mESC. We show that the DNA around transcription start sites is associated with the gene regulatory program by specific interaction with H3K27ac.
Lee, Ju Seok; Chen, Junghuei; Deaton, Russell; Kim, Jin-Woo
2014-01-01
Genetic material extracted from in situ microbial communities has high promise as an indicator of biological system status. However, the challenge is to access genomic information from all organisms at the population or community scale to monitor the biosystem's state. Hence, there is a need for a better diagnostic tool that provides a holistic view of a biosystem's genomic status. Here, we introduce an in vitro methodology for genomic pattern classification of biological samples that taps large amounts of genetic information from all genes present and uses that information to detect changes in genomic patterns and classify them. We developed a biosensing protocol, termed Biological Memory, that has in vitro computational capabilities to "learn" and "store" genomic sequence information directly from genomic samples without knowledge of their explicit sequences, and that discovers differences in vitro between previously unknown inputs and learned memory molecules. The Memory protocol was designed and optimized based upon (1) common in vitro recombinant DNA operations using 20-base random probes, including polymerization, nuclease digestion, and magnetic bead separation, to capture a snapshot of the genomic state of a biological sample as a DNA memory and (2) the thermal stability of DNA duplexes between new input and the memory to detect similarities and differences. For efficient read out, a microarray was used as an output method. When the microarray-based Memory protocol was implemented to test its capability and sensitivity using genomic DNA from two model bacterial strains, i.e., Escherichia coli K12 and Bacillus subtilis, results indicate that the Memory protocol can "learn" input DNA, "recall" similar DNA, differentiate between dissimilar DNA, and detect relatively small concentration differences in samples. 
This study demonstrated not only the in vitro information processing capabilities of DNA, but also its promise as a genomic pattern classifier that could access information from all organisms in a biological system without explicit genomic information. The Memory protocol has high potential for many applications, including in situ biomonitoring of ecosystems, screening for diseases, biosensing of pathological features in water and food supplies, and non-biological information processing of memory devices, among many.
Pakleza, Christophe; Cognet, Jean A. H.
2003-01-01
A new molecular modelling methodology is presented and shown to apply to all published solution structures of DNA hairpins with TTT in the loop. It is based on the theory of elasticity of thin rods and on the assumption that single-stranded B-DNA behaves as a continuous, unshearable, unstretchable and flexible thin rod. It requires four construction steps: (i) computation of the tri-dimensional trajectory of the elastic line, (ii) global deformation of single-stranded helical DNA onto the elastic line, (iii) optimisation of the nucleoside rotations about the elastic line, (iv) energy minimisation to restore backbone bond lengths and bond angles. This theoretical approach called ‘Biopolymer Chain Elasticity’ (BCE) is capable of reproducing the tri-dimensional course of the sugar–phosphate chain and, using NMR-derived distances, of reproducing models close to published solution structures. This is shown by computing three different types of distance criteria. The natural description provided by the elastic line and by the new parameter, Ω, which corresponds to the rotation angles of nucleosides about the elastic line, offers a considerable simplification of molecular modelling of hairpin loops. They can be varied independently from each other, since the global shape of the hairpin loop is preserved in all cases. PMID:12560506
Synthetic Biology: Knowledge Accessed by Everyone (Open Sources)
ERIC Educational Resources Information Center
Sánchez Reyes, Patricia Margarita
2016-01-01
Using the principles of biology, along with engineering and with the help of computers, scientists manage to copy DNA sequences from nature and use them to create new organisms. DNA is created through engineering and computer science, making it possible to create life inside a laboratory. We cannot dismiss the role that synthetic biology could lead in…
A Theoretical and Experimental Study of DNA Self-assembly
NASA Astrophysics Data System (ADS)
Chandran, Harish
The control of matter and phenomena at the nanoscale is fast becoming one of the most important challenges of the 21st century with wide-ranging applications from energy and health care to computing and material science. Conventional top-down approaches to nanotechnology, having served us well for long, are reaching their inherent limitations. Meanwhile, bottom-up methods such as self-assembly are emerging as viable alternatives for nanoscale fabrication and manipulation. A particularly successful bottom up technique is DNA self-assembly where a set of carefully designed DNA strands form a nanoscale object as a consequence of specific, local interactions among the different components, without external direction. The final product of the self-assembly process might be a static nanostructure or a dynamic nanodevice that performs a specific function. Over the past two decades, DNA self-assembly has produced stunning nanoscale objects such as 2D and 3D lattices, polyhedra and addressable arbitrary shaped substrates, and a myriad of nanoscale devices such as molecular tweezers, computational circuits, biosensors and molecular assembly lines. In this dissertation we study multiple problems in the theory, simulations and experiments of DNA self-assembly. We extend the Turing-universal mathematical framework of self-assembly known as the Tile Assembly Model by incorporating randomization during the assembly process. This allows us to reduce the tile complexity of linear assemblies. We develop multiple techniques to build linear assemblies of expected length N using far fewer tile types than previously possible. We abstract the fundamental properties of DNA and develop a biochemical system, which we call meta-DNA, based entirely on strands of DNA as the only component molecule. We further develop various enzyme-free protocols to manipulate meta-DNA systems and provide strand level details along with abstract notations for these mechanisms. 
We simulate DNA circuits by providing detailed designs for local molecular computations that involve spatially contiguous molecules arranged on addressable substrates via enzyme-free DNA hybridization reaction cascades. We use the Visual DSD simulation software in conjunction with localized reaction rates obtained from biophysical modeling to create chemical reaction networks of localized hybridization circuits that are then model checked using the PRISM model checking software. We develop a DNA detection system employing the triggered self-assembly of a novel DNA dendritic nanostructure. Detection begins when a specific, single-stranded target DNA strand triggers a hybridization chain reaction between two distinct DNA hairpins. Each hairpin opens and hybridizes up to two copies of the other, and hence each layer of the growing dendritic nanostructure can in principle accommodate an exponentially increasing number of cognate molecules, generating a nanostructure with high molecular weight. We build linear activatable assemblies employing a novel protection/deprotection strategy to strictly enforce the direction of tiling assembly growth to ensure the robustness of the assembly process. Our system consists of two tiles that can form a linear co-polymer. These tiles, which are initially protected such that they do not react with each other, can be activated to form linear co-polymers via the use of a strand displacing enzyme.
Mapping the Space of Genomic Signatures
Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.
2015-01-01
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
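As an illustration of the Chaos Game Representation mentioned in the abstract above, the following minimal sketch counts CGR points on a 2^k x 2^k grid, so that each cell holds the number of occurrences of one k-mer. The corner layout (A, C, G, T), the function name `fcgr`, and the handling of ambiguous bases are assumptions made for this example; the DSSIM image distance and the MDS step described in the abstract are not reproduced here.

```python
# Corner layout is an assumed (commonly used) convention; the abstract does not fix one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def fcgr(seq, k):
    """Return a 2^k x 2^k grid of CGR point counts (one count per k-mer)."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    run = 0          # consecutive valid bases seen since the last restart
    for base in seq.upper():
        if base not in CORNERS:
            # Ambiguous base: restart so that no counted k-mer spans it.
            run = 0
            x, y = 0.5, 0.5
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway towards the corner
        run += 1
        if run >= k:
            # After k valid steps the cell depends only on the last k bases.
            col = min(int(x * n), n - 1)
            row = min(int(y * n), n - 1)
            grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in fcgr("ACGTACGTTTGACA", 2):
        print(row)
```

Two sequences can then be compared through their grids with any image or vector distance; which distance is appropriate is a separate design choice.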
Sarpeshkar, R
2014-03-28
We analyse the pros and cons of analog versus digital computation in living cells. Our analysis is based on fundamental laws of noise in gene and protein expression, which set limits on the energy, time, space, molecular count and part-count resources needed to compute at a given level of precision. We conclude that analog computation is significantly more efficient in its use of resources than deterministic digital computation even at relatively high levels of precision in the cell. Based on this analysis, we conclude that synthetic biology must use analog, collective analog, probabilistic and hybrid analog-digital computational approaches; otherwise, even relatively simple synthetic computations in cells such as addition will exceed energy and molecular-count budgets. We present schematics for efficiently representing analog DNA-protein computation in cells. Analog electronic flow in subthreshold transistors and analog molecular flux in chemical reactions obey Boltzmann exponential laws of thermodynamics and are described by astoundingly similar logarithmic electrochemical potentials. Therefore, cytomorphic circuits can help to map circuit designs between electronic and biochemical domains. We review recent work that uses positive-feedback linearization circuits to architect wide-dynamic-range logarithmic analog computation in Escherichia coli using three transcription factors, nearly two orders of magnitude more efficient in parts than prior digital implementations.
Computational model of chromosome aberration yield induced by high- and low-LET radiation exposures.
Ponomarev, Artem L; George, Kerry; Cucinotta, Francis A
2012-06-01
We present a computational model for calculating the yield of radiation-induced chromosomal aberrations in human cells based on a stochastic Monte Carlo approach and calibrated using the relative frequencies and distributions of chromosomal aberrations reported in the literature. A previously developed DNA-fragmentation model for high- and low-LET radiation called the NASARadiationTrackImage model was enhanced to simulate a stochastic process of the formation of chromosomal aberrations from DNA fragments. The current version of the model gives predictions of the yields and sizes of translocations, dicentrics, rings, and more complex-type aberrations formed in the G(0)/G(1) cell cycle phase during the first cell division after irradiation. As the model can predict smaller-sized deletions and rings (<3 Mbp) that are below the resolution limits of current cytogenetic analysis techniques, we present predictions of hypothesized small deletions that may be produced as a byproduct of properly repaired DNA double-strand breaks (DSB) by nonhomologous end-joining. Additionally, the model was used to scale chromosomal exchanges in two or three chromosomes that were obtained from whole-chromosome FISH painting analysis techniques to whole-genome equivalent values.
Methodological approach to crime scene investigation: the dangers of technology
NASA Astrophysics Data System (ADS)
Barnett, Peter D.
1997-02-01
The visitor to any modern forensic science laboratory is confronted with equipment and processes that did not exist even 10 years ago: thermocyclers to allow genetic typing of nanogram amounts of DNA isolated from a few spermatozoa; scanning electron microscopes that can nearly automatically detect submicrometer sized particles of molten lead, barium and antimony produced by the discharge of a firearm and deposited on the hands of the shooter; and computers that can compare an image of a latent fingerprint with millions of fingerprints stored in the computer memory. Analysis of populations of physical evidence has permitted statistically minded forensic scientists to use Bayesian inference to draw conclusions based on a priori assumptions which are often poorly understood, irrelevant, or misleading. National commissions who are studying quality control in DNA analysis propose that people with barely relevant graduate degrees and little forensic science experience be placed in charge of forensic DNA laboratories. It is undeniable that high-tech has reversed some miscarriages of justice by establishing the innocence of a number of people who were imprisoned for years for crimes that they did not commit. However, this paper deals with the dangers of technology in criminal investigations.
Replication-associated mutational asymmetry in the human genome.
Chen, Chun-Long; Duquenne, Lauranne; Audit, Benjamin; Guilbaud, Guillaume; Rappailles, Aurélien; Baker, Antoine; Huvet, Maxime; d'Aubenton-Carafa, Yves; Hyrien, Olivier; Arneodo, Alain; Thermes, Claude
2011-08-01
During evolution, mutations occur at rates that can differ between the two DNA strands. In the human genome, nucleotide substitutions occur at different rates on the transcribed and non-transcribed strands that may result from transcription-coupled repair. These mutational asymmetries generate transcription-associated compositional skews. To date, the existence of such asymmetries associated with replication has not yet been established. Here, we compute the nucleotide substitution matrices around replication initiation zones identified as sharp peaks in replication timing profiles and associated with abrupt jumps in the compositional skew profile. We show that the substitution matrices computed in these regions fully explain the jumps in the compositional skew profile when crossing initiation zones. In intergenic regions, we observe mutational asymmetries measured as differences between complementary substitution rates; their sign changes when crossing initiation zones. These mutational asymmetries are unlikely to result from cryptic transcription but can be explained by a model based on replication errors and strand-biased repair. In transcribed regions, mutational asymmetries associated with replication superimpose on the previously described mutational asymmetries associated with transcription. We separate the substitution asymmetries associated with both mechanisms, which allows us to determine for the first time in eukaryotes, the mutational asymmetries associated with replication and to reevaluate those associated with transcription. Replication-associated mutational asymmetry may result from unequal rates of complementary base misincorporation by the DNA polymerases coupled with DNA mismatch repair (MMR) acting with different efficiencies on the leading and lagging strands. Replication, acting in germ line cells during long evolutionary times, contributed equally with transcription to produce the present abrupt jumps in the compositional skew. 
These results demonstrate that DNA replication is one of the major processes that shape human genome composition.
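To make the notion of a compositional skew profile concrete, here is a minimal sketch that computes TA and GC skews in windows along a sequence. The definitions used, S_TA = (T - A)/(T + A) and S_GC = (G - C)/(G + C), with the total skew taken as their sum, are a commonly used convention and are assumed here rather than taken from the abstract; the function names and window parameters are hypothetical.

```python
def skew(seq):
    """Return (S_TA, S_GC, S_TA + S_GC) for one stretch of sequence."""
    s = seq.upper()
    a, c, g, t = (s.count(base) for base in "ACGT")
    s_ta = (t - a) / (t + a) if (t + a) else 0.0  # 0.0 when no A or T present
    s_gc = (g - c) / (g + c) if (g + c) else 0.0  # 0.0 when no G or C present
    return s_ta, s_gc, s_ta + s_gc


def skew_profile(seq, window, step):
    """Total skew in windows of length `window`, advanced by `step` bases."""
    return [
        skew(seq[i:i + window])[2]
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # A toy sequence whose skew changes sign halfway, mimicking an abrupt jump.
    print(skew_profile("GGGTGGTT" + "CCCACCAA", window=8, step=8))
```

In a real analysis the windows would span kilobases and the profile would be read against replication timing data, which this toy example does not attempt.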
WE-DE-202-00: Connecting Radiation Physics with Computational Biology
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? 
We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or, if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicists’ perspective. Learning Objectives: (1) Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels. (2) Learn about mechanism of action underlying the induction, repair and biological processing of damage to DNA and other constituents. (3) Understand how effects and processes at one biological scale impact on biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)
Poltev, Valeri; Anisimov, Victor M; Danilov, Victor I; Garcia, Dolores; Sanchez, Carolina; Deriabina, Alexandra; Gonzalez, Eduardo; Rivas, Francisco; Polteva, Nina
2014-06-01
Our previous DFT computations of deoxydinucleoside monophosphate complexes with Na(+)-ions (dDMPs) have demonstrated that the main characteristics of Watson-Crick (WC) right-handed duplex families are predefined in the local energy minima of dDMPs. In this work, we study the mechanisms of contribution of chemically monotonous sugar-phosphate backbone and the bases into the double helix irregularity. Geometry optimization of sugar-phosphate backbone produces energy minima matching the WC DNA conformations. Studying the conformational variability of dDMPs in response to sequence permutation, we found that simple replacement of bases in the previously fully optimized dDMPs, e.g. by constructing Pyr-Pur from Pur-Pyr, and Pur-Pyr from Pyr-Pur sequences, while retaining the backbone geometry, automatically produces the mutual base position characteristic of the target sequence. Based on that, we infer that the directionality and the preferable regions of the sugar-phosphate torsions, combined with the difference of purines from pyrimidines in ring shape, determines the sequence dependence of the structure of WC DNA. No such sequence dependence exists in dDMPs corresponding to other DNA conformations (e.g., Z-family and Hoogsteen duplexes). Unlike other duplexes, WC helix is unique by its ability to match the local energy minima of the free single strand to the preferable conformations of the duplex. Copyright © 2013 Wiley Periodicals, Inc.
Ponomarev, Artem L; Costes, Sylvain V; Cucinotta, Francis A
2008-11-01
We computed probabilities to have multiple double-strand breaks (DSB), which are produced in DNA on a regional scale, and not in close vicinity, in volumes matching the size of DNA damage foci, of a large chromatin loop, and in the physical volume of DNA containing the HPRT (human hypoxanthine phosphoribosyltransferase) locus. The model is based on a Monte Carlo description of DSB formation by heavy ions in the spatial context of the entire human genome contained within the cell nucleus, as well as at the gene sequence level. We showed that a finite physical volume corresponding to a visible DNA repair focus, believed to be associated with one DSB, can contain multiple DSB due to heavy ion track structure and the DNA supercoiled topography. A corrective distribution was introduced, which was a conditional probability to have excess DSB in a focus volume, given that there was already one present. The corrective distribution was calculated for 19.5 MeV/amu N ions, 3.77 MeV/amu alpha-particles, 1000 MeV/amu Fe ions, and X-rays. The corrected initial DSB yield from the experimental data on DNA repair foci was calculated. The DSB yield based on the corrective function converts the focus yield into the DSB yield, which is comparable with the DSB yield based on the earlier PFGE experiments. The distribution of DSB within the physical limits of the HPRT gene was analyzed by a similar method as well. This corrective procedure shows the applicability of the model and empowers the researcher with a tool to better analyze focus statistics. The model enables researchers to analyze the DSB yield based on focus statistics in real experimental situations that lack one-to-one focus-to-DSB correspondence.
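The idea of a corrective distribution, a conditional probability of excess DSB in a focus volume given that one is already present, can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the number of DSB in a focus volume is Poisson distributed with mean `lam`; this is a strong simplification and not the Monte Carlo track-structure model of the abstract, which exists precisely because heavy-ion damage is not Poisson. The function names are hypothetical.

```python
import math


def excess_dsb_distribution(lam, max_extra=10):
    """P(j extra DSB | at least one DSB) for j = 0..max_extra.

    Toy assumption: DSB count per focus volume is Poisson with mean lam > 0.
    """
    p_nonzero = -math.expm1(-lam)  # 1 - exp(-lam), computed stably
    return [
        math.exp(-lam) * lam ** (j + 1) / math.factorial(j + 1) / p_nonzero
        for j in range(max_extra + 1)
    ]


def dsb_per_focus(lam):
    """Mean number of DSB in a focus that contains at least one DSB."""
    return lam / -math.expm1(-lam)


if __name__ == "__main__":
    lam = 0.8  # hypothetical mean DSB count per focus volume
    print(excess_dsb_distribution(lam, max_extra=3))
    # Under this toy model, a focus yield would be converted to a DSB yield
    # by multiplying it by the factor below.
    print(dsb_per_focus(lam))
```

When `lam` is small the factor approaches 1 (one focus, one DSB); it grows as the expected DSB density per focus volume increases.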
Chakraborty, Mohua; Dhar, Bishal; Ghosh, Sankar Kumar
2017-11-01
The DNA barcodes are generally interpreted using distance-based and character-based methods. The former uses clustering of comparable groups, based on the relative genetic distance, while the latter is based on the presence or absence of discrete nucleotide substitutions. The distance-based approach has a limitation in defining a universal species boundary across the taxa as the rate of mtDNA evolution is not constant throughout the taxa. However, character-based approach more accurately defines this using a unique set of nucleotide characters. The character-based analysis of full-length barcode has some inherent limitations, like sequencing of the full-length barcode, use of a sparse-data matrix and lack of a uniform diagnostic position for each group. A short continuous stretch of a fragment can be used to resolve the limitations. Here, we observe that a 154-bp fragment, from the transversion-rich domain of 1367 COI barcode sequences can successfully delimit species in the three most diverse orders of freshwater fishes. This fragment is used to design species-specific barcode motifs for 109 species by the character-based method, which successfully identifies the correct species using a pattern-matching program. The motifs also correctly identify geographically isolated population of the Cypriniformes species. Further, this region is validated as a species-specific mini-barcode for freshwater fishes by successful PCR amplification and sequencing of the motif (154 bp) using the designed primers. We anticipate that use of such motifs will enhance the diagnostic power of DNA barcode, and the mini-barcode approach will greatly benefit the field-based system of rapid species identification. © 2017 John Wiley & Sons Ltd.
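The character-based approach described above, where a species is recognised by a set of diagnostic nucleotides at fixed positions of a short fragment, can be sketched as a simple pattern match. The motif table, species names, and positions below are invented placeholders, not the motifs designed in the study, and `identify` is a hypothetical helper rather than the authors' pattern-matching program.

```python
# Hypothetical motifs: {species: {0-based position in fragment: diagnostic base}}.
MOTIFS = {
    "species_A": {3: "G", 10: "T", 17: "C"},
    "species_B": {3: "A", 10: "T", 17: "C"},
}


def identify(fragment, motifs=MOTIFS):
    """Return the species whose diagnostic characters all match the fragment."""
    frag = fragment.upper()
    return [
        name
        for name, sites in motifs.items()
        if all(pos < len(frag) and frag[pos] == base
               for pos, base in sites.items())
    ]


if __name__ == "__main__":
    query = "ttcGaaaaaaTaaaaaaCaa"  # toy fragment, not real COI sequence
    print(identify(query))
```

An empty result means no motif matched, and more than one hit would indicate that the motif set is not diagnostic for the query.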
ERIC Educational Resources Information Center
Hacisalihoglu, Gokhan; Hilgert, Uwe; Nash, E. Bruce; Micklos, David A.
2008-01-01
Today's biology educators face the challenge of training their students in modern molecular biology techniques including genomics and bioinformatics. The Dolan DNA Learning Center (DNALC) of Cold Spring Harbor Laboratory has developed and disseminated a bench- and computer-based plant genomics curriculum for biology faculty. In 2007, a five-day…
Uncovering the polymerase-induced cytotoxicity of an oxidized nucleotide
Freudenthal, Bret D.; Beard, William A.; Perera, Lalith; ...
2014-11-17
Oxidative stress promotes genomic instability and human diseases. A common oxidized nucleoside is 8-oxo-7,8-dihydro-2’-deoxyguanosine found both in DNA (8-oxo-G) and as a free nucleotide (8-oxo-dGTP). Nucleotide pools are especially vulnerable to oxidative damage. Therefore cells encode an enzyme (MutT/MTH1) that removes free oxidized nucleotides. This cleansing function is required for cancer cell survival and to modulate E. coli antibiotic sensitivity in a DNA polymerase (pol)-dependent manner. How polymerase discriminates between damaged and non-damaged nucleotides is not well understood. This analysis is essential given the role of oxidized nucleotides in mutagenesis, cancer therapeutics, and bacterial antibiotics. Even with cellular sanitizing activities, nucleotide pools contain enough 8-oxo-dGTP to promote mutagenesis. This arises from the dual coding potential where 8-oxo-dGTP(anti) base pairs with cytosine (Cy) and 8-oxo-dGTP(syn) utilizes its Hoogsteen edge to base pair with adenine (Ad). Here we utilized time-lapse crystallography to follow 8-oxo-dGTP insertion opposite Ad or Cy with human DNA pol β, to reveal that insertion is accommodated in either the syn- or anti-conformation, respectively. For 8-oxo-dGTP(anti) insertion, a novel divalent metal relieves repulsive interactions between the adducted guanine base and the triphosphate of the oxidized nucleotide. With either templating base, hydrogen bonding interactions between the bases are lost as the enzyme reopens after catalysis, leading to a cytotoxic nicked DNA repair intermediate. Combining structural snapshots with kinetic and computational analysis reveals how 8-oxo-dGTP utilizes charge modulation during insertion that can lead to a blocked DNA repair intermediate.
Shuffle Optimizer: A Program to Optimize DNA Shuffling for Protein Engineering.
Milligan, John N; Garry, Daniel J
2017-01-01
DNA shuffling is a powerful tool to develop libraries of variants for protein engineering. Here, we present a protocol to use our freely available and easy-to-use computer program, Shuffle Optimizer. Shuffle Optimizer is written in the Python computer language and increases the nucleotide homology between two pieces of DNA desired to be shuffled together without changing the amino acid sequence. In addition we also include sections on optimal primer design for DNA shuffling and library construction, a small-volume ultrasonicator method to create sheared DNA, and finally a method to reassemble the sheared fragments and recover and clone the library. The Shuffle Optimizer program and these protocols will be useful to anyone desiring to perform any of the nucleotide homology-dependent shuffling methods.
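The core idea in the abstract above, raising nucleotide homology between two genes without changing the encoded protein, can be sketched with synonymous codon substitution. This is not the Shuffle Optimizer program itself: the sketch assumes two in-frame, equal-length, uppercase coding sequences, recodes only one of them, uses the standard genetic code, and ignores codon usage; `harmonize` and `translate` are hypothetical names.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame uppercase coding sequence to one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def harmonize(reference, target):
    """Recode `target` codon by codon to resemble `reference`.

    Each target codon is replaced by the synonymous codon sharing the most
    bases with the reference codon at the same position, so the protein
    encoded by `target` is unchanged.
    """
    if len(reference) != len(target) or len(target) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    out = []
    for i in range(0, len(target), 3):
        ref_codon, tgt_codon = reference[i:i + 3], target[i:i + 3]
        synonyms = [c for c, aa in CODON_TO_AA.items()
                    if aa == CODON_TO_AA[tgt_codon]]
        out.append(max(synonyms,
                       key=lambda c: sum(a == b for a, b in zip(c, ref_codon))))
    return "".join(out)


if __name__ == "__main__":
    ref, tgt = "ATGCTGAAA", "ATGTTAAGA"
    new = harmonize(ref, tgt)
    print(new, translate(new) == translate(tgt))
```

A fuller treatment would adjust both sequences and weigh codon usage in the expression host, which this one-sided greedy choice leaves out.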
The number of reduced alignments between two DNA sequences
2014-01-01
Background: In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results: We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions: A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification: Primary 92B05, 33C20, secondary 39A14, 65Q30. PMID:24684679
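For orientation, the number of total alignments can be computed with a well-known three-term recurrence: an alignment of sequences of lengths m and n ends with either a pair of aligned letters, a gap in the first sequence, or a gap in the second. The resulting counts are the Delannoy numbers. The sketch below uses this textbook counting convention as an assumption; it does not reproduce the paper's formulas, and the count of reduced alignments is not covered.

```python
from math import comb


def count_alignments(m, n):
    """Number of total alignments of sequences of lengths m and n (recurrence)."""
    table = [[1] * (n + 1) for _ in range(m + 1)]  # one alignment if either is empty
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # last column: gap in second, gap in first, or aligned pair
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


def count_alignments_closed(m, n):
    """Same count from the closed form sum_k C(m,k) * C(n,k) * 2^k."""
    return sum(comb(m, k) * comb(n, k) * 2 ** k for k in range(min(m, n) + 1))


if __name__ == "__main__":
    for size in (1, 2, 3, 10):
        print(size, count_alignments(size, size))
```

The rapid growth of these numbers is the reason exhaustive enumeration of alignments is impractical even for short sequences.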
Standard atomic volumes in double-stranded DNA and packing in protein–DNA interfaces
Nadassy, Katalin; Tomás-Oliveira, Isabel; Alberts, Ian; Janin, Joël; Wodak, Shoshana J.
2001-01-01
Standard volumes for atoms in double-stranded B-DNA are derived using high resolution crystal structures from the Nucleic Acid Database (NDB) and compared with corresponding values derived from crystal structures of small organic compounds in the Cambridge Structural Database (CSD). Two different methods are used to compute these volumes: the classical Voronoi method, which does not depend on the size of atoms, and the related Radical Planes method which does. Results show that atomic groups buried in the interior of double-stranded DNA are, on average, more tightly packed than in related small molecules in the CSD. The packing efficiency of DNA atoms at the interfaces of 25 high resolution protein–DNA complexes is determined by computing the ratios between the volumes of interfacial DNA atoms and the corresponding standard volumes. These ratios are found to be close to unity, indicating that the DNA atoms at protein–DNA interfaces are as closely packed as in crystals of B-DNA. Analogous volume ratios, computed for buried protein atoms, are also near unity, confirming our earlier conclusions that the packing efficiency of these atoms is similar to that in the protein interior. In addition, we examine the number, volume and solvent occupation of cavities located at the protein–DNA interfaces and compared them with those in the protein interior. Cavities are found to be ubiquitous in the interfaces as well as inside the protein moieties. The frequency of solvent occupation of cavities is however higher in the interfaces, indicating that those are more hydrated than protein interiors. Lastly, we compare our results with those obtained using two different measures of shape complementarity of the analysed interfaces, and find that the correlation between our volume ratios and these measures, as well as between the measures themselves, is weak. 
Our results indicate that a tightly packed environment made up of DNA, protein and solvent atoms plays a significant role in protein–DNA recognition. PMID:11504874
New Trends of Digital Data Storage in DNA
2016-01-01
With the exponential growth in the capacity of information generated and the emerging need for data to be stored for a prolonged period of time, there emerges a need for a storage medium with high capacity, high storage density, and possibility to withstand extreme environmental conditions. DNA emerges as the prospective medium for data storage with its striking features. Diverse encoding models for reading and writing data onto DNA, codes for encrypting data which addresses issues of error generation, and approaches for developing codons and storage styles have been developed over the recent past. DNA has been identified as a potential medium for secret writing, which achieves the way towards DNA cryptography and steganography. DNA utilized as an organic memory device along with big data storage and analytics in DNA has paved the way towards DNA computing for solving computational problems. This paper critically analyzes the various methods used for encoding and encrypting data onto DNA while identifying the advantages and capability of every scheme to overcome the drawbacks identified previously. Cryptography and steganography techniques have been analyzed in a critical approach while identifying the limitations of each method. This paper also identifies the advantages and limitations of DNA as a memory device and memory applications. PMID:27689089
New Trends of Digital Data Storage in DNA.
De Silva, Pavani Yashodha; Ganegoda, Gamage Upeksha
With the exponential growth in the capacity of information generated and the emerging need for data to be stored for prolonged period of time, there emerges a need for a storage medium with high capacity, high storage density, and possibility to withstand extreme environmental conditions. DNA emerges as the prospective medium for data storage with its striking features. Diverse encoding models for reading and writing data onto DNA, codes for encrypting data which addresses issues of error generation, and approaches for developing codons and storage styles have been developed over the recent past. DNA has been identified as a potential medium for secret writing, which achieves the way towards DNA cryptography and stenography. DNA utilized as an organic memory device along with big data storage and analytics in DNA has paved the way towards DNA computing for solving computational problems. This paper critically analyzes the various methods used for encoding and encrypting data onto DNA while identifying the advantages and capability of every scheme to overcome the drawbacks identified priorly. Cryptography and stenography techniques have been analyzed in a critical approach while identifying the limitations of each method. This paper also identifies the advantages and limitations of DNA as a memory device and memory applications.
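As a rough illustration of what "encoding data onto DNA" can mean in its simplest form, the sketch below packs two bits into each nucleotide. The base order `ACGT` and the function names `encode` and `decode` are assumptions made for this example; the mapping is a generic textbook one and is not claimed to be any of the schemes reviewed in the paper, which also have to deal with error generation and sequence constraints.

```python
# Illustrative assumption: 00 -> A, 01 -> C, 10 -> G, 11 -> T (two bits per base).
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bit pair first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Inverse of encode(); expects a strand whose length is a multiple of 4."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

A round trip such as `decode(encode(b"DNA"))` returns the original bytes. A practical scheme would add redundancy and avoid long runs of the same base, which this sketch does not attempt.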
Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan
2012-01-01
Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause significant public health problem and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes, revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequences of S. erinaceieuropaei from China that provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464
From molecular biology to nanotechnology and nanomedicine.
Bogunia-Kubik, Katarzyna; Sugisaka, Masanori
2002-01-01
Great progress in the development of molecular biology techniques has been seen since the discovery of the structure of deoxyribonucleic acid (DNA) and the implementation of a polymerase chain reaction (PCR) method. This started a new era of research on the structure of nucleic acids molecules, the development of new analytical tools, and DNA-based analyses. The latter included not only diagnostic procedures but also, for example, DNA-based computational approaches. On the other hand, people have started to be more interested in mimicking real life, and modeling the structures and organisms that already exist in nature for the further evaluation and insight into their behavior and evolution. These factors, among others, have led to the description of artificial organelles or cells, and the construction of nanoscale devices. These nanomachines and nanoobjects might soon find a practical implementation, especially in the field of medical research and diagnostics. The paper presents some examples, illustrating the progress in multidisciplinary research in the nanoscale area. It is focused especially on immunogenetics-related aspects and the wide usage of DNA molecules in various fields of science. In addition, some proposals for nanoparticles and nanoscale tools and their applications in medicine are reviewed and discussed.
WE-DE-202-01: Connecting Nanoscale Physics to Initial DNA Damage Through Track Structure Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuemann, J.
Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? 
We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicists’ perspective. Learning Objectives: (1) Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels; (2) Learn about the mechanisms of action underlying the induction, repair and biological processing of damage to DNA and other constituents; (3) Understand how effects and processes at one biological scale impact biological processes and outcomes on other scales. J. Schuemann: NCI/NIH grants; S. McMahon: funding from European Commission FP7 (grant EC FP7 MC-IOF-623630).
Theoretical studies of protein-protein and protein-DNA binding rates
NASA Astrophysics Data System (ADS)
Alsallaq, Ramzi A.
Proteins are folded chains of amino acids. Some of the amino acids (e.g. Lys, Arg, His, Asp, and Glu) carry charges under physiological conditions. Proteins almost always function through binding to other proteins or ligands. For example, barnase is a ribonuclease found in the bacterium Bacillus amyloliquefaciens that degrades RNA by hydrolysis. To inhibit the potentially lethal action of barnase within its own cell, the bacterium co-produces another protein, barstar, which binds quickly and tightly to barnase. The biological function of this binding is to block the active site of barnase. The speeds (rates) at which proteins associate are vital to many biological processes. They span a wide range (from less than 10³ to 10⁸ M⁻¹ s⁻¹). Rates greater than ~10⁶ M⁻¹ s⁻¹ are typically found to be manifestations of enhancements by long-range electrostatic interactions between the associating proteins. A different paradigm appears in the case of protein binding to DNA. The rate in this case is enhanced through an attractive surface potential that effectively reduces the dimensionality of the available search space for the diffusing protein. This thesis presents computational and theoretical models of the rate of association of ligands/proteins to other proteins or DNA. For protein-protein association we present a general strategy for computing rates of association. The main achievements of this strategy are the ability to obtain a stringent reaction criterion based on the landscape of short-range interactions between the associating proteins, and the ability to compute the effect of the electrostatic interactions on the rates of association accurately using the best solvers for the Poisson-Boltzmann equation presently available. For protein-DNA association we present a mathematical model for proteins targeting specific sites on a circular DNA topology. 
The main achievements are the realization that a linear DNA with reflecting ends and a specific site in the middle of the chain is kinetically indistinguishable from its circularized topology, and the ability to predict the effect of dissociation via the ends of linear DNA on the rate of association, which is to reduce the rate.
Ron, Gil; Globerson, Yuval; Moran, Dror; Kaplan, Tommy
2017-12-21
Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As the Hi-C data accumulate, computational methods were developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing the published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in human and mouse. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.
Entropic Profiler – detection of conservation in genomes using information theory
Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana
2009-01-01
Background In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under or over-representation segments are often associated with significant biological meaning. Findings The Entropic Profiler application here presented is a new tool designed to detect and extract under and over-represented DNA segments in genomes by using EP. It allows its computation in a very efficient way by recurring to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or resuming a previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion EP are directly related with the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
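The Chaos Game Representation mentioned above can be sketched in a few lines. The corner assignment used here (A, C, G, T at the four corners of the unit square) is one common convention and is an assumption of this example, as is the function name `cgr_points`; it is not taken from the Entropic Profiler source code.

```python
# Assumed corner layout of the unit square for the four nucleotides.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return one (x, y) point per base of seq.

    Starting from the centre of the square, each new point lies halfway
    between the previous point and the corner of the current base.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]  # raises KeyError for symbols other than ACGT
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points
```

Because every point encodes the suffix of the sequence read so far, motifs that occur often show up as densely populated sub-squares when the points are plotted.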
DNA Photo Lithography with Cinnamate-based Photo-Bio-Nano-Glue
NASA Astrophysics Data System (ADS)
Feng, Lang; Li, Minfeng; Romulus, Joy; Sha, Ruojie; Royer, John; Wu, Kun-Ta; Xu, Qin; Seeman, Nadrian; Weck, Marcus; Chaikin, Paul
2013-03-01
We present a technique to make patterned functional surfaces, using a cinnamate photo cross-linker and photolithography. We have designed and modified a complementary set of single DNA strands to incorporate a pair of opposing cinnamate molecules. On exposure to 360nm UV, the cinnamate makes a highly specific covalent bond permanently linking only the complementary strands containing the cinnamates. We have studied this specific and efficient crosslinking with cinnamate-containing DNA in solution and on particles. UV addressability allows us to pattern surfaces functionally. The entire surface is coated with a DNA sequence A incorporating cinnamate. DNA strands A'B with one end containing a complementary cinnamated sequence A' attached to another sequence B, are then hybridized to the surface. UV photolithography is used to bind the A'B strand in a specific pattern. The system is heated and the unbound DNA is washed away. The pattern is then observed by thermo-reversibly hybridizing either fluorescently dyed B' strands complementary to B, or colloids coated with B' strands. Our techniques can be used to reversibly and/or permanently bind, via DNA linkers, an assortment of molecules, proteins and nanostructures. Potential applications range from advanced self-assembly, such as templated self-replication schemes recently reported, to designed physical and chemical patterns, to high-resolution multi-functional DNA surfaces for genetic detection or DNA computing.
Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong
2015-01-01
Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HelaS3-ifnα4h cells (HeLaS3 cells treated with interferon for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors (IRF, IRF-2, IRF-9, IRF-1, IRF-3, and ICSBP) belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model remain consistent under different selection criteria for transcription factor binding sites. Our model demonstrates the potential to computationally identify the functional transcription factors in gene regulation.
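To make the idea of scanning a sequence with a position weight matrix concrete, here is a minimal sketch. The count matrix `COUNTS` is invented for illustration (it is not a TRANSFAC entry), and the pseudocount, the uniform background, and the function names are assumptions of this example rather than details of the authors' model.

```python
import math

# Hypothetical 4-column count matrix favouring the motif TGAC.
COUNTS = [
    {"A": 0, "C": 1, "G": 0, "T": 9},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert raw counts into log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_site(seq, pwm):
    """Slide the matrix along seq and return (start, score, window) of the best hit."""
    width = len(pwm)
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score, window)
    return best
```

In a pipeline like the one described, such a scan would plausibly be restricted to DNase I hypersensitive regions, and hits would be kept only above a score threshold; both steps are left out here.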
Carrier mobility in double-helix DNA and RNA: A quantum chemistry study with Marcus-Hush theory.
Wu, Tao; Sun, Lei; Shi, Qi; Deng, Kaiming; Deng, Weiqiao; Lu, Ruifeng
2016-12-21
Charge mobilities of six DNAs and RNAs have been computed using quantum chemistry calculations combined with Marcus-Hush theory. Based on this simulation model, we obtained quite reasonable results when compared with experiment, and the obtained charge mobility strongly depends on the molecular reorganization and electronic coupling. In addition, we find that hole mobilities are larger than electron mobilities in both DNAs and RNAs, and the hole mobility of 2L8I can reach 1.09 × 10⁻¹ cm² V⁻¹ s⁻¹, which could be applied in molecular wires. The findings also show that our theoretical model can be regarded as a promising candidate for screening DNA- and RNA-based molecular electronic devices.
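The dependence on reorganization energy and electronic coupling noted in the abstract can be illustrated with the standard high-temperature Marcus rate expression, k = (2π/ħ) V² (4πλk_BT)^(-1/2) exp(-(ΔG + λ)² / (4λk_BT)). The sketch below is a generic textbook form, not the authors' workflow. The one-dimensional nearest-neighbour hopping model and the Einstein relation used to turn a rate into a mobility are simplifying assumptions, and all function and parameter names are hypothetical.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s
KB_EV_K = 8.617333262e-5     # Boltzmann constant in eV/K


def marcus_rate(coupling_ev, reorg_ev, temperature_k=300.0, delta_g_ev=0.0):
    """Marcus hopping rate in 1/s for coupling V and reorganization energy lambda (eV)."""
    kt = KB_EV_K * temperature_k
    prefactor = (2 * math.pi / HBAR_EV_S) * coupling_ev ** 2
    prefactor /= math.sqrt(4 * math.pi * reorg_ev * kt)
    return prefactor * math.exp(-(delta_g_ev + reorg_ev) ** 2 / (4 * reorg_ev * kt))


def hopping_mobility_1d(rate_s, hop_distance_cm, temperature_k=300.0):
    """Mobility in cm^2 V^-1 s^-1 for symmetric 1D hopping (assumed model).

    With rate k to each neighbour at distance a, the diffusion coefficient is
    D = k * a**2, and the Einstein relation gives mu = e * D / (k_B * T).
    Expressing k_B * T in eV makes the elementary charge cancel.
    """
    diffusion = rate_s * hop_distance_cm ** 2
    return diffusion / (KB_EV_K * temperature_k)
```

For a self-exchange step (ΔG = 0) the rate grows with V² and falls exponentially with λ, which is the trend the abstract reports. A call such as `hopping_mobility_1d(marcus_rate(0.05, 0.4), 3.4e-8)` uses illustrative inputs only; the numbers are not values from the paper.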
Carrier mobility in double-helix DNA and RNA: A quantum chemistry study with Marcus-Hush theory
NASA Astrophysics Data System (ADS)
Wu, Tao; Sun, Lei; Shi, Qi; Deng, Kaiming; Deng, Weiqiao; Lu, Ruifeng
2016-12-01
Charge mobilities of six DNAs and RNAs have been computed using quantum chemistry calculations combined with Marcus-Hush theory. Based on this simulation model, we obtained quite reasonable results when compared with experiment, and the obtained charge mobility strongly depends on the molecular reorganization and electronic coupling. In addition, we find that hole mobilities are larger than electron mobilities in both DNAs and RNAs, and the hole mobility of 2L8I can reach 1.09 × 10⁻¹ cm² V⁻¹ s⁻¹, which could be applied in molecular wires. The findings also show that our theoretical model can be regarded as a promising candidate for screening DNA- and RNA-based molecular electronic devices.
Yang, Chunrong; Zou, Dan; Chen, Jianchi; Zhang, Linyan; Miao, Jiarong; Huang, Dan; Du, Yuanyuan; Yang, Shu; Yang, Qianfan; Tang, Yalin
2018-03-15
Plenty of molecular circuits with specific functions have been developed; however, logic units with reconfigurability, which could simplify the circuits and speed up information processing, are rarely reported. In this work, we designed a novel reconfigurable logic unit based on a DNA-templated, potassium-concentration-dependent, supramolecular assembly, which could respond to the input stimuli of H+ and K+. By inputting different concentrations of K+, the logic unit could implement three significant functions, including a half adder, a half subtractor, and a 2-to-4 decoder. Considering its reconfigurable ability and good performance, the novel prototypes developed here may serve as a promising proof of principle in molecular computers. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
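For readers less familiar with the three functions named above, their Boolean behaviour can be written out directly. The sketch treats the two inputs as abstract bits `a` and `b`; how those bits map onto H+ and K+ concentrations in the actual assembly is not specified here and is deliberately left out.

```python
def half_adder(a, b):
    """Return (sum, carry) for one-bit addition a + b."""
    return a ^ b, a & b


def half_subtractor(a, b):
    """Return (difference, borrow) for one-bit subtraction a - b."""
    return a ^ b, (1 - a) & b


def decoder_2to4(a, b):
    """Return a one-hot 4-tuple whose active output has index 2*a + b."""
    index = 2 * a + b
    return tuple(1 if i == index else 0 for i in range(4))
```

All three functions share the same two inputs and differ only in how the outputs are read, which is what makes a single reconfigurable unit attractive.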
Another expert system rule inference based on DNA molecule logic gates
NASA Astrophysics Data System (ADS)
Wąsiewicz, Piotr
2013-10-01
With the help of the silicon industry, microfluidic processors were invented that use nano membrane valves, pumps and microreactors. These so-called lab-on-a-chips, combined with molecular computing, create molecular-systems-on-a-chips. This work presents a new approach to the implementation of molecular inference systems. It requires the unique representation of signals by DNA molecules. The main part of this work includes the concept of logic gates based on typical genetic engineering reactions. The presented method allows for constructing logic gates with many inputs and for executing them with the same number of elementary operations, regardless of the number of input signals. Every microreactor of the lab-on-a-chip performs one unique operation on input molecules and can be connected by dataflow output-input connections to other ones.
Computational design and multiscale modeling of a nanoactuator using DNA actuation.
Hamdi, Mustapha
2009-12-02
Developments in the field of nanobiodevices coupling nanostructures and biological components are of great interest in medical nanorobotics. As the fundamentals of bio/non-bio interaction processes are still poorly understood in the design of these devices, design tools and multiscale dynamics modeling approaches are necessary at the fabrication pre-project stage. This paper proposes a new concept of optimized carbon nanotube based servomotor design for drug delivery and biomolecular transport applications. The design of an encapsulated DNA-multi-walled carbon nanotube actuator is prototyped using multiscale modeling. The system is parametrized by using a quantum level approach and characterized by using a molecular dynamics simulation. Based on the analysis of the simulation results, a servo nanoactuator using ionic current feedback is simulated and analyzed for application as a drug delivery carrier.
NASA Astrophysics Data System (ADS)
Mclaurin, Patrick M.; Privett, Austin J.; Stopera, Christopher; Grimes, Thomas V.; Perera, Ajith; Morales, Jorge A.
2015-02-01
Proton cancer therapy (PCT) utilises high-energy H+ projectiles to cure cancer. PCT healing arises from its DNA damage in cancerous cells, which is mostly inflicted by the products from PCT water radiolysis reactions. While clinically established, a complete microscopic understanding of PCT remains elusive. To help in the microscopic elucidation of PCT, Professor Öhrn's simplest-level electron nuclear dynamics (SLEND) method is herein applied to H+ + (H2O)3-4 and H+ + DNA-bases at ELab = 1.0 keV. These are two types of computationally feasible prototypes to study water radiolysis reactions and H+-induced DNA damage, respectively. SLEND is a time-dependent, variational, non-adiabatic and direct-dynamics method that adopts a nuclear classical-mechanics description and an electronic single-determinantal wavefunction. Additionally, our SLEND + effective-core-potential method is herein employed to simulate some computationally demanding PCT reactions. Due to these attributes, SLEND proves appropriate for the simulation of various types of PCT reactions accurately and feasibly. H+ + (H2O)3-4 simulations reveal two main processes: H+ projectile scattering and the simultaneous formation of H and OH fragments; the latter process is quantified through total integrals cross sections. H+ + DNA-base simulations reveal atoms and groups displacements, ring openings and base-to-proton electron transfers as predominant damage processes. The authors warmly dedicate this SLEND investigation in honour of Professor N. Yngve Öhrn on the occasion of his 80th birthday celebration during the 54th Sanibel Symposium in St. Simons' Island, Georgia, on February 16-21, 2014. Associate Professor Jorge A. Morales was a former chemistry PhD student under the mentorship of Professor Öhrn and Dr Ajith Perera took various quantum chemistry courses taught by Professor Öhrn during his chemistry PhD studies. 
Both Jorge and Ajith look back to those great times of their scientific formation under Yngve's guidance during the 1990s with a strong sense of gratitude toward him (and even with a sense of nostalgia). The authors are pleased to present to Professor Öhrn this birthday gift of fully mature SLEND developments that now venture to treat systems of biochemical interest.
Computational Nanoelectronics: Applications to DNA, Carbon Nanotubes and Nanotransistors
NASA Technical Reports Server (NTRS)
Anantram, M. P.; Svizhenko, Alexei; Govindan, T. R.; Walch, S.; Mehrez, H.
2003-01-01
The topics covered by the panels of this viewgraph presentation include phonon scattering, layered structures, DNA as a device, the influence of twist and rise in the DNA molecule, counter-ions, conductance versus length, and intrinsic resonant tunneling.
USDA-ARS's Scientific Manuscript database
A computer algorithm was created to inspect scanned images from DNA microarray slides developed to rapidly detect and genotype E. coli O157 virulent strains. The algorithm computes centroid locations for signal and background pixels in RGB space and defines a plane perpendicular to the line connect...
Making Ordered DNA and Protein Structures from Computer-Printed Transparency Film Cut-Outs
ERIC Educational Resources Information Center
Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo
2009-01-01
Instructions are given for building physical scale models of ordered structures of B-form DNA, protein [alpha]-helix, and parallel and antiparallel protein [beta]-pleated sheets made from colored computer printouts designed for transparency film sheets. Cut-outs from these sheets are easily assembled. Conventional color coding for atoms are used…
Red light improves spermatozoa motility and does not induce oxidative DNA damage
NASA Astrophysics Data System (ADS)
Preece, Daryl; Chow, Kay W.; Gomez-Godinez, Veronica; Gustafson, Kyle; Esener, Selin; Ravida, Nicole; Durrant, Barbara; Berns, Michael W.
2017-04-01
The ability to successfully fertilize ova relies upon the swimming ability of spermatozoa. Both in humans and in animals, sperm motility has been used as a metric for the viability of semen samples. Recently, several studies have examined the efficacy of low dosage red light exposure for cellular repair and increasing sperm motility. Of prime importance to the practical application of this technique is the absence of DNA damage caused by radiation exposure. In this study, we examine the effect of 633 nm coherent, red laser light on sperm motility using a novel wavelet-based algorithm that allows for direct measurement of curvilinear velocity under red light illumination. This new algorithm gives results comparable to the standard computer-assisted sperm analysis (CASA) system. We then assess the safety of red light treatment of sperm by analyzing, (1) the levels of double-strand breaks in the DNA, and (2) oxidative damage in the sperm DNA. The results demonstrate that for the parameters used there are insignificant differences in oxidative DNA damage as a result of irradiation.
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets
2013-01-01
Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
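Two of the constraints listed above, degenerate sites and matching of melting temperatures, can be illustrated with a short sketch. It expands IUPAC degenerate codes into concrete primers and estimates melting temperature with the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is a rough approximation suited only to short oligonucleotides. The function names are hypothetical and nothing here is claimed to reflect how PrimerDesign itself works.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def wallace_tm(primer):
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_range(primer):
    """Lowest and highest Wallace estimate over all variants of a degenerate primer."""
    temps = [wallace_tm(variant) for variant in expand_degenerate(primer)]
    return min(temps), max(temps)
```

Comparing `tm_range` for a forward and a reverse primer gives a quick sense of whether a degenerate pair can share one annealing temperature. Note that the number of variants grows multiplicatively with each degenerate site, so the expansion is practical only for primers with few such sites.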
The generative power of weighted one-sided and regular sticker systems
NASA Astrophysics Data System (ADS)
Siang, Gan Yee; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod
2014-06-01
Sticker systems were introduced in 1998 as a DNA computing model based on the recombination behavior of DNA molecules. The Watson-Crick complementarity principle of DNA molecules is used abstractly in sticker systems to perform computation. In this paper, the generative power of weighted one-sided sticker systems and weighted regular sticker systems is investigated. Moreover, the relationship of the families of languages generated by these two variants of sticker systems to the Chomsky hierarchy is also presented.
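The complementarity relation that sticker systems abstract from can be shown with a very small sketch. Strands are written aligned position by position, as in the formal model, so no reversal is applied. The function names `complement` and `can_stick` are assumptions of this example, and the weights studied in the paper are not modelled.

```python
# Watson-Crick pairing used as the complementarity relation.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the position-wise Watson-Crick complement of a strand."""
    return "".join(PAIR[base] for base in strand)


def can_stick(overhang, sticker):
    """True if sticker pairs with every symbol of a single-stranded overhang."""
    return len(overhang) == len(sticker) and complement(overhang) == sticker
```

In a sticker system, a computation extends an incomplete double strand by repeatedly attaching pieces for which a check of this kind succeeds, until no single-stranded overhang remains.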
Swigon, David; Coleman, Bernard D.; Olson, Wilma K.
2006-01-01
Repression of transcription of the Escherichia coli Lac operon by the Lac repressor (LacR) is accompanied by the simultaneous binding of LacR to two operators and the formation of a DNA loop. A recently developed theory of sequence-dependent DNA elasticity enables one to relate the fine structure of the LacR–DNA complex to a wide range of heretofore-unconnected experimental observations. Here, that theory is used to calculate the configuration and free energy of the DNA loop as a function of its length and base-pair sequence, its linking number, and the end conditions imposed by the LacR tetramer. The tetramer can assume two types of conformations. Whereas a rigid V-shaped structure is observed in the crystal, EM images show extended forms in which two dimer subunits are flexibly joined. Upon comparing our computed loop configurations with published experimental observations of permanganate sensitivities, DNase I cutting patterns, and loop stabilities, we conclude that linear DNA segments of short-to-medium chain length (50–180 bp) give rise to loops with the extended form of LacR and that loops formed within negatively supercoiled plasmids induce the V-shaped structure. PMID:16785444
Contribution of indirect effects to clustered damage in DNA irradiated with protons.
Pachnerová Brabcová, K; Štěpán, V; Karamitros, M; Karabín, M; Dostálek, P; Incerti, S; Davídková, M; Sihver, L
2015-09-01
Protons are the dominant particles both in galactic cosmic rays and in solar particle events and, furthermore, proton irradiation becomes increasingly used in tumour treatment. It is believed that complex DNA damage is the determining factor for the consequent cellular response to radiation. DNA plasmid pBR322 was irradiated at U120-M cyclotron with 30 MeV protons and treated with two Escherichia coli base excision repair enzymes. The yields of SSBs and DSBs were analysed using agarose gel electrophoresis. DNA has been irradiated in the presence of hydroxyl radical scavenger (coumarin-3-carboxylic acid) in order to distinguish between direct and indirect damage of the biological target. Pure scavenger solution was used as a probe for measurement of induced OH· radical yields. Experimental OH· radical yield kinetics was compared with predictions computed by two theoretical models-RADAMOL and Geant4-DNA. Both approaches use Geant4-DNA for description of physical stages of radiation action, and then each of them applies a distinct model for description of the pre-chemical and chemical stage. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Fluctuation of the electronic coupling in DNA: Multistate versus two-state model
NASA Astrophysics Data System (ADS)
Voityuk, Alexander A.
2007-05-01
The electronic coupling for hole transfer between guanine bases G in the DNA duplex (GT)₆GTG(TG)₆ is studied using a QM/MD approach. The coupling V is calculated for 10,000 snapshots within the two-state and multistate Generalized Mulliken-Hush models. We find that the two-state scheme considerably underestimates the rate of the hole transfer within the π stack. Moreover, the probability distributions computed with the two-state and multistate schemes are quite different. It has been found that large fluctuations of V², which are at least an order of magnitude larger than its average value, occur roughly every 1 ps.
π-π stacking tackled with density functional theory
Swart, Marcel; van der Wijst, Tushar; Fonseca Guerra, Célia
2007-01-01
Through comparison with ab initio reference data, we have evaluated the performance of various density functionals for describing π-π interactions as a function of the geometry between two stacked benzenes or benzene analogs, between two stacked DNA bases, and between two stacked Watson–Crick pairs. Our main purpose is to find a robust and computationally efficient density functional to be used specifically and only for describing π-π stacking interactions in DNA and other biological molecules in the framework of our recently developed QM/QM approach "QUILD". In line with previous studies, most standard density functionals recover, at best, only part of the favorable stacking interactions. An exception is the new KT1 functional, which correctly yields bound π-stacked structures. Surprisingly, a similarly good performance is achieved with the computationally very robust and efficient local density approximation (LDA). Furthermore, we show that classical electrostatic interactions determine the shape and depth of the π-π stacking potential energy surface. Figure: Additivity approximation for the π-π interaction between two stacked Watson–Crick base pairs in terms of pairwise interactions between individual bases. Electronic supplementary material: the online version of this article (doi:10.1007/s00894-007-0239-y) contains supplementary material, which is available to authorized users. PMID:17874150
Structural modeling and molecular simulation analysis of HvAP2/EREBP from barley.
Pandey, Bharati; Sharma, Pradeep; Tyagi, Chetna; Goyal, Sukriti; Grover, Abhinav; Sharma, Indu
2016-06-01
AP2/ERF transcription factors play a critical role in plant development and stress adaptation. This study reports the three-dimensional ab initio-based model of AP2/EREBP protein of barley and its interaction with DNA. Full-length coding sequence of HvAP2/EREBP gene isolated from two Indian barley cultivars, RD 2503 and RD 31, was used to model the protein. Of five protein models obtained, the one with lowest C-score was chosen for further analysis. The N- and C-terminal regions of HvAP2 protein were found to be highly disordered. The dynamic properties of AP2/EREBP and its interaction with DNA were investigated by molecular dynamics simulation. Analysis of trajectories from simulation yielded the equilibrated conformation between 2-10 ns for protein and 7-15 ns for protein-DNA complex. The relationship between DNA having a GCC box and the DNA-binding domain of HvAP2/EREBP was established by modeling an 11-base-pair-long nucleotide sequence and the HvAP2/EREBP protein using an ab initio method. Analysis of protein-DNA interaction showed that a β-sheet motif constituting amino acid residues THR105, ARG100, ARG93, and ARG83 seems to play an important role in stabilizing the complex, as these residues form strong hydrogen bond interactions with the DNA motif. Taken together, this study provides first-hand comprehensive information detailing structural conformation and interactions of HvAP2/EREBP proteins in barley. The study highlights the role of computational approaches for preliminary examination of unknown proteins in the absence of experimental information. It also provides molecular insight into protein-DNA binding for understanding and enhancing abiotic stress resistance for improving the water use efficiency in crop plants.
NASA Astrophysics Data System (ADS)
Seto, Donald
The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by "cheaper and faster" means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with "state-of-the-art" informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.
Efficient Mining of Interesting Patterns in Large Biological Sequences
Rashid, Md. Mamunur; Karim, Md. Rezaul; Jeong, Byeong-Soo
2012-01-01
Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interesting measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time. PMID:23105928
Efficient mining of interesting patterns in large biological sequences.
Rashid, Md Mamunur; Karim, Md Rezaul; Jeong, Byeong-Soo; Choi, Ho-Jin
2012-03-01
Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interesting measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time.
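The idea that a pattern is interesting when its observed support exceeds a prior expectation can be illustrated with a minimal sketch. The abstract does not define the proposed measure, so the ratio below is only an assumed stand-in: the expectation comes from a simple independent-base background model, and the function names are hypothetical.

```python
from collections import Counter

def count_occurrences(seq, pattern):
    """Count occurrences of pattern in seq, allowing overlaps."""
    k = len(pattern)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern)

def expected_count(seq, pattern):
    """Expected count if bases were independent draws at their observed frequencies.

    Assumption: an i.i.d. background model, not the paper's prior expectation.
    """
    n = len(seq)
    freq = Counter(seq)
    prob = 1.0
    for base in pattern:
        prob *= freq[base] / n
    return (n - len(pattern) + 1) * prob

def surprise_ratio(seq, pattern):
    """Observed support divided by expected support (hypothetical measure)."""
    expected = expected_count(seq, pattern)
    if expected == 0:
        return 0.0  # the pattern uses a base that never occurs in seq
    return count_occurrences(seq, pattern) / expected
```

A ratio well above 1 flags a pattern as over-represented relative to the background, even if its absolute count is small.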
A self-assembled nanoscale robotic arm controlled by electric fields
NASA Astrophysics Data System (ADS)
Kopperger, Enzo; List, Jonathan; Madhira, Sushi; Rothfischer, Florian; Lamb, Don C.; Simmel, Friedrich C.
2018-01-01
The use of dynamic, self-assembled DNA nanostructures in the context of nanorobotics requires fast and reliable actuation mechanisms. We therefore created a 55-nanometer–by–55-nanometer DNA-based molecular platform with an integrated robotic arm of length 25 nanometers, which can be extended to more than 400 nanometers and actuated with externally applied electrical fields. Precise, computer-controlled switching of the arm between arbitrary positions on the platform can be achieved within milliseconds, as demonstrated with single-pair Förster resonance energy transfer experiments and fluorescence microscopy. The arm can be used for electrically driven transport of molecules or nanoparticles over tens of nanometers, which is useful for the control of photonic and plasmonic processes. Application of piconewton forces by the robot arm is demonstrated in force-induced DNA duplex melting experiments.
Radiation breakage of DNA: a model based on random-walk chromatin structure
NASA Technical Reports Server (NTRS)
Ponomarev, A. L.; Sachs, R. K.
2001-01-01
Monte Carlo computer software, called DNAbreak, has recently been developed to analyze observed non-random clustering of DNA double strand breaks in chromatin after exposure to densely ionizing radiation. The software models coarse-grained configurations of chromatin and radiation tracks, small-scale details being suppressed in order to obtain statistical results for larger scales, up to the size of a whole chromosome. We here give an analytic counterpart of the numerical model, useful for benchmarks, for elucidating the numerical results, for analyzing the assumptions of a more general but less mechanistic "randomly-located-clusters" formalism, and, potentially, for speeding up the calculations. The equations characterize multi-track DNA fragment-size distributions in terms of one-track action; an important step in extrapolating high-dose laboratory results to the much lower doses of main interest in environmental or occupational risk estimation. The approach can utilize the experimental information on DNA fragment-size distributions to draw inferences about large-scale chromatin geometry during cell-cycle interphase.
Genetic circuit design automation.
Nielsen, Alec A K; Der, Bryan S; Shin, Jonghyeon; Vaidyanathan, Prashant; Paralanov, Vanya; Strychalski, Elizabeth A; Ross, David; Densmore, Douglas; Voigt, Christopher A
2016-04-01
Computation can be performed in living cells by DNA-encoded circuits that process sensory information and control biological functions. Their construction is time-intensive, requiring manual part assembly and balancing of regulator expression. We describe a design environment, Cello, in which a user writes Verilog code that is automatically transformed into a DNA sequence. Algorithms build a circuit diagram, assign and connect gates, and simulate performance. Reliable circuit design requires the insulation of gates from genetic context, so that they function identically when used in different circuits. We used Cello to design 60 circuits for Escherichia coli (880,000 base pairs of DNA), for which each DNA sequence was built as predicted by the software with no additional tuning. Of these, 45 circuits performed correctly in every output state (up to 10 regulators and 55 parts), and across all circuits 92% of the output states functioned as predicted. Design automation simplifies the incorporation of genetic circuits into biotechnology projects that require decision-making, control, sensing, or spatial organization. Copyright © 2016, American Association for the Advancement of Science.
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices
NASA Astrophysics Data System (ADS)
Yan, Hao; Labean, Thomas H.; Feng, Liping; Reif, John H.
2003-07-01
The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping.
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices.
Yan, Hao; LaBean, Thomas H; Feng, Liping; Reif, John H
2003-07-08
The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping.
DNA fragmentation and sperm head morphometry in cat epididymal spermatozoa.
Vernocchi, Valentina; Morselli, Maria Giorgia; Lange Consiglio, Anna; Faustini, Massimo; Luvoni, Gaia Cecilia
2014-10-15
Sperm DNA fragmentation is an important parameter to assess sperm quality and can be a putative fertility predictor. Because the sperm head consists almost entirely of DNA, subtle differences in sperm head morphometry might be related to DNA status. Several techniques are available to analyze sperm DNA fragmentation, but they are labor-intensive and require expensive instrumentations. Recently, a kit (Sperm-Halomax) based on the sperm chromatin dispersion test and developed for spermatozoa of different species, but not for cat spermatozoa, became commercially available. The first aim of the present study was to verify the suitability of Sperm-Halomax assay, specifically developed for canine semen, for the evaluation of DNA fragmentation of epididymal cat spermatozoa. For this purpose, DNA fragmentation indexes (DFIs) obtained with Sperm-Halomax and terminal deoxynucleotidyl transferase-mediated nick-end labeling (TUNEL) were compared. The second aim was to investigate whether a correlation between DNA status, sperm head morphology, and morphometry assessed by computer-assisted semen analysis exists in cat epididymal spermatozoa. No differences were observed in DFIs obtained with Sperm-Halomax and TUNEL. This result indicates that Sperm-Halomax assay provides a reliable evaluation of DNA fragmentation of epididymal feline spermatozoa. The DFI seems to be independent from all the measured variables of sperm head morphology and morphometry. Thus, the evaluation of the DNA status of spermatozoa could effectively contribute to the completion of the standard analysis of fresh or frozen semen used in assisted reproductive technologies. Copyright © 2014 Elsevier Inc. All rights reserved.
Wells, David B; Bhattacharya, Swati; Carr, Rogan; Maffeo, Christopher; Ho, Anthony; Comer, Jeffrey; Aksimentiev, Aleksei
2012-01-01
Molecular dynamics (MD) simulations have become a standard method for the rational design and interpretation of experimental studies of DNA translocation through nanopores. The MD method, however, offers a multitude of algorithms, parameters, and other protocol choices that can affect the accuracy of the resulting data as well as computational efficiency. In this chapter, we examine the most popular choices offered by the MD method, seeking an optimal set of parameters that enable the most computationally efficient and accurate simulations of DNA and ion transport through biological nanopores. In particular, we examine the influence of short-range cutoff, integration timestep and force field parameters on the temperature and concentration dependence of bulk ion conductivity, ion pairing, ion solvation energy, DNA structure, DNA-ion interactions, and the ionic current through a nanopore.
Kinetics of DSB rejoining and formation of simple chromosome exchange aberrations
NASA Technical Reports Server (NTRS)
Cucinotta, F. A.; Nikjoo, H.; O'Neill, P.; Goodhead, D. T.
2000-01-01
PURPOSE: To investigate the role of kinetics in the processing of DNA double strand breaks (DSB), and the formation of simple chromosome exchange aberrations following X-ray exposures to mammalian cells based on an enzymatic approach. METHODS: Using computer simulations based on a biochemical approach, rate-equations that describe the processing of DSB through the formation of a DNA-enzyme complex were formulated. A second model that allows for competition between two processing pathways was also formulated. The formation of simple exchange aberrations was modelled as misrepair during the recombination of single DSB with undamaged DNA. Non-linear coupled differential equations corresponding to biochemical pathways were solved numerically by fitting to experimental data. RESULTS: When mediated by a DSB repair enzyme complex, the processing of single DSB showed a complex behaviour that gives the appearance of fast and slow components of rejoining. This is due to the time-delay caused by the action time of enzymes in biomolecular reactions. It is shown that the kinetic- and dose-responses of simple chromosome exchange aberrations are well described by a recombination model of DSB interacting with undamaged DNA when aberration formation increases with linear dose-dependence. Competition between two or more recombination processes is shown to lead to the formation of simple exchange aberrations with a dose-dependence similar to that of a linear quadratic model. CONCLUSIONS: Using a minimal number of assumptions, the kinetics and dose response observed experimentally for DSB rejoining and the formation of simple chromosome exchange aberrations are shown to be consistent with kinetic models based on enzymatic reaction approaches. A non-linear dose response for simple exchange aberrations is possible in a model of recombination of DNA containing a DSB with undamaged DNA when two or more pathways compete for DSB repair.
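The rate-equation picture described above (a DSB binds a repair enzyme, forms a DNA-enzyme complex, and is then rejoined) can be sketched with a simple forward-Euler integration. This is an illustrative sketch only: the reaction scheme, rate constants, and function name are assumptions, and the abstract does not give the authors' equations or fitted values.

```python
def simulate_dsb_repair(dsb0, enzyme0, k_on, k_off, k_cat, t_end, dt=0.001):
    """Integrate DSB + E <-> C -> repaired + E with forward Euler.

    All rate constants are hypothetical placeholders, in arbitrary units.
    Returns (dsb, enzyme, complex, repaired) at time t_end.
    """
    dsb, enzyme = float(dsb0), float(enzyme0)
    bound, repaired = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = k_on * dsb * enzyme      # formation of the DNA-enzyme complex
        unbind = k_off * bound          # dissociation without repair
        rejoin = k_cat * bound          # rejoining, which releases the enzyme
        dsb += dt * (unbind - bind)
        enzyme += dt * (unbind + rejoin - bind)
        bound += dt * (bind - unbind - rejoin)
        repaired += dt * rejoin
    return dsb, enzyme, bound, repaired
```

When the enzyme is scarce relative to the breaks, rejoining in this scheme is limited by enzyme turnover, which is one way an apparent mix of fast and slow components can arise.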
A dictionary based informational genome analysis
2012-01-01
Background In the post-genomic era several methods of computational genomics are emerging to understand how the whole information is structured within genomes. Literature of last five years accounts for several alignment-free methods, arisen as alternative metrics for dissimilarity of biological sequences. Among the others, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying a methodology here outlined carried out some computations on genomic data. We computed informational indexes, built the genomic dictionaries with different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of vivid interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many investigation lines, namely exported in other contexts of computational genomics, as a basis for discrimination of genomic pathologies. PMID:22985068
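A genomic dictionary of k-mers and one informational index over it can be sketched in a few lines. The abstract does not specify which indexes the prototype computes, so the empirical Shannon entropy used here is an assumed example, and all function names are hypothetical.

```python
import math
from collections import Counter

def kmer_dictionary(genome, k):
    """Map every word of length k occurring in genome to its multiplicity."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def empirical_entropy(dictionary):
    """Shannon entropy (bits) of the empirical k-mer frequency distribution."""
    total = sum(dictionary.values())
    return -sum((c / total) * math.log2(c / total) for c in dictionary.values())

def repeated_words(dictionary):
    """Words occurring more than once, i.e. genomic repeats of length k."""
    return {word for word, count in dictionary.items() if count > 1}
```

Running these for several values of k gives dictionaries of different sizes together with their frequency distributions, which is the kind of whole-genome view the abstract describes.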
An evolution-based DNA-binding residue predictor using a dynamic query-driven learning scheme.
Chai, H; Zhang, J; Yang, G; Ma, Z
2016-11-15
DNA-binding proteins play a pivotal role in various biological activities. Identification of DNA-binding residues (DBRs) is of great importance for understanding the mechanism of gene regulation and chromatin remodeling. Most traditional computational methods construct their predictors on static non-redundant datasets. They exclude many homologous DNA-binding proteins so as to guarantee the generalization capability of their models. However, those ignored samples may provide useful clues when studying protein-DNA interactions, a possibility that has not received enough attention. In view of this, we propose a novel method, namely DQPred-DBR, to fill the gap of DBR predictions. First, a large-scale extensible sample pool was compiled. Second, evolution-based features in the form of a relative position specific score matrix and covariant evolutionary conservation descriptors were used to encode the feature space. Third, a dynamic query-driven learning scheme was designed to make more use of proteins with known structure and functions. In comparison with a traditional static model, the introduction of dynamic models clearly improved the prediction performance. Experimental results from the benchmark and independent datasets showed that DQPred-DBR had promising generalization capability. It was capable of producing decent predictions and outperformed many state-of-the-art methods. For the convenience of academic use, our proposed method was also implemented as a web server at .
Zhao, Ya-E; Wu, Li-Ping
2012-09-01
To confirm phylogenetic relationships in Demodex mites based on mitochondrial 16S rDNA partial sequences, mtDNA 16S partial sequences of ten isolates of three Demodex species from China were amplified, recombined, and sequenced and then analyzed with two Demodex folliculorum isolates from Spain. Lastly, genetic distance was computed, and a phylogenetic tree was reconstructed. MEGA 4.0 analysis showed high sequence identity among 16S rDNA partial sequences of three Demodex species, which were 95.85% in D. folliculorum, 98.53% in Demodex canis, and 99.71% in Demodex brevis. The divergence, genetic distance, and transition/transversions of the three Demodex species reached interspecies level, whereas there was no significant difference of the divergence (1.1%), genetic distance (0.011), and transition/transversions (3/1) of the two geographic D. folliculorum isolates (Spain and China). Phylogenetic trees reveal that the three Demodex species formed three separate branches of one clade, where D. folliculorum and D. canis gathered first, and then gathered with D. brevis. The two Spanish and five Chinese D. folliculorum isolates did not form sister clades. In conclusion, 16S mtDNA sequences are suitable for phylogenetic relationship analysis in low taxa (genus or species), but not for intraspecies determination of Demodex. The differentiation among the three Demodex species has reached interspecies level.
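The quantities reported above (genetic distance and transition/transversion counts) can be illustrated for a pair of aligned sequences. The sketch below computes the uncorrected p-distance, which is an assumption made for illustration: the abstract does not say which distance model was selected in MEGA 4.0, and the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def compare_aligned(seq1, seq2):
    """Return (p_distance, transitions, transversions) for two aligned sequences.

    Sites with a gap or an ambiguity code in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    p_distance = (transitions + transversions) / compared if compared else 0.0
    return p_distance, transitions, transversions
```

For example, `compare_aligned("ACGTACGT", "ACGTGCGA")` finds one transition (A to G) and one transversion (T to A) over eight compared sites.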
iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model
Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen
2011-01-01
DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating the features into the general form of pseudo amino acid composition that were extracted from protein sequences via the “grey model” and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding proteins or non-DNA binding proteins based on their amino acid sequences information alone. The overall success rate by iDNA-Prot was 83.96% that was obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in a same subset. In addition to achieving high success rate, the computational time for iDNA-Prot is remarkably shorter in comparison with the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at the web-site on http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results. PMID:21935457
Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.
Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y
2000-01-01
Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
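A genome-wide screen for SSR tracts with motifs of 1 to 6 nucleotides can be sketched with a backreference regular expression. This is a minimal sketch, not the screening program used in the study: the minimum copy number is an assumed threshold and the function name is hypothetical.

```python
import re

def find_ssrs(seq, max_motif=6, min_repeats=3):
    """Return (start, motif, copies) for tandem repeats of short motifs.

    min_repeats is an assumed threshold; the study's cut-offs are not given.
    The lazy quantifier makes the regex prefer the shortest repeating motif.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, copies))
    return tracts
```

Comparing `copies` at the same locus across strains is the kind of copy-number polymorphism the abstract reports for mononucleotide tracts.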
Transition between B-DNA and Z-DNA: free energy landscape for the B-Z junction propagation.
Lee, Juyong; Kim, Yang-Gyun; Kim, Kyeong Kyu; Seok, Chaok
2010-08-05
Canonical, right-handed B-DNA can be transformed into noncanonical, left-handed Z-DNA in vitro at high salt concentrations or in vivo under physiological conditions. The molecular mechanism of this drastic conformational transition is still unknown despite numerous studies. Inspired by the crystal structure of a B-Z junction and the previous zipper model, we show here, with the aid of molecular dynamics simulations, that a stepwise propagation of a B-Z junction is a highly probable pathway for the B-Z transition. In this paper, the movement of a B-Z junction by a two-base-pair step in a double-strand nonamer, [d(GpCpGpCpGpCpGpCpG)](2), is considered. Targeted molecular dynamics simulations and umbrella sampling for this transition resulted in a transition pathway with a free energy barrier of 13 kcal/mol. This barrier is much more favorable than those obtained from previous atomistic simulations that lead to concerted transitions of the whole strands. The free energy difference between B-DNA and Z-DNA evaluated from our simulation is 0.9 kcal/mol per dinucleotide unit, which is consistent with previous experiments. The current computation thus strongly supports the proposal that the B-Z transition involves a relatively fast extension of B-DNA or Z-DNA by sequential propagation of B-Z junctions once nucleation of junctions is established.
CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics
Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David
2013-01-01
Background Today, there are more than a hundred times as many sequenced prokaryotic genomes as were available in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. Results The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. Conclusion This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analysis without much training. PMID:23577086
MICA: desktop software for comprehensive searching of DNA databases
Stokes, William A; Glick, Benjamin S
2006-01-01
Background Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion MICA is suitable as a search engine for desktop DNA analysis software. PMID:17018144
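The general idea of k-mer indexing for exact-match search can be sketched as follows. This is only an illustration of the concept: it uses an in-memory dictionary rather than MICA's compact on-disk arrays, it handles nondegenerate queries only, and the function names are hypothetical.

```python
from collections import defaultdict

def build_index(seq, k):
    """Map each k-mer in seq to the sorted list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def search(seq, index, k, query):
    """Return start positions of exact matches of a nondegenerate query.

    The first k-mer of the query selects candidate positions from the index;
    each candidate is then verified against the full query.
    """
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    return [pos for pos in index.get(query[:k], []) if seq.startswith(query, pos)]
```

Because only the candidate list for one k-mer is consulted per query, a search touches a small part of the index, which mirrors the behaviour the abstract describes.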
Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction
Cruz-Cano, Raul; Chew, David S.H.; Kwok-Pui, Choi; Ming-Ying, Leung
2010-01-01
Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications. PMID:20729987
Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction.
Cruz-Cano, Raul; Chew, David S H; Kwok-Pui, Choi; Ming-Ying, Leung
2010-06-01
Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications.
Gray, A J; Beecher, D E; Olson, M V
1984-01-01
A stand-alone, interactive computer system has been developed that automates the analysis of ethidium bromide-stained agarose and acrylamide gels on which DNA restriction fragments have been separated by size. High-resolution digital images of the gels are obtained using a camera that contains a one-dimensional, 2048-pixel photodiode array that is mechanically translated through 2048 discrete steps in a direction perpendicular to the gel lanes. An automatic band-detection algorithm is used to establish the positions of the gel bands. A color-video graphics system, on which both the gel image and a variety of operator-controlled overlays are displayed, allows the operator to visualize and interact with critical stages of the analysis. The principal interactive steps involve defining the regions of the image that are to be analyzed and editing the results of the band-detection process. The system produces a machine-readable output file that contains the positions, intensities, and descriptive classifications of all the bands, as well as documentary information about the experiment. This file is normally further processed on a larger computer to obtain fragment-size assignments. PMID:6320097
Bonello, Nicolas; Sampson, James; Burn, John; Wilson, Ian J; McGrown, Gail; Margison, Geoff P; Thorncroft, Mary; Crossbie, Philip; Povey, Andrew C; Santibanez-Koref, Mauro; Walters, Kevin
2013-11-07
We exploit model-based Bayesian inference methodologies to analyse lung tumour-derived methylation data from a CpG island in the O6-methylguanine-DNA methyltransferase (MGMT) promoter. Interest is in modelling the changes in methylation patterns in a CpG island in the first exon of the promoter during lung tumour development. We propose four competing models of methylation state propagation based on two mechanisms. The first is the location-dependence mechanism in which the probability of a gain or loss of methylation at a CpG within the promoter depends upon its location in the CpG sequence. The second mechanism is that of neighbour-dependence in which gain or loss of methylation at a CpG depends upon the methylation status of the immediately preceding CpG. Our data comprise the methylation status at 12 CpGs near the 5' end of the CpG island in two lung tumour samples for both alleles of a nearby polymorphism. We use approximate Bayesian computation, a computationally intensive rejection-sampling algorithm, to infer model parameters and compare models without the need to evaluate the likelihood function. We compare the four proposed models using two criteria: the approximate Bayes factors and the distribution of the Euclidean distance between the summary statistics of the observed and simulated datasets. Our model-based analysis demonstrates compelling evidence for both location and neighbour dependence in the process of aberrant DNA methylation of this MGMT promoter CpG island in lung tumours. We find equivocal evidence to support the hypothesis that the methylation patterns of the two alleles evolve independently. © 2013 Published by Elsevier Ltd. All rights reserved.
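The rejection-sampling form of approximate Bayesian computation mentioned in this abstract can be sketched as follows. This is a toy, not the paper's methylation models: the single-parameter independent-site model, the uniform prior, the per-site frequency summaries, the tolerance value, and all function names are assumptions chosen only to show the accept/reject step based on Euclidean distance between summary statistics.

```python
import math
import random

def simulate(p, n_sites=12, n_reads=50, rng=random):
    """Toy model (assumed): each CpG in each read is methylated independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n_sites)]
            for _ in range(n_reads)]

def summary(data):
    """Summary statistics: methylation frequency at each site."""
    n = len(data)
    return [sum(read[j] for read in data) / n for j in range(len(data[0]))]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summaries lie within `tolerance` of the observed ones."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a uniform prior on [0, 1)
        s_sim = summary(simulate(p, len(observed[0]), len(observed), rng))
        if euclidean(s_obs, s_sim) <= tolerance:
            accepted.append(p)
    return accepted

rng = random.Random(0)
observed = simulate(0.3, rng=rng)  # synthetic "observed" data with known p
posterior = abc_rejection(observed, n_draws=2000, tolerance=0.4, rng=rng)
```

The accepted values approximate the posterior of `p` without any likelihood evaluation. Comparing models in the same spirit would compare acceptance rates across competing simulators, which this sketch does not attempt.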
Computational Micromodel for Epigenetic Mechanisms
Raghavan, Karthika; Ruskin, Heather J.; Perrin, Dimitri; Goasmat, Francois; Burns, John
2010-01-01
Characterization of the epigenetic profile of humans since the initial breakthrough on the human genome project has strongly established the key role of histone modifications and DNA methylation. These dynamic elements interact to determine the normal level of expression or methylation status of the constituent genes in the genome. Recently, considerable evidence has been put forward to demonstrate that environmental stress implicitly alters epigenetic patterns causing imbalance that can lead to cancer initiation. This chain of consequences has motivated attempts to computationally model the influence of histone modification and DNA methylation in gene expression and investigate their intrinsic interdependency. In this paper, we explore the relation between DNA methylation and transcription and characterize in detail the histone modifications for specific DNA methylation levels using a stochastic approach. PMID:21152421
A computational model to protect patient data from location-based re-identification.
Malin, Bradley
2007-07-01
Health care organizations must preserve a patient's anonymity when disclosing personal data. Traditionally, patient identity has been protected by stripping identifiers from sensitive data such as DNA. However, simple automated methods can re-identify patient data using public information. In this paper, we present a solution to prevent a threat to patient anonymity that arises when multiple health care organizations disclose data. In this setting, a patient's location visit pattern, or "trail", can re-identify seemingly anonymous DNA to patient identity. This threat exists because health care organizations (1) cannot prevent the disclosure of certain types of patient information and (2) do not know how to systematically avoid trail re-identification. In this paper, we develop and evaluate computational methods that health care organizations can apply to disclose patient-specific DNA records that are impregnable to trail re-identification. To prevent trail re-identification, we introduce a formal model called k-unlinkability, which enables health care administrators to specify different degrees of patient anonymity. Specifically, k-unlinkability is satisfied when the trail of each DNA record is linkable to no less than k identified records. We present several algorithms that enable health care organizations to coordinate their data disclosure, so that they can determine which DNA records can be shared without violating k-unlinkability. We evaluate the algorithms with the trails of patient populations derived from publicly available hospital discharge databases. Algorithm efficacy is evaluated using metrics based on real world applications, including the number of suppressed records and the number of organizations that disclose records. Our experiments indicate that it is unnecessary to suppress all patient records that initially violate k-unlinkability. Rather, only portions of the trails need to be suppressed. 
For example, if each hospital discloses 100% of its data on patients diagnosed with cystic fibrosis, then 48% of the DNA records are 5-unlinkable. A naïve solution would suppress the 52% of the DNA records that violate 5-unlinkability. However, by applying our protection algorithms, the hospitals can disclose 95% of the DNA records, all of which are 5-unlinkable. Similar findings hold for all populations studied. This research demonstrates that patient anonymity can be formally protected in shared databases. Our findings illustrate that significant quantities of patient-specific data can be disclosed with provable protection from trail re-identification. The configurability of our methods allows health care administrators to quantify the effects of different levels of privacy protection and formulate policy accordingly.
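The k-unlinkability condition stated in this abstract (each DNA record's trail must be linkable to no fewer than k identified records) can be checked with a short Python sketch. The linkage rule used here, that a DNA trail links to any identified trail containing it, is an assumption made for illustration; the record names, hospital labels, and function names are hypothetical, and this is not one of the paper's protection algorithms.

```python
def linkable(dna_trail, identified_trail):
    """Assumed rule: a DNA trail can link to an identified trail that contains it."""
    return dna_trail <= identified_trail

def violations(dna_trails, identified_trails, k):
    """Return ids of DNA records whose trail links to fewer than k identified records."""
    bad = []
    for rec_id, trail in dna_trails.items():
        matches = sum(1 for t in identified_trails.values() if linkable(trail, t))
        if matches < k:
            bad.append(rec_id)
    return bad

# Hypothetical trails: the set of hospitals at which each record appears.
identified = {
    "alice": frozenset({"H1", "H2"}),
    "bob": frozenset({"H1", "H2"}),
    "carol": frozenset({"H1", "H3"}),
}
dna = {
    "d1": frozenset({"H1"}),
    "d2": frozenset({"H1", "H2"}),
    "d3": frozenset({"H1", "H3"}),
}

before = violations(dna, identified, 2)

# Suppress only part of d3's trail (its H3 disclosure) instead of the whole record.
dna_partial = dict(dna, d3=frozenset({"H1"}))
after = violations(dna_partial, identified, 2)
```

In this toy case, withholding a single disclosure restores 2-unlinkability for `d3`, which mirrors the abstract's point that only portions of trails need to be suppressed.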
An end-to-end workflow for engineering of biological networks from high-level specifications.
Beal, Jacob; Weiss, Ron; Densmore, Douglas; Adler, Aaron; Appleton, Evan; Babb, Jonathan; Bhatia, Swapnil; Davidsohn, Noah; Haddock, Traci; Loyall, Joseph; Schantz, Richard; Vasilev, Viktor; Yaman, Fusun
2012-08-17
We present a workflow for the design and production of biological networks from high-level program specifications. The workflow is based on a sequence of intermediate models that incrementally translate high-level specifications into DNA samples that implement them. We identify algorithms for translating between adjacent models and implement them as a set of software tools, organized into a four-stage toolchain: Specification, Compilation, Part Assignment, and Assembly. The specification stage begins with a Boolean logic computation specified in the Proto programming language. The compilation stage uses a library of network motifs and cellular platforms, also specified in Proto, to transform the program into an optimized Abstract Genetic Regulatory Network (AGRN) that implements the programmed behavior. The part assignment stage assigns DNA parts to the AGRN, drawing the parts from a database for the target cellular platform, to create a DNA sequence implementing the AGRN. Finally, the assembly stage computes an optimized assembly plan to create the DNA sequence from available part samples, yielding a protocol for producing a sample of engineered plasmids with robotics assistance. Our workflow is the first to automate the production of biological networks from a high-level program specification. Furthermore, the workflow's modular design allows the same program to be realized on different cellular platforms simply by swapping workflow configurations. We validated our workflow by specifying a small-molecule sensor-reporter program and verifying the resulting plasmids in both HEK 293 mammalian cells and in E. coli bacterial cells.
Photoswitching of DNA Hybridization Using a Molecular Motor.
Lubbe, Anouk S; Liu, Qing; Smith, Sanne J; de Vries, Jan Willem; Kistemaker, Jos C M; de Vries, Alex H; Faustino, Ignacio; Meng, Zhuojun; Szymanski, Wiktor; Herrmann, Andreas; Feringa, Ben L
2018-04-18
Reversible control over the functionality of biological systems via external triggers may be used in future medicine to reduce the need for invasive procedures. Additionally, externally regulated biomacromolecules are now considered as particularly attractive tools in nanoscience and the design of smart materials, due to their highly programmable nature and complex functionality. Incorporation of photoswitches into biomolecules, such as peptides, antibiotics, and nucleic acids, has generated exciting results in the past few years. Molecular motors offer the potential for new and more precise methods of photoregulation, due to their multistate switching cycle, unidirectionality of rotation, and helicity inversion during the rotational steps. Aided by computational studies, we designed and synthesized a photoswitchable DNA hairpin, in which a molecular motor serves as the bridgehead unit. After it was determined that motor function was not affected by the rigid arms of the linker, solid-phase synthesis was employed to incorporate the motor into an 8-base-pair self-complementary DNA strand. With the photoswitchable bridgehead in place, hairpin formation was unimpaired, while the motor part of this advanced biohybrid system retains excellent photochemical properties. Rotation of the motor generates large changes in structure, and as a consequence the duplex stability of the oligonucleotide could be regulated by UV light irradiation. Additionally, Molecular Dynamics computations were employed to rationalize the observed behavior of the motor-DNA hybrid. The results presented herein establish molecular motors as powerful multistate switches for application in biological environments.
NASA Astrophysics Data System (ADS)
Meyer, Sam; Everaers, Ralf
2015-02-01
The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
Jalili, Seifollah; Karami, Leila
2012-03-01
The proline-rich homeodomain (PRH)-DNA complex consists of a protein with 60 residues and a 13-base-pair DNA. The PRH protein is a transcription factor that plays a key role in the regulation of gene expression. PRH is a significant member of the Q50 class of homeodomain proteins. The homeodomain section of PRH is essential for binding to DNA and mediates sequence-specific DNA binding. Three 20-ns molecular dynamics (MD) simulations (free protein, free DNA and protein-DNA complex) in explicit solvent water were performed to elucidate the intermolecular contacts in the PRH-DNA complex and the role of dynamics of water molecules forming water-mediated contacts. The simulation provides a detailed explanation of the trajectory of hydration water molecules. The simulations show that some water molecules in the protein-DNA interface exchange with bulk waters. The simulation identifies that most of the contacts consisted of direct interactions between the protein and DNA including specific and non-specific contacts, but several water-mediated polar contacts were also observed. The specific interaction between Gln50 and C18 and water-mediated hydrogen bond between Gln50 and T7 were found to be present during almost the entire time of the simulation. These results show good consistency with experimental and previous computational studies. Structural properties such as root-mean-square deviations (RMSD), root-mean-square fluctuations (RMSF) and secondary structure were also analyzed as a function of time. Analyses of the trajectories showed that the dynamic fluctuations of both the protein and the DNA were lowered by the complex formation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, R.
Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? 
We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicist's perspective. Learning Objectives: understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels; learn about the mechanisms of action underlying the induction, repair and biological processing of damage to DNA and other constituents; understand how effects and processes at one biological scale impact biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)
DOE Office of Scientific and Technical Information (OSTI.GOV)
McMahon, S.
Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? 
We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicist's perspective. Learning Objectives: understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels; learn about the mechanisms of action underlying the induction, repair and biological processing of damage to DNA and other constituents; understand how effects and processes at one biological scale impact biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)
Flavin Charge Transfer Transitions Assist DNA Photolyase Electron Transfer
NASA Astrophysics Data System (ADS)
Skourtis, Spiros S.; Prytkova, Tatiana; Beratan, David N.
2007-12-01
This contribution describes molecular dynamics, semi-empirical and ab-initio studies of the primary photo-induced electron transfer reaction in DNA photolyase. DNA photolyases are FADH--containing proteins that repair UV-damaged DNA by photo-induced electron transfer. A DNA photolyase recognizes and binds to cyclobutane pyrimidine dimer lesions of DNA. The protein repairs a bound lesion by transferring an electron to the lesion from FADH-, upon photo-excitation of FADH- with 350-450 nm light. We compute the lowest singlet excited states of FADH- in DNA photolyase using INDO/S configuration interaction, time-dependent density-functional, and time-dependent Hartree-Fock methods. The calculations identify the lowest singlet excited state of FADH- that is populated after photo-excitation and that acts as the electron donor. For this donor state we compute conformationally-averaged tunneling matrix elements to empty electron-acceptor states of a thymine dimer bound to photolyase. The conformational averaging involves different FADH--thymine dimer conformations obtained from molecular dynamics simulations of the solvated protein with a thymine dimer docked in its active site. The tunneling matrix element computations use INDO/S-level Green's function, energy splitting, and Generalized Mulliken-Hush methods. These calculations indicate that photo-excitation of FADH- causes a π→π* charge-transfer transition that shifts electron density to the side of the flavin isoalloxazine ring that is adjacent to the docked thymine dimer. This shift in electron density enhances the FADH--to-dimer electronic coupling, thus inducing rapid electron transfer.
A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.
Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan
2017-11-01
The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update
Afgan, Enis; Baker, Dannon; van den Beek, Marius; Blankenberg, Daniel; Bouvier, Dave; Čech, Martin; Chilton, John; Clements, Dave; Coraor, Nate; Eberhard, Carl; Grüning, Björn; Guerler, Aysam; Hillman-Jackson, Jennifer; Von Kuster, Greg; Rasche, Eric; Soranzo, Nicola; Turaga, Nitesh; Taylor, James; Nekrutenko, Anton; Goecks, Jeremy
2016-01-01
High-throughput data production technologies, particularly ‘next-generation’ DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report we highlight recently added features enabling biomedical analyses on a large scale. PMID:27137889
The Effect of Sulfur Substitution on the Excited-State Dynamics of DNA and RNA Base Derivatives
NASA Astrophysics Data System (ADS)
Pollum, Marvin; Crespo-Hernández, Carlos E.
2014-06-01
Substitution of oxygen by a sulfur atom in the natural DNA and RNA bases gives rise to a family of derivatives commonly known as the thiobases. Upon excitation with UV radiation, the natural bases are able to quickly and efficiently dissipate the imparted energy as heat to their surroundings. Thiobases, on the other hand, relax into a long-lived triplet excited state in quantum yields that approach unity. This finding has both fundamental and biological relevance because the triplet state plays a foremost role in the photochemistry of the thiobases; this is especially important in the current medicinal applications of thiobase derivatives. Using femtosecond transient absorption spectroscopy, we are able to uncover the ultrafast dynamics leading to the population of this reactive triplet state. In particular, I will present our results on how the site of sulfur substitution and the degree of substitution impact these dynamics and I will compare these experimental results to some recent computational work. Pinning down the excited-state dynamics of the thiobases is important to furthering the understanding of dynamics in natural DNA/RNA bases, as well as to the discovery of thiobase derivatives with desirable therapeutic properties. The authors acknowledge the CAREER program of the National Science Foundation (Grant No. CHE-1255084) for financial support.
Cristescu, Melania E
2014-10-01
DNA-based species identification, known as barcoding, transformed the traditional approach to the study of biodiversity science. The field is transitioning from barcoding individuals to metabarcoding communities. This revolution involves new sequencing technologies, bioinformatics pipelines, computational infrastructure, and experimental designs. In this dynamic genomics landscape, metabarcoding studies remain insular and biodiversity estimates depend on the particular methods used. In this opinion article, I discuss the need for a coordinated advancement of DNA-based species identification that integrates taxonomic and barcoding information. Such an approach would facilitate access to almost 3 centuries of taxonomic knowledge and 1 decade of building repository barcodes. Conservation projects are time sensitive, research funding is becoming restricted, and informed decisions depend on our ability to embrace integrative approaches to biodiversity science. Copyright © 2014 Elsevier Ltd. All rights reserved.
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances between the full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with computational performance increased by the 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
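The overall pipeline in this abstract (numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched in Python as follows. The abstract does not specify the 2D mapping or the even scaling algorithm, so both are replaced here by stand-ins that are assumptions: a complex-plane encoding of the four nucleotides and plain linear interpolation. The function names are hypothetical, and the naive O(n²) transform is only suitable for short sequences.

```python
import cmath
import math

# Assumed 2D encoding: each nucleotide is a point in the plane, stored as a
# complex number. Illustrative only; not the mapping used in the paper.
ENCODING = {"A": 1 + 0j, "T": -1 + 0j, "C": 0 + 1j, "G": 0 - 1j}

def power_spectrum(seq):
    """Naive O(n^2) DFT power spectrum of the encoded sequence."""
    x = [ENCODING[b] for b in seq]
    n = len(x)
    spectrum = []
    for k in range(n):
        total = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total) ** 2)
    return spectrum

def stretch(spectrum, length):
    """Stretch a spectrum to `length` points by linear interpolation
    (a stand-in for the paper's even scaling, which is not reproduced here)."""
    n = len(spectrum)
    if n == length:
        return list(spectrum)
    if n == 1:
        return [spectrum[0]] * length
    out = []
    for i in range(length):
        pos = i * (n - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(spectrum[lo] * (1 - frac) + spectrum[hi] * frac)
    return out

def dft_distance(seq_a, seq_b):
    """Euclidean distance between power spectra brought to a common length."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    m = max(len(pa), len(pb))
    pa, pb = stretch(pa, m), stretch(pb, m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

A matrix of such pairwise distances could then be passed to any hierarchical clustering routine to build a tree. Ambiguous nucleotide codes are not handled and would raise a `KeyError` in this sketch.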
Two- and three-input TALE-based AND logic computation in embryonic stem cells.
Lienert, Florian; Torella, Joseph P; Chen, Jan-Hung; Norsworthy, Michael; Richardson, Ryan R; Silver, Pamela A
2013-11-01
Biological computing circuits can enhance our ability to control cellular functions and have potential applications in tissue engineering and medical treatments. Transcriptional activator-like effectors (TALEs) represent attractive components of synthetic gene regulatory circuits, as they can be designed de novo to target a given DNA sequence. We here demonstrate that TALEs can perform Boolean logic computation in mammalian cells. Using a split-intein protein-splicing strategy, we show that a functional TALE can be reconstituted from two inactive parts, thus generating two-input AND logic computation. We further demonstrate three-piece intein splicing in mammalian cells and use it to perform three-input AND computation. Using methods for random as well as targeted insertion of these relatively large genetic circuits, we show that TALE-based logic circuits are functional when integrated into the genome of mouse embryonic stem cells. Comparing construct variants in the same genomic context, we modulated the strength of the TALE-responsive promoter to improve the output of these circuits. Our work establishes split TALEs as a tool for building logic computation with the potential of controlling expression of endogenous genes or transgenes in response to a combination of cellular signals.
Large-scale parallel genome assembler over cloud computing environment.
Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong
2017-06-01
The size of high-throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, both developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance need further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over a traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over a traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of a traditional HPC cluster.
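For readers unfamiliar with the data structure at the core of such assemblers, a single-machine sketch of de Bruijn graph construction is shown below. It only illustrates the graph itself; the function name `de_bruijn` is hypothetical, and nothing here reflects how GiGA distributes the work over Hadoop or Giraph.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

An assembler then looks for paths through this graph that use the edges, which is the step that benefits from large-scale graph processing.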
Tomov, Toma E; Tsukanov, Roman; Glick, Yair; Berger, Yaron; Liber, Miran; Avrahami, Dorit; Gerber, Doron; Nir, Eyal
2017-04-25
Realization of bioinspired molecular machines that can perform many and diverse operations in response to external chemical commands is a major goal in nanotechnology, but current molecular machines respond to only a few sequential commands. Lack of effective methods for introduction and removal of command compounds and low efficiencies of the reactions involved are major reasons for the limited performance. We introduce here a user interface based on a microfluidics device and single-molecule fluorescence spectroscopy that allows efficient introduction and removal of chemical commands and enables detailed study of the reaction mechanisms involved in the operation of synthetic molecular machines. The microfluidics provided 64 consecutive DNA strand commands to a DNA-based motor system immobilized inside the microfluidics, driving a bipedal walker to perform 32 steps on a DNA origami track. The microfluidics enabled removal of redundant strands, resulting in a 6-fold increase in processivity relative to an identical motor operated without strand removal and significantly more operations than previously reported for user-controlled DNA nanomachines. In the motor operated without strand removal, redundant strands interfere with motor operation and reduce its performance. The microfluidics also enabled computer control of motor direction and speed. Furthermore, analysis of the reaction kinetics and motor performance in the absence of redundant strands, made possible by the microfluidics, enabled accurate modeling of the walker processivity. This enabled identification of dynamic boundaries and provided an explanation, based on the "trap state" mechanism, for why the motor did not perform an even larger number of steps. This understanding is very important for the development of future motors with significantly improved performance. 
Our universal interface enables two-way communication between user and molecular machine and, relying on concepts similar to that of solid-phase synthesis, removes limitations on the number of external stimuli. This interface, therefore, is an important step toward realization of reliable, processive, reproducible, and useful externally controlled DNA nanomachines.
Arora, Sanjeevani; Huwe, Peter J.; Sikder, Rahmat; Shah, Manali; Browne, Amanda J.; Lesh, Randy; Nicolas, Emmanuelle; Deshpande, Sanat; Hall, Michael J.; Dunbrack, Roland L.; Golemis, Erica A.
2017-01-01
The cancer-predisposing Lynch Syndrome (LS) arises from germline mutations in DNA mismatch repair (MMR) genes, predominantly MLH1, MSH2, MSH6, and PMS2. A major challenge for clinical diagnosis of LS is the frequent identification of variants of uncertain significance (VUS) in these genes, as it is often difficult to determine variant pathogenicity, particularly for missense variants. Generic programs such as SIFT and PolyPhen-2, and MMR gene-specific programs such as PON-MMR and MAPP-MMR, are often used to predict deleterious or neutral effects of VUS in MMR genes. We evaluated the performance of multiple predictive programs in the context of functional biologic data for 15 VUS in MLH1, MSH2, and PMS2. Using cell line models, we characterized VUS predicted to range from neutral to pathogenic on mRNA and protein expression, basal cellular viability, viability following treatment with a panel of DNA-damaging agents, and functionality in DNA damage response (DDR) signaling, benchmarking to wild-type MMR proteins. Our results suggest that the MMR gene-specific classifiers do not always align with the experimental phenotypes related to DDR. Our study highlights the importance of complementary experimental and computational assessment to develop future predictors for the assessment of VUS. PMID:28494185
Developing DNA nanotechnology using single-molecule fluorescence.
Tsukanov, Roman; Tomov, Toma E; Liber, Miran; Berger, Yaron; Nir, Eyal
2014-06-17
CONSPECTUS: An important effort in the DNA nanotechnology field is focused on the rational design and manufacture of molecular structures and dynamic devices made of DNA. As is the case for other technologies that deal with manipulation of matter, rational development requires high quality and informative feedback on the building blocks and final products. For DNA nanotechnology such feedback is typically provided by gel electrophoresis, atomic force microscopy (AFM), and transmission electron microscopy (TEM). These analytical tools provide excellent structural information; however, usually they do not provide high-resolution dynamic information. For the development of DNA-made dynamic devices such as machines, motors, robots, and computers this constitutes a major problem. Bulk-fluorescence techniques are capable of providing dynamic information, but because only ensemble averaged information is obtained, the technique may not adequately describe the dynamics in the context of complex DNA devices. The single-molecule fluorescence (SMF) technique offers a unique combination of capabilities that make it an excellent tool for guiding the development of DNA-made devices. The technique has been increasingly used in DNA nanotechnology, especially for the analysis of structure, dynamics, integrity, and operation of DNA-made devices; however, its capabilities are not yet sufficiently familiar to the community. The purpose of this Account is to demonstrate how different SMF tools can be utilized for the development of DNA devices and for structural dynamic investigation of biomolecules in general and DNA molecules in particular. Single-molecule diffusion-based Förster resonance energy transfer and alternating laser excitation (sm-FRET/ALEX) and immobilization-based total internal reflection fluorescence (TIRF) techniques are briefly described and demonstrated. 
To illustrate the many applications of SMF to DNA nanotechnology, examples of SMF studies of DNA hairpins and Holliday junctions and of the interactions of DNA strands with DNA origami and origami-related devices such as a DNA bipedal motor are provided. These examples demonstrate how SMF can be utilized for measurement of distances and conformational distributions and equilibrium and nonequilibrium kinetics, to monitor structural integrity and operation of DNA devices, and for isolation and investigation of minor subpopulations including malfunctioning and nonreactive devices. Utilization of a flow-cell to achieve measurements of dynamics with increased time resolution and for convenient and efficient operation of DNA devices is discussed briefly. We conclude by summarizing the various benefits provided by SMF for the development of DNA nanotechnology and suggest that the method can significantly assist in the design and manufacture and evaluation of operation of DNA devices.
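The distance measurements mentioned above rest on the standard Förster relation between transfer efficiency and donor-acceptor separation, E = 1 / (1 + (r/R0)^6). A small sketch of that relation and its inverse follows; the function names and the Förster radius used in any call are illustrative assumptions, not values taken from the Account.

```python
def fret_efficiency(r_nm, r0_nm):
    """FRET efficiency for a donor-acceptor distance r and Forster radius R0."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def fret_distance(efficiency, r0_nm):
    """Invert the Forster relation to estimate distance from efficiency."""
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

Because of the sixth-power dependence, the efficiency changes most steeply near r = R0, which is why dye pairs are usually chosen so that the expected distances fall in that range.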
Argo_CUDA: Exhaustive GPU based approach for motif discovery in large DNA datasets.
Vishnevsky, Oleg V; Bocharnikov, Andrey V; Kolchanov, Nikolay A
2018-02-01
The development of chromatin immunoprecipitation sequencing (ChIP-seq) technology has revolutionized the genetic analysis of the basic mechanisms underlying transcription regulation and has led to the accumulation of information about a huge number of DNA sequences. Many web services are currently available for de novo motif discovery in datasets containing information about DNA/protein binding. The enormous diversity of motifs makes finding them challenging. To avoid these difficulties, researchers use different stochastic approaches. Unfortunately, the efficiency of motif discovery programs declines dramatically as the query set size increases. As a result, only a fraction of the top "peak" ChIP-Seq segments can be analyzed, or the area of analysis must be narrowed. Thus, motif discovery in massive datasets remains a challenging issue. The Argo_CUDA (Compute Unified Device Architecture) web service is designed to process massive DNA data. It is a program for the detection of degenerate oligonucleotide motifs of fixed length written in the 15-letter IUPAC code. Argo_CUDA is a fully exhaustive approach based on high-performance GPU technologies. Compared with the existing motif discovery web services, Argo_CUDA shows good prediction quality on simulated sets. The analysis of ChIP-Seq sequences revealed the motifs which correspond to known transcription factor binding sites.
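To make "degenerate motifs written in the 15-letter IUPAC code" concrete, the sketch below scans a sequence for positions where a fixed-length degenerate motif matches. It is a plain CPU illustration with a hypothetical function name, not Argo_CUDA's GPU search, and it covers only matching, not motif enumeration or scoring.

```python
# The 15 IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_hits(motif, seq):
    """Return start positions where the degenerate motif matches seq."""
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        if all(seq[i + j] in IUPAC[code] for j, code in enumerate(motif)):
            hits.append(i)
    return hits
```

An exhaustive approach evaluates every possible motif of the chosen length over this 15-letter alphabet, which is why the search space grows quickly and parallel hardware becomes attractive.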
Dynamic protein assembly by programmable DNA strand displacement.
Chen, Rebecca P; Blackstock, Daniel; Sun, Qing; Chen, Wilfred
2018-04-01
Inspired by the remarkable ability of natural protein switches to sense and respond to a wide range of environmental cues, here we report a strategy to engineer synthetic protein switches by using DNA strand displacement to dynamically organize proteins with highly diverse and complex logic gate architectures. We show that DNA strand displacement can be used to dynamically control the spatial proximity and the corresponding fluorescence resonance energy transfer between two fluorescent proteins. Performing Boolean logic operations enabled the explicit control of protein proximity using multi-input, reversible and amplification architectures. We further demonstrate the power of this technology beyond sensing by achieving dynamic control of an enzyme cascade. Finally, we establish the utility of the approach as a synthetic computing platform that drives the dynamic reconstitution of a split enzyme for targeted prodrug activation based on the sensing of cancer-specific miRNAs.
Hughes, James Alexander; Houghten, Sheridan; Ashlock, Daniel
2016-12-01
DNA fragment assembly - an NP-hard problem - is one of the major steps in DNA sequencing. Multiple strategies have been used for this problem, including greedy graph-based algorithms, de Bruijn graphs, and the overlap-layout-consensus approach. This study focuses on the overlap-layout-consensus approach. Heuristics and computational intelligence methods are combined to exploit their respective benefits. These algorithm combinations produced high-quality results, surpassing the best results obtained by a number of competitive algorithms specially designed and tuned for this problem on thirteen of sixteen popular benchmarks. This work also reinforces the necessity of using multiple search strategies, as it is clearly observed that algorithm performance depends on the problem instance; without a deeper look into many searches, top solutions could be missed entirely. Copyright © 2016. Published by Elsevier Ireland Ltd.
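In the overlap-layout-consensus setting, a candidate solution is typically an ordering of fragments, scored by how well consecutive fragments overlap. The sketch below shows one simple way to compute such a score using exact suffix-prefix overlaps; the function names and the exact-match scoring are illustrative assumptions, not the fitness function used in the study.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def layout_score(order, fragments):
    """Sum of overlaps between consecutive fragments in a proposed ordering."""
    return sum(
        overlap(fragments[i], fragments[j]) for i, j in zip(order, order[1:])
    )
```

A search heuristic would then explore permutations of the fragments to maximize `layout_score`, which is where the combinatorial difficulty of the problem comes from.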
Lattice-free prediction of three-dimensional structure of programmed DNA assemblies
Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R.; Yan, Hao; Bathe, Mark
2014-01-01
DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497
Computational solutions to large-scale data management and analysis
Schadt, Eric E.; Linderman, Michael D.; Sorenson, Jon; Lee, Lawrence; Nolan, Garry P.
2011-01-01
Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems. PMID:20717155
Ahmad, Muneer; Jung, Low Tan; Bhuiyan, Al-Amin
2017-10-01
Digital signal processing techniques commonly employ fixed-length window filters to process the signal contents. DNA signals differ in characteristics from common digital signals since they carry nucleotides as contents. The nucleotides carry genetic code context and exhibit fuzzy behavior due to their special structure and order in the DNA strand. Employing conventional fixed-length window filters for DNA signal processing produces spectral leakage and hence results in signal noise. A biological-context-aware adaptive window filter is required to process DNA signals. This paper introduces a biologically inspired fuzzy adaptive window median filter (FAWMF), which computes the fuzzy membership strength of nucleotides at each position of the sliding window and filters nucleotides based on median filtering with a combination of s-shaped and z-shaped filters. Since coding regions cause 3-base periodicity through an unbalanced nucleotide distribution that produces a relatively high bias in nucleotide usage, this fundamental characteristic of nucleotides is exploited in FAWMF to suppress the signal noise. Along with the adaptive response of FAWMF, a strong correlation between median nucleotides and the Π-shaped filter was observed, which produced enhanced discrimination between coding and non-coding regions compared with conventional fixed-length window filters. The proposed FAWMF improves coding-region identification by 40% to 125% compared with other conventional window filters, tested on more than 250 benchmarked and randomly selected DNA datasets of different organisms. This study shows that conventional fixed-length window filters applied to DNA signals do not achieve significant results since the nucleotides carry genetic code context. The proposed FAWMF algorithm is adaptive and significantly outperforms them in processing DNA signal contents. 
The algorithm, applied to a variety of DNA datasets, produced noteworthy discrimination between coding and non-coding regions compared with conventional fixed-window-length filters. Copyright © 2017 Elsevier B.V. All rights reserved.
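As background for the 3-base periodicity mentioned above, the sketch below shows the conventional fixed-window baseline that the abstract contrasts FAWMF with: the spectral power at frequency 1/3 of the four nucleotide indicator sequences inside a window. It is not the FAWMF algorithm, and the function names, window width and step are illustrative assumptions.

```python
import cmath
import math


def period3_power(window):
    """Spectral power at period 3, summed over the four indicator sequences."""
    total = 0.0
    for base in "ACGT":
        # Indicator sequence: 1 where the window holds this base, else 0.
        s = sum(
            cmath.exp(-2j * math.pi * i / 3)
            for i, b in enumerate(window)
            if b == base
        )
        total += abs(s) ** 2
    return total


def sliding_period3(seq, width, step=3):
    """Period-3 power for each fixed-length window along the sequence."""
    return [
        period3_power(seq[i:i + width])
        for i in range(0, len(seq) - width + 1, step)
    ]
```

Windows that fall in coding regions tend to give higher values than non-coding windows; the fixed `width` here is exactly the kind of parameter an adaptive filter aims to avoid.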
Uncovering the self-assembly of DNA nanostructures by thermodynamics and kinetics.
Wei, Xixi; Nangreave, Jeanette; Liu, Yan
2014-06-17
CONSPECTUS: DNA nanotechnology is one of the most flourishing interdisciplinary research fields. DNA nanostructures can be designed to self-assemble into a variety of periodic or aperiodic patterns of different shapes and length scales. They can be used as scaffolds for organizing other nanoparticles, proteins, and chemical groups, leveraging their functions for creating complex bioinspired materials that may serve as smart drug delivery systems, in vitro or in vivo biomolecular computing platforms, and diagnostic devices. Achieving optimal structural features, efficient assembly protocols, and precise functional group positioning and modification requires a thorough understanding of the thermodynamics and kinetics of the DNA nanostructure self-assembly process. The most common real-time measurement strategies include monitoring changes in UV absorbance based on the hyperchromic effect of DNA, and the emission signal changes of DNA intercalating dyes or covalently conjugated fluorescent dyes/pairs that accompany temperature dependent structural changes. Thermodynamic studies of a variety of DNA nanostructures have been performed, from simple double stranded DNA formation to more complex origami assembly. The key parameters that have been evaluated in terms of stability and cooperativity include the overall dimensions, the folding path of the scaffold, crossover and nick point arrangement, length and sequence of single strands, and salt and ion concentrations. DNA tile-tile interactions through sticky end hybridization have also been analyzed, and the steric inhibition and rigidity of tiles turn out to be important factors. Many kinetic studies have also been reported, and most are based on double stranded DNA formation. A two-state assumption and the hypothesis of several intermediate states have been applied to determine the rate constant and activation energy of the DNA hybridization process. 
A few simulated models were proposed to represent the structural, mechanical, and kinetic properties of DNA hybridization. The kinetics of strand displacement reactions has also been studied as a special case of DNA hybridization. The thermodynamic and kinetic characteristics of DNA nanostructures have been exploited to develop rapid and isothermal annealing protocols. It is conceivable that a more thorough understanding of the DNA assembly process could be used to guide the structural design process and optimize the conditions for assembly, manipulation, and functionalization, thus benefiting both upstream design and downstream applications.
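The link between rate constants and activation energy referred to above is usually made through the Arrhenius equation, k = A exp(-Ea / (R T)). A minimal sketch of extracting Ea from rate constants measured at two temperatures follows; the function names are hypothetical and any numbers passed in are placeholders, not data from the studies reviewed.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def arrhenius_rate(prefactor, ea_j_per_mol, temp_k):
    """Rate constant predicted by the Arrhenius equation."""
    return prefactor * math.exp(-ea_j_per_mol / (R * temp_k))


def activation_energy(k1, t1, k2, t2):
    """Activation energy from rate constants k1, k2 at temperatures t1, t2 (K)."""
    return R * math.log(k2 / k1) / (1.0 / t1 - 1.0 / t2)
```

With more than two temperatures, the same quantity is normally obtained from the slope of ln k against 1/T.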
Probing of miniPEGγ-PNA-DNA Hybrid Duplex Stability with AFM Force Spectroscopy.
Dutta, Samrat; Armitage, Bruce A; Lyubchenko, Yuri L
2016-03-15
Peptide nucleic acids (PNA) are synthetic polymers, the neutral peptide backbone of which provides elevated stability to PNA-PNA and PNA-DNA hybrid duplexes. It was demonstrated that incorporation of diethylene glycol (miniPEG) at the γ position of the peptide backbone increased the thermal stability of the hybrid duplexes (Sahu, B. et al. J. Org. Chem. 2011, 76, 5614-5627). Here, we applied atomic force microscopy (AFM) based single molecule force spectroscopy and dynamic force spectroscopy (DFS) to test the strength and stability of the hybrid 10 bp duplex. This hybrid duplex consisted of miniPEGγ-PNA and DNA of the same length (γ(MP)PNA-DNA), which we compared to a DNA duplex with a homologous sequence. AFM force spectroscopy data obtained at the same conditions showed that the γ(MP)PNA-DNA hybrid is more stable than the DNA counterpart, 65 ± 15 pN vs 47 ± 15 pN, respectively. The DFS measurements performed in a range of pulling speeds analyzed in the framework of the Bell-Evans approach yielded a dissociation constant, koff ≈ 0.030 ± 0.01 s⁻¹ for γ(MP)PNA-DNA hybrid duplex vs 0.375 ± 0.18 s⁻¹ for the DNA-DNA duplex suggesting that the hybrid duplex is much more stable. Correlating the high affinity of γ(MP)PNA-DNA to slow dissociation kinetics is consistent with prior bulk characterization by surface plasmon resonance. Given the growing interest in γ(MP)PNA as well as other synthetic DNA analogues, the use of single molecule experiments along with computational analysis of force spectroscopy data will provide direct characterization of various modifications as well as higher order structures such as triplexes and quadruplexes.
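In the Bell-Evans picture, the most probable rupture force grows linearly with the logarithm of the loading rate r: F* = (kBT / xβ) ln(r xβ / (koff kBT)). The sketch below fits that line by least squares to recover the off-rate koff and the barrier distance xβ. The function names, the thermal energy value and any data used with it are illustrative assumptions, not the analysis or measurements from the paper.

```python
import math

KBT = 4.11  # thermal energy in pN*nm near room temperature (assumed)


def bell_evans_force(rate, koff, x_beta):
    """Most probable rupture force (pN) at loading rate `rate` (pN/s)."""
    return (KBT / x_beta) * math.log(rate * x_beta / (koff * KBT))


def fit_bell_evans(rates, forces):
    """Least-squares line of force vs ln(rate); returns (koff, x_beta)."""
    xs = [math.log(r) for r in rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(forces) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, forces)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    x_beta = KBT / slope                       # slope = kBT / x_beta
    koff = math.exp(-intercept / slope) / slope
    return koff, x_beta
```

A smaller fitted koff means a longer-lived bond at zero force, which is the sense in which the abstract calls the hybrid duplex more stable.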
Tanabe, Akifumi S.; Toju, Hirokazu
2013-01-01
Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research. PMID:24204702
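The behavior of the 1-NN method described above can be seen in a toy version: the query simply inherits the taxon of its most similar reference. The sketch assumes a simple position-wise identity score in place of a BLAST search, and the function names and reference layout are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions, normalized by the longer sequence."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / longest


def one_nn(query, references):
    """Assign the taxon of the most similar reference.

    references: list of (taxon, sequence) pairs.
    """
    taxon, seq = max(references, key=lambda ref: identity(query, ref[1]))
    return taxon, identity(query, seq)
```

Note that `one_nn` always returns some taxon, however poor the best match is. That is the failure mode the abstract reports when the query's true species is missing from the reference database.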
Optimized smith waterman processor design for breast cancer early diagnosis
NASA Astrophysics Data System (ADS)
Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.
2017-09-01
This paper presents an optimized design of the Processing Element (PE) of a Systolic Array (SA) that implements the affine gap penalty Smith-Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources in order to increase the number of PEs in the FPGA for a higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA sequences associated with diseases such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area, with 15 slices per PE and 776 PEs implemented in the Virtex-6 FPGA.
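For reference, the recurrence that each processing element evaluates can be written in software as below. This is a plain sequential sketch of Smith-Waterman with affine gaps (the Gotoh formulation); the scoring values are arbitrary placeholders, and it says nothing about the hardware design itself. Here a gap of length L costs `gap_open + (L - 1) * gap_extend`.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gap penalties."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]        # best score ending at (i, j)
    E = [[neg_inf] * cols for _ in range(rows)]  # alignment ends with a gap in a
    F = [[neg_inf] * cols for _ in range(rows)]  # alignment ends with a gap in b
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals, so they can be computed in parallel. A systolic array exploits this by assigning one PE per column or query character, which is why fitting more PEs on the device raises throughput.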
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick
Genomics — the genetic mapping and DNA sequencing of sets of genes or the complete genomes of organisms, along with related genome analysis and database work — is emerging as one of the transformative sciences of the 21st century. But current bioinformatics tools are not accessible to most biological researchers. Now, a new computational and web-based tool called EDGE Bioinformatics is working to fulfill the promise of democratizing genomics.
Introduction to the Natural Anticipator and the Artificial Anticipator
NASA Astrophysics Data System (ADS)
Dubois, Daniel M.
2010-11-01
This short communication introduces the concept of an anticipator, that is, one who anticipates, in the framework of computing anticipatory systems. The definition of anticipation deals with the concept of program. Indeed, the word program comes from "pro-gram", meaning "to write before" by anticipation, and means a plan for the programming of a mechanism, or a sequence of coded instructions that can be inserted into a mechanism, or a sequence of coded instructions, such as genes or behavioural responses, that is part of an organism. Any natural or artificial programs are thus related to anticipatory rewriting systems, as shown in this paper. All the cells in the body, and the neurons in the brain, are programmed by the anticipatory genetic code, DNA, in a low-level language with four signs. The programs in computers are also computing anticipatory systems. It will be shown, on the one hand, that the genetic code DNA is a natural anticipator. As demonstrated by Nobel laureate McClintock [8], genomes are programmed. The fundamental program deals with the DNA genetic code. The properties of DNA consist of self-replication and self-modification. The self-replicating process leads to reproduction of the species, while the self-modifying process leads to new species or to evolution and adaptation in existing ones. The genetic code DNA keeps its instructions in memory in the DNA coding molecule. The genetic code DNA is a rewriting system, from the DNA coding molecule to the DNA template molecule. The DNA template molecule is a rewriting system to the messenger RNA molecule. The information is not destroyed during the execution of the rewriting program. On the other hand, it will be demonstrated that the Turing machine is an artificial anticipator. The Turing machine is a rewriting system. The head reads and writes, modifying the content of the tape. The information is destroyed during the execution of the program. This is an irreversible process. The input data are lost.
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices
Yan, Hao; LaBean, Thomas H.; Feng, Liping; Reif, John H.
2003-01-01
The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping. PMID:12821776
Determining the optimal forensic DNA analysis procedure following investigation of sample quality.
Hedell, Ronny; Hedman, Johannes; Mostad, Petter
2018-07-01
Crime scene traces of various types are routinely sent to forensic laboratories for analysis, generally with the aim of addressing questions about the source of the trace. The laboratory may choose to analyse the samples in different ways depending on the type and quality of the sample, the importance of the case and the cost and performance of the available analysis methods. Theoretically well-founded guidelines for the choice of analysis method are, however, lacking in most situations. In this paper, it is shown how such guidelines can be created using Bayesian decision theory. The theory is applied to forensic DNA analysis, showing how the information from the initial qPCR analysis can be utilized. It is assumed that there are three alternatives: analysing the sample with a standard short tandem repeat (STR) DNA analysis assay, analysing it with the standard assay and a complementary assay, or cancelling the analysis following quantification. The decision is based on information about the DNA amount and level of DNA degradation of the forensic sample, as well as case circumstances and the cost for analysis. Semi-continuous electropherogram models are used for simulation of DNA profiles and for computation of likelihood ratios. It is shown how tables and graphs, prepared beforehand, can be used to quickly find the optimal decision in forensic casework.
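The decision rule at the core of Bayesian decision theory, choosing the action with the highest expected utility, can be sketched in Python as follows. This is only an illustration: the two sample states, the probabilities, and all utility values are hypothetical, and the paper's actual utilities are derived from likelihood ratios and analysis costs that are not reproduced here.

```python
def expected_utilities(posterior, utility):
    """Expected utility of each action under the posterior over sample states."""
    return {
        action: sum(posterior[state] * by_state[state] for state in posterior)
        for action, by_state in utility.items()
    }


def optimal_action(posterior, utility):
    """Pick the action with the highest expected utility."""
    scores = expected_utilities(posterior, utility)
    return max(scores, key=scores.get)


# Hypothetical posterior after quantification: the sample is probably degraded.
posterior = {"intact": 0.3, "degraded": 0.7}

# Hypothetical utilities (evidential value minus analysis cost), per state.
utility = {
    "standard_assay": {"intact": 8.0, "degraded": -1.0},
    "standard_plus_complementary": {"intact": 5.0, "degraded": 1.0},
    "cancel": {"intact": 0.0, "degraded": 0.0},
}

choice = optimal_action(posterior, utility)
```

With these made-up numbers the combined analysis is preferred for a probably degraded sample, while a posterior that favours an intact sample would select the standard assay alone.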
Ranjbar, Reza; Hafezi-Moghadam, Mohammad Sadegh
2016-02-01
Despite all the advances in the field of infectious diseases, tuberculosis (TB) remains a cause of death. One of the most promising assembly techniques in nanotechnology is "scaffolded DNA origami", which can be used to design and construct a nano-scale drug delivery system. Because of the global health problems of tuberculosis, the development of a potent new anti-tuberculosis drug delivery system without cross-resistance with known anti-mycobacterial agents is urgently needed. The aim of this study was to design a nano-scale drug delivery system for TB treatment using the DNA origami method. In this study, we present experimental research on a DNA drug delivery system for treating tuberculosis. TEM images were acquired with an FEI Tecnai T12 BioTWIN at 120 kV. The model was designed with caDNAno software, and the 3D solution shape and its flexibility were predicted computationally with the CanDo server. The synthesized product was imaged using transmission electron microscopy after negative staining with uranyl formate. We constructed a multilayer 3D DNA nanostructure system by designing square lattice geometry with the scaffolded-DNA-origami method. With changes in the lock and key sequences, we recommend that this system be used for other infectious diseases to target the pathogenic bacteria.
Kubař, Tomáš; Elstner, Marcus
2013-04-28
In this work, a fragment-orbital density functional theory-based method is combined with two different non-adiabatic schemes for the propagation of the electronic degrees of freedom. This allows us to perform unbiased simulations of electron transfer processes in complex media, and the computational scheme is applied to the transfer of a hole in solvated DNA. It turns out that the mean-field approach, where the wave function of the hole is driven into a superposition of adiabatic states, leads to over-delocalization of the hole charge. This problem is avoided using a surface hopping scheme, resulting in a smaller rate of hole transfer. The method is highly efficient due to the on-the-fly computation of the coarse-grained DFT Hamiltonian for the nucleobases, which is coupled to the environment using a QM/MM approach. The computational efficiency and partial parallel character of the methodology make it possible to simulate electron transfer in systems of relevant biochemical size on a nanosecond time scale. Since standard non-polarizable force fields are applied in the molecular-mechanics part of the calculation, a simple scaling scheme was introduced into the electrostatic potential in order to simulate the effect of electronic polarization. It is shown that electronic polarization has an important effect on the features of charge transfer. The methodology is applied to two kinds of DNA sequences, illustrating the features of transfer along a flat energy landscape as well as over an energy barrier. The performance and relative merit of the mean-field scheme and the surface hopping for this application are discussed.
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.
Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel
2011-05-20
Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the near future. To overcome this challenge, several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show the great potential of the GPU platform, but in most cases they address the problem of sequence database scanning and compute only the alignment score, whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch and the Smith-Waterman algorithms with a backtracking procedure, which is needed to construct the alignment. In this paper we present a solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multi-GPU support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.
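To make the role of the backtracking procedure concrete, here is a minimal single-threaded Python sketch of global Needleman-Wunsch alignment. It is a simplified CPU illustration, not the authors' GPU implementation: it uses a linear gap penalty instead of the affine penalties mentioned above, and the scoring values and function name are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill phase: this is all that score-only implementations compute.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Backtracking phase: walk from the bottom-right cell to the origin.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


best, aligned_a, aligned_b = needleman_wunsch("GATTACA", "GATACA")
```

The fill phase alone yields the score; the second loop is the extra work needed to recover the alignment itself, which is the part the abstract says most GPU solutions omit.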
NASA Astrophysics Data System (ADS)
Manning, Gerald S.
2015-09-01
We give a contemporary and direct derivation of a classical, but insufficiently familiar, result in the theory of linear elasticity—a representation for the energy of a stressed elastic rod with central axis that intrinsically takes the shape of a general space curve. We show that the geometric torsion of the space curve, while playing a crucial role in the bending energy, is physically unrelated to the elastic twist. We prove that the twist energy vanishes in the lowest-energy states of a rod subject to constraints that do not restrict the twist. The stretching and contraction energies of a free helical spring are computed. There are local high-energy minima. We show the possibility of using the spring to model the chirality of DNA. We then compare our results with an available atomic level energy simulation that was performed on DNA unconstrained in the same sense as the free spring. We find some possible reflections of springlike behavior in the mechanics of DNA, but, unsurprisingly, the base pairs lend a material substance to the core of DNA that a spring does not capture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Manning, Gerald S., E-mail: jerrymanning@rcn.com
We give a contemporary and direct derivation of a classical, but insufficiently familiar, result in the theory of linear elasticity: a representation for the energy of a stressed elastic rod with central axis that intrinsically takes the shape of a general space curve. We show that the geometric torsion of the space curve, while playing a crucial role in the bending energy, is physically unrelated to the elastic twist. We prove that the twist energy vanishes in the lowest-energy states of a rod subject to constraints that do not restrict the twist. The stretching and contraction energies of a free helical spring are computed. There are local high-energy minima. We show the possibility of using the spring to model the chirality of DNA. We then compare our results with an available atomic level energy simulation that was performed on DNA unconstrained in the same sense as the free spring. We find some possible reflections of springlike behavior in the mechanics of DNA, but, unsurprisingly, the base pairs lend a material substance to the core of DNA that a spring does not capture.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
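The distance-based assignment described above can be sketched in Python as follows. This is not TaxI's code: it assumes each query/reference pair has already been aligned pairwise, uses the uncorrected p-distance (one of several possible models of sequence evolution), and the species labels and sequences are hypothetical.

```python
def p_distance(x, y):
    """Fraction of differing sites between two aligned sequences, skipping gap columns."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for cx, cy in zip(x, y):
        if cx == "-" or cy == "-":
            continue  # indel columns are ignored in this simple model
        compared += 1
        differing += cx != cy
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def closest_reference(pairs):
    """pairs maps a reference label to its (aligned_query, aligned_reference) tuple."""
    distances = {label: p_distance(q, r) for label, (q, r) in pairs.items()}
    return min(distances, key=distances.get), distances


# Hypothetical pairwise alignments of one query against two references.
pairs = {
    "species_A": ("ACGTACGT", "ACGTACGA"),
    "species_B": ("ACG-TACGT", "ACGGTTCGA"),
}
label, distances = closest_reference(pairs)
```

Because every pair is aligned separately, each reference may introduce gaps at different positions, which is why the aligned query differs between the two entries.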
Zhang, Shanxin; Zhou, Zhiping; Chen, Xinmeng; Hu, Yong; Yang, Lindong
2017-08-07
DNase I hypersensitive sites (DHSs) are accessible chromatin regions hypersensitive to cleavage by DNase I endonucleases. DHSs are indicative of cis-regulatory DNA elements (CREs), all of which play important roles in global gene expression regulation. Recognizing DHSs in a genome therefore helps in discovering CREs. To accelerate this investigation, cost-effective computational methods for identifying DHSs are an important complement to experiments. However, there is a lack of tools for identifying DHSs in plant genomes. Here we present pDHS-SVM, a computational predictor to identify plant DHSs. To integrate the global sequence-order information and local DNA properties, reverse complement kmer and dinucleotide-based auto covariance of DNA sequences were applied to construct the feature space. In this work, fifteen physical-chemical properties of dinucleotides were used and a Support Vector Machine (SVM) was employed. To further improve the performance of the predictor and extract an optimized subset of nucleotide physical-chemical properties positive for the DHSs, a heuristic nucleotide physical-chemical property selection algorithm was introduced. With the optimized subset of properties, experimental results on Arabidopsis thaliana and rice (Oryza sativa) showed that pDHS-SVM could achieve accuracies up to 87.00% and 85.79%, respectively. The results indicate the effectiveness of the proposed method for predicting DHSs. Furthermore, pDHS-SVM could provide a helpful complement for predicting CREs in plant genomes. Our implementation of pDHS-SVM is freely available as source code at https://github.com/shanxinzhang/pDHS-SVM. Copyright © 2017 Elsevier Ltd. All rights reserved.
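The reverse complement kmer feature mentioned above can be illustrated with a short Python sketch. It is written independently of the pDHS-SVM source code: the function names are hypothetical, each k-mer is merged with its reverse complement by keeping the lexicographically smaller of the two, and the dinucleotide auto covariance features are not covered.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by a single key."""
    return min(kmer, reverse_complement(kmer))


def revcomp_kmer_features(seq, k):
    """Normalized frequencies of strand-independent k-mers in `seq`."""
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = Counter(canonical(w) for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {key: (counts[key] / total if total else 0.0) for key in keys}


features = revcomp_kmer_features("ACGT", 2)
```

For k = 2 this gives 10 features instead of 16, because the 12 non-palindromic dinucleotides collapse into 6 pairs while the 4 palindromic ones (AT, TA, CG, GC) stay as they are. A vector like this could then be passed to any SVM implementation.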
Li, Chengzhe; Ai, Rizi; Wang, Mengchi; Firestein, Gary S.; Wang, Wei
2016-01-01
Motivation: DNA methylation signatures in rheumatoid arthritis (RA) have been identified in fibroblast-like synoviocytes (FLS) with the Illumina HumanMethylation450 array. Since <2% of CpG sites are covered by the Illumina 450K array and whole genome bisulfite sequencing is still too expensive for many samples, computationally predicting DNA methylation levels based on 450K data would be valuable to discover more RA-related genes. Results: We developed a computational model that is trained on 14 tissues with both whole genome bisulfite sequencing and 450K array data. This model integrates information derived from the similarity of local methylation patterns between tissues, the methylation information of flanking CpG sites and the methylation tendency of flanking DNA sequences. The predicted and measured methylation values were highly correlated, with a Pearson correlation coefficient of 0.9 in leave-one-tissue-out cross-validations. Importantly, the majority (76%) of the top 10% differentially methylated loci among the 14 tissues were correctly detected using the predicted methylation values. Applying this model to 450K data of RA, osteoarthritis and normal FLS, we successfully expanded the coverage of CpG sites 18.5-fold, accounting for about 30% of all the CpGs in the human genome. By integrative omics study, we identified genes and pathways tightly related to RA pathogenesis, among which 12 genes were supported by three lines of evidence, including 6 genes already known to perform specific roles in RA and 6 genes as new potential therapeutic targets. Availability and implementation: The source code, required data for prediction, and demo data for test are freely available at: http://wanglab.ucsd.edu/star/LR450K/. Contact: wei-wang@ucsd.edu or gfirestein@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26883487
Biosensing via light scattering from plasmonic core-shell nanospheres coated with DNA molecules
NASA Astrophysics Data System (ADS)
Xie, Huai-Yi; Chen, Minfeng; Chang, Yia-Chung; Moirangthem, Rakesh Singh
2017-05-01
We present both experimental and theoretical studies for investigating DNA molecules attached on metallic nanospheres. We have developed an efficient and accurate numerical method to investigate light scattering from plasmonic nanospheres on a substrate covered by a shell, based on the Green's function approach with suitable spherical harmonic basis. Next, we use this method to study optical scattering from DNA molecules attached to metallic nanoparticles placed on a substrate and compare with experimental results. We obtain fairly good agreement between theoretical predictions and the measured ellipsometric spectra. The metallic nanoparticles were used to detect the binding with DNA molecules in a microfluidic setup via spectroscopic ellipsometry (SE), and a detectable change in ellipsometric spectra was found when DNA molecules are captured on Au nanoparticles. Our theoretical simulation indicates that the coverage of Au nanosphere by a submonolayer of DNA molecules, which is modeled by a thin layer of dielectric material (which may absorb light), can lead to a small but detectable spectroscopic shift in both the Ψ and Δ spectra with more significant change in Δ spectra in agreement with experimental results. Our studies demonstrated the ultrasensitive capability of SE for sensing submonolayer coverage of DNA molecules on Au nanospheres. Hence the spectroscopic ellipsometric measurements coupled with theoretical analysis via an efficient computation method can be an effective tool for detecting DNA molecules attached on Au nanoparticles, thus achieving label-free, non-destructive, and high-sensitivity biosensing with nanoscale resolution.
Single DNA imaging and length quantification through a mobile phone microscope
NASA Astrophysics Data System (ADS)
Wei, Qingshan; Luo, Wei; Chiang, Samuel; Kappel, Tara; Mejia, Crystal; Tseng, Derek; Chan, Raymond Yan L.; Yan, Eddie; Qi, Hangfei; Shabbir, Faizan; Ozkan, Haydar; Feng, Steve; Ozcan, Aydogan
2016-03-01
The development of sensitive optical microscopy methods for the detection of single DNA molecules has become an active research area with various promising applications, including point-of-care (POC) genetic testing and diagnostics. Direct visualization of individual DNA molecules usually relies on sophisticated optical microscopes that are mostly available in well-equipped laboratories. For POC DNA testing/detection, there is an increasing need for the development of new single DNA imaging and sensing methods that are field-portable, cost-effective, and accessible for diagnostic applications in resource-limited or field settings. To this end, we developed a mobile-phone integrated fluorescence microscopy platform that allows imaging and sizing of single DNA molecules that are stretched on a chip. This handheld device contains an opto-mechanical attachment integrated onto a smartphone camera module, which creates a high signal-to-noise ratio dark-field imaging condition by using an oblique illumination/excitation configuration. Using this device, we demonstrated imaging of individual linearly stretched λ DNA molecules (48 kilobase-pair, kbp) over a 2 mm² field-of-view. We further developed a robust computational algorithm and a smartphone app that allow users to quickly quantify the length of each DNA fragment imaged using this mobile interface. The cellphone-based device was tested with five different DNA samples (5, 10, 20, 40, and 48 kbp), and a sizing accuracy of <1 kbp was demonstrated for DNA strands longer than 10 kbp. This mobile DNA imaging and sizing platform can be very useful for various diagnostic applications including the detection of disease-specific genes and quantification of copy-number-variations at POC settings.
Hexagonally packed DNA within bacteriophage T7 stabilized by curvature stress.
Odijk, T
1998-01-01
A continuum computation is proposed for the bending stress stabilizing DNA that is hexagonally packed within bacteriophage T7. Because the inner radius of the DNA spool is rather small, the stress of the curved DNA genome is strong enough to balance its electrostatic self-repulsion so as to form a stable hexagonal phase. The theory is in accord with the microscopically determined structure of bacteriophage T7 filled with DNA within the experimental margin of error. PMID:9726924
Wills, Peter R
2016-03-13
This article reviews contributions to this theme issue covering the topic 'DNA as information' in relation to the structure of DNA, the measure of its information content, the role and meaning of information in biology and the origin of genetic coding as a transition from uninformed to meaningful computational processes in physical systems. © 2016 The Author(s).
De Biase, Pablo M.; Markosyan, Suren; Noskov, Sergei
2014-01-01
We developed a novel scheme based on Grand-Canonical Monte-Carlo/Brownian Dynamics (GCMC/BD) simulations and have extended it to studies of ion currents across three nanopores with the potential for ssDNA sequencing: a solid-state Si3N4 nanopore, α-hemolysin, and the E111N/M113Y/K147N mutant. To describe nucleotide-specific ion dynamics compatible with the ssDNA coarse-grained model, we used the Inverse Monte-Carlo protocol, which maps the relevant ion-nucleotide distribution functions from all-atom MD simulations. Combined with the previously developed simulation platform for Brownian Dynamics (BD) simulations of ion transport, it allows for microsecond- and millisecond-long simulations of ssDNA dynamics in a nanopore with a conductance computation accuracy that equals or exceeds that of all-atom MD simulations. In spite of the simplifications, the protocol produces results that agree with the results of previous studies on ion conductance across open channels and provides direct correlations with experimentally measured blockade currents and ion conductances that have been estimated from all-atom MD simulations. PMID:24738152
Estimates of electronic coupling for excess electron transfer in DNA
NASA Astrophysics Data System (ADS)
Voityuk, Alexander A.
2005-07-01
Electronic coupling Vda is one of the key parameters that determine the rate of charge transfer through DNA. While there have been several computational studies of Vda for hole transfer, estimates of electronic couplings for excess electron transfer (ET) in DNA remain unavailable. In this paper, an efficient strategy is established for calculating the ET matrix elements between base pairs in a π stack. Two approaches are considered. First, we employ the diabatic-state (DS) method, in which donor and acceptor are represented with radical anions of the canonical base pairs adenine-thymine (AT) and guanine-cytosine (GC). In this approach, similar values of Vda are obtained with the standard 6-31G* and extended 6-31++G** basis sets. Second, the electronic couplings are derived from lowest unoccupied molecular orbitals (LUMOs) of neutral systems by using the generalized Mulliken-Hush or fragment charge methods. Because the radical-anion states of AT and GC are well reproduced by LUMOs of the neutral base pairs calculated without diffuse functions, the estimated values of Vda are in good agreement with the couplings obtained for radical-anion states using the DS method. However, when the calculation of a neutral stack is carried out with diffuse functions, LUMOs of the system exhibit dipole-bound character and cannot be used for estimating electronic couplings. Our calculations suggest that the ET matrix elements Vda for models containing intrastrand thymine and cytosine bases are substantially larger than the couplings in complexes with interstrand pyrimidine bases. The matrix elements for excess electron transfer are found to be considerably smaller than the corresponding values for hole transfer and to be very responsive to structural changes in a DNA stack.
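For readers unfamiliar with the generalized Mulliken-Hush (GMH) method named above, its standard two-state expression is Vda = ΔE·|μ12| / sqrt(Δμ² + 4·μ12²), where ΔE is the energy gap between the two adiabatic states, Δμ the difference of their dipole moments, and μ12 the transition dipole moment, all dipoles being projected on the charge-transfer direction. The Python sketch below only evaluates this formula; the function name and the input values are hypothetical and are not results from the paper.

```python
import math


def gmh_coupling(delta_e, mu_12, delta_mu):
    """Two-state generalized Mulliken-Hush coupling.

    delta_e  : adiabatic energy gap (any energy unit; the result uses the same unit)
    mu_12    : transition dipole moment between the two adiabatic states
    delta_mu : difference of the adiabatic dipole moments (same unit as mu_12)
    """
    denominator = math.sqrt(delta_mu ** 2 + 4.0 * mu_12 ** 2)
    if denominator == 0.0:
        raise ValueError("delta_mu and mu_12 cannot both be zero")
    return delta_e * abs(mu_12) / denominator


# Hypothetical inputs: a 0.2 eV gap with fully mixed (resonant) states.
resonant = gmh_coupling(delta_e=0.2, mu_12=1.0, delta_mu=0.0)

# Hypothetical inputs: weakly mixed states with a large dipole difference.
weak = gmh_coupling(delta_e=0.2, mu_12=0.1, delta_mu=5.0)
```

In the resonant limit (Δμ = 0) the expression reduces to half the adiabatic gap, which is a convenient sanity check for any implementation.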
Uncovering the polymerase-induced cytotoxicity of an oxidized nucleotide
NASA Astrophysics Data System (ADS)
Freudenthal, Bret D.; Beard, William A.; Perera, Lalith; Shock, David D.; Kim, Taejin; Schlick, Tamar; Wilson, Samuel H.
2015-01-01
Oxidative stress promotes genomic instability and human diseases. A common oxidized nucleoside is 8-oxo-7,8-dihydro-2'-deoxyguanosine, which is found both in DNA (8-oxo-G) and as a free nucleotide (8-oxo-dGTP). Nucleotide pools are especially vulnerable to oxidative damage. Therefore cells encode an enzyme (MutT/MTH1) that removes free oxidized nucleotides. This cleansing function is required for cancer cell survival and to modulate Escherichia coli antibiotic sensitivity in a DNA polymerase (pol)-dependent manner. How polymerases discriminate between damaged and non-damaged nucleotides is not well understood. This analysis is essential given the role of oxidized nucleotides in mutagenesis, cancer therapeutics, and bacterial antibiotics. Even with cellular sanitizing activities, nucleotide pools contain enough 8-oxo-dGTP to promote mutagenesis. This arises from the dual coding potential where 8-oxo-dGTP(anti) base pairs with cytosine and 8-oxo-dGTP(syn) uses its Hoogsteen edge to base pair with adenine. Here we use time-lapse crystallography to follow 8-oxo-dGTP insertion opposite adenine or cytosine with human pol β, to reveal that insertion is accommodated in either the syn- or anti-conformation, respectively. For 8-oxo-dGTP(anti) insertion, a novel divalent metal relieves repulsive interactions between the adducted guanine base and the triphosphate of the oxidized nucleotide. With either templating base, hydrogen-bonding interactions between the bases are lost as the enzyme reopens after catalysis, leading to a cytotoxic nicked DNA repair intermediate. Combining structural snapshots with kinetic and computational analysis reveals how 8-oxo-dGTP uses charge modulation during insertion that can lead to a blocked DNA repair intermediate.