Sample records for computing dna computing

  1. QPSO-Based Adaptive DNA Computing Algorithm

    PubMed Central

    Karakose, Mehmet; Cigdem, Ugur

    2013-01-01

    DNA (deoxyribonucleic acid) computing, a new computation model that uses DNA molecules for information storage, has been increasingly used for optimization and data analysis in recent years. However, the DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improving DNA computing is proposed. This approach aims to run the DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). The contributions of the proposed QPSO-based adaptive DNA computing algorithm are as follows: (1) the population size, crossover rate, maximum number of operations, enzyme and virus mutation rates, and fitness function of the DNA computing algorithm are tuned simultaneously for the adaptive process, (2) the adaptive algorithm is performed using the QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) a numerical realization of the DNA computing algorithm with the proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate effective optimization, considerable convergence speed, and high accuracy relative to the DNA computing algorithm. PMID:23935409

  2. Design and Analysis of Compact DNA Strand Displacement Circuits for Analog Computation Using Autocatalytic Amplifiers.

    PubMed

    Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John

    2018-01-19

    A main goal in DNA computing is to build DNA circuits to compute designated functions using a minimal number of DNA strands. Here, we propose a novel architecture to build compact DNA strand displacement circuits to compute a broad scope of functions in an analog fashion. A circuit by this architecture is composed of three autocatalytic amplifiers, and the amplifiers interact to perform computation. We show DNA circuits to compute functions sqrt(x), ln(x) and exp(x) for x in tunable ranges with simulation results. A key innovation in our architecture, inspired by Napier's use of logarithm transforms to compute square roots on a slide rule, is to make use of autocatalytic amplifiers to do logarithmic and exponential transforms in concentration and time. In particular, we convert from the input that is encoded by the initial concentration of the input DNA strand, to time, and then back again to the output encoded by the concentration of the output DNA strand at equilibrium. This combined use of strand-concentration and time encoding of computational values may have impact on other forms of molecular computation.

  3. Molecular Sticker Model Stimulation on Silicon for a Maximum Clique Problem

    PubMed Central

    Ning, Jianguo; Li, Yanmei; Yu, Wen

    2015-01-01

    Molecular computers (also called DNA computers), as an alternative to traditional electronic computers, are smaller in size and more energy efficient, and have massive parallel processing capacity. However, DNA computers may not outperform electronic computers owing to their higher error rates and some limitations of the biological laboratory. The stickers model, as a typical DNA-based computer, is computationally complete and universal, and can be viewed as a bit-vertically operating machine. This makes it attractive for silicon implementation. Inspired by the information processing method on the stickers computer, we propose a novel parallel computing model called DEM (DNA Electronic Computing Model) on System-on-a-Programmable-Chip (SOPC) architecture. Except for the significant difference in the computing medium (transistor chips rather than bio-molecules), the DEM works similarly to DNA computers in immense parallel information processing. Additionally, a plasma display panel (PDP) is used to show the change of solutions, and helps us directly see the distribution of assignments. The feasibility of the DEM is tested by applying it to compute a maximum clique problem (MCP) with eight vertices. Owing to the limited computing resources on SOPC architecture, the DEM could solve moderate-size problems in polynomial time. PMID:26075867
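
    The bit-vector view of the stickers model in the record above can be illustrated in software. The following is a minimal Python sketch, not the DEM design from the paper: the function name, the example graph, and the use of integers as "memory strands" are all assumptions made for illustration. Every vertex subset starts in the tube, subsets containing a non-adjacent pair are discarded, and the largest survivor is read out.

```python
from itertools import combinations

def max_clique_stickers(n, edges):
    """Stickers-style search for a maximum clique (illustrative sketch).

    Each integer in `tube` plays the role of an n-bit memory strand:
    bit v set means vertex v belongs to the candidate subset.
    """
    edge_set = {frozenset(e) for e in edges}
    # Initial tube: all 2**n strands, one per vertex subset.
    tube = list(range(1 << n))
    # For every non-adjacent pair, discard strands with both bits set.
    for i, j in combinations(range(n), 2):
        if frozenset((i, j)) not in edge_set:
            mask = (1 << i) | (1 << j)
            tube = [s for s in tube if (s & mask) != mask]
    # Read out the surviving strand with the most bits set.
    best = max(tube, key=lambda s: bin(s).count("1"))
    return [v for v in range(n) if (best >> v) & 1]

# Hypothetical eight-vertex graph: a 4-clique {0,1,2,3} and a triangle {4,5,6}.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
                 (3, 4), (4, 5), (5, 6), (4, 6), (6, 7)]
clique = max_clique_stickers(8, example_edges)
```

    The tube holds 2^n candidates, so this only mirrors the massive parallelism of the model for small n such as the eight-vertex case mentioned in the abstract.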

  4. Computational Approaches to Nucleic Acid Origami.

    PubMed

    Jabbari, Hosna; Aminpour, Maral; Montemagno, Carlo

    2015-10-12

    Recent advances in experimental DNA origami have dramatically expanded the horizon of DNA nanotechnology. Complex 3D suprastructures have been designed and developed using DNA origami with applications in biomaterial science, nanomedicine, nanorobotics, and molecular computation. Ribonucleic acid (RNA) origami has recently been realized as a new approach. Similar to DNA, RNA molecules can be designed to form complex 3D structures through complementary base pairings. RNA origami structures are, however, more compact and more thermodynamically stable due to RNA's non-canonical base pairing and tertiary interactions. Despite these advantages, the development of RNA origami lags far behind that of DNA origami. Furthermore, although computational methods have proven to be effective in designing DNA and RNA origami structures and in their evaluation, advances in computational nucleic acid origami are even more limited. In this paper, we review major milestones in experimental and computational DNA and RNA origami and present current challenges in these fields. We believe collaboration between experimental nanotechnologists and computer scientists is critical for advancing these new research paradigms.

  5. Analog Computation by DNA Strand Displacement Circuits.

    PubMed

    Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John

    2016-08-19

    DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands, respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. On the basis of these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
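
    The final claim of the record above, that Newton iteration lets circuits built from addition, subtraction, and multiplication reach functions beyond polynomials, can be checked numerically. The sketch below is an assumption-laden illustration in Python, not a model of the strand displacement chemistry: the three "gates" are ordinary floating-point functions, and the reciprocal 1/a is chosen as the target because its Newton update x <- x * (2 - a * x) needs no division.

```python
def add(a, b):
    """Stand-in for an addition gate (plain arithmetic, no kinetics)."""
    return a + b

def subtract(a, b):
    """Stand-in for a subtraction gate."""
    return a - b

def multiply(a, b):
    """Stand-in for a multiplication gate."""
    return a * b

def reciprocal(a, x0, steps=6):
    """Approximate 1/a with Newton iteration using only the gates above.

    The update x <- x * (2 - a*x) converges when 0 < x0 < 2/a;
    the relative error is squared at every step.
    """
    x = x0
    for _ in range(steps):
        x = multiply(x, subtract(2.0, multiply(a, x)))
    return x

approx = reciprocal(4.0, x0=0.1)  # close to 0.25 after six steps
```

    Each loop iteration corresponds to a fixed composition of two multiplications and one subtraction, which is why a bounded number of Newton steps can be unrolled into a circuit of elementary gates.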

  6. A DNA sequence analysis package for the IBM personal computer.

    PubMed Central

    Lagrimini, L M; Brentano, S T; Donelson, J E

    1984-01-01

    We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433

  7. Reversible Data Hiding Based on DNA Computing

    PubMed Central

    Xie, Yingjie

    2017-01-01

    Biocomputing, especially DNA computing, has developed greatly and is widely used in information security. In this paper, a novel algorithm of reversible data hiding based on DNA computing is proposed. Inspired by the algorithm of histogram modification, which is a classical algorithm for reversible data hiding, we combine it with DNA computing to realize this algorithm based on biological technology. Compared with previous results, our experimental results significantly improve the ER (embedding rate). Furthermore, the PSNR (peak signal-to-noise ratio) of some test images is also improved. Experimental results show that the algorithm is suitable for protecting the copyright of the cover image in DNA-based information security. PMID:28280504

  8. Tyramine Hydrochloride Based Label-Free System for Operating Various DNA Logic Gates and a DNA Caliper for Base Number Measurements.

    PubMed

    Fan, Daoqing; Zhu, Xiaoqing; Dong, Shaojun; Wang, Erkang

    2017-07-05

    DNA is believed to be a promising candidate for molecular logic computation, and the fluorogenic/colorimetric substrates of G-quadruplex DNAzyme (G4zyme) are broadly used as label-free output reporters of DNA logic circuits. Herein, for the first time, tyramine-HCl (a fluorogenic substrate of G4zyme) is applied to DNA logic computation and a series of label-free DNA-input logic gates, including elementary AND, OR, and INHIBIT logic gates, as well as a two to one encoder, are constructed. Furthermore, a DNA caliper that can measure the base number of target DNA as low as three bases is also fabricated. This DNA caliper can also perform concatenated AND-AND logic computation to fulfil the requirements of sophisticated logic computing. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. In vitro molecular machine learning algorithm via symmetric internal loops of DNA.

    PubMed

    Lee, Ji-Hoon; Lee, Seung Hwan; Baek, Christina; Chun, Hyosun; Ryu, Je-Hwan; Kim, Jin-Woo; Deaton, Russell; Zhang, Byoung-Tak

    2017-08-01

    Programmable biomolecules, such as DNA strands, deoxyribozymes, and restriction enzymes, have been used to solve computational problems, construct large-scale logic circuits, and program simple molecular games. Although studies have shown the potential of molecular computing, the capability of computational learning with DNA molecules, i.e., molecular machine learning, has yet to be experimentally verified. Here, we present a novel molecular learning in vitro model in which symmetric internal loops of double-stranded DNA are exploited to measure the differences between training instances, thus enabling the molecules to learn from small errors. The model was evaluated on a data set of twenty dialogue sentences obtained from the television shows Friends and Prison Break. The wet DNA-computing experiments confirmed that the molecular learning machine was able to generalize the dialogue patterns of each show and successfully identify the show from which the sentences originated. The molecular machine learning model described here opens the way for solving machine learning problems in computer science and biology using in vitro molecular computing with the data encoded in DNA molecules. Copyright © 2017. Published by Elsevier B.V.

  10. High performance transcription factor-DNA docking with GPU computing

    PubMed Central

    2012-01-01

    Background: Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is computationally very demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods: In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results: The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions: We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575

  11. Constructing Smart Protocells with Built-In DNA Computational Core to Eliminate Exogenous Challenge.

    PubMed

    Lyu, Yifan; Wu, Cuichen; Heinke, Charles; Han, Da; Cai, Ren; Teng, I-Ting; Liu, Yuan; Liu, Hui; Zhang, Xiaobing; Liu, Qiaoling; Tan, Weihong

    2018-06-06

    A DNA reaction network is like a biological algorithm that can respond to "molecular input signals", such as biological molecules, while the artificial cell is like a microrobot whose function is powered by the encapsulated DNA reaction network. In this work, we describe the feasibility of using a DNA reaction network as the computational core of a protocell, which will perform an artificial immune response in a concise way to eliminate a mimicked pathogenic challenge. Such a DNA reaction network (RN)-powered protocell can realize the connection of logical computation and biological recognition due to the natural programmability and biological properties of DNA. Thus, the biological input molecules can be easily involved in the molecular computation and the computation process can be spatially isolated and protected by artificial bilayer membrane. We believe the strategy proposed in the current paper, i.e., using DNA RN to power artificial cells, will lay the groundwork for understanding the basic design principles of DNA algorithm-based nanodevices which will, in turn, inspire the construction of artificial cells, or protocells, that will find a place in future biomedical research.

  12. Biomolecular computers with multiple restriction enzymes.

    PubMed

    Sakowski, Sebastian; Krasinski, Tadeusz; Waldmajer, Jacek; Sarnik, Joanna; Blasiak, Janusz; Poplawski, Tomasz

    2017-01-01

    The development of conventional, silicon-based computers has several limitations, including some related to the Heisenberg uncertainty principle and the von Neumann "bottleneck". Biomolecular computers based on DNA and proteins are largely free of these disadvantages and, along with quantum computers, are reasonable alternatives to their conventional counterparts in some applications. The idea of a DNA computer proposed by Ehud Shapiro's group at the Weizmann Institute of Science was developed using one restriction enzyme as hardware and DNA fragments (the transition molecules) as software and input/output signals. This computer represented a two-state two-symbol finite automaton that was subsequently extended by using two restriction enzymes. In this paper, we propose the idea of a multistate biomolecular computer with multiple commercially available restriction enzymes as hardware. Additionally, an algorithmic method for the construction of transition molecules in the DNA computer based on the use of multiple restriction enzymes is presented. We use this method to construct multistate, biomolecular, nondeterministic finite automata with four commercially available restriction enzymes as hardware. We also describe an experimental application of this theoretical model to a biomolecular finite automaton made of four endonucleases.
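
    The nondeterministic finite automata discussed in the record above can be simulated in a few lines. This is a minimal Python sketch under stated assumptions: the three-state automaton over the symbols a and b (accepting words that end in "ab") is hypothetical and is not one of the automata built in the paper, and the transition table stands in for the transition molecules without modelling any enzymes.

```python
def nfa_accepts(transitions, start, accepting, word):
    """Simulate a nondeterministic finite automaton on `word`.

    `transitions` maps (state, symbol) to the set of possible next states.
    The automaton accepts if any reachable state after the last symbol
    is an accepting state.
    """
    current = {start}
    for symbol in word:
        current = {nxt
                   for state in current
                   for nxt in transitions.get((state, symbol), ())}
    return bool(current & accepting)

# Hypothetical three-state NFA accepting words over {a, b} that end in "ab".
example_transitions = {
    (0, "a"): {0, 1},  # stay, or guess that this 'a' starts the final "ab"
    (0, "b"): {0},
    (1, "b"): {2},
}
results = {w: nfa_accepts(example_transitions, 0, {2}, w)
           for w in ["ab", "aab", "abab", "ba", ""]}
```

    Tracking the whole set of reachable states at once is the software counterpart of many molecules exploring different transitions in the same test tube.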

  13. Computational method and system for modeling, analyzing, and optimizing DNA amplification and synthesis

    DOEpatents

    Vandersall, Jennifer A.; Gardner, Shea N.; Clague, David S.

    2010-05-04

    A computational method and computer-based system of modeling DNA synthesis for the design and interpretation of PCR amplification, parallel DNA synthesis, and microarray chip analysis. The method and system include modules that address the bioinformatics, kinetics, and thermodynamics of DNA amplification and synthesis. Specifically, the steps of DNA selection, as well as the kinetics and thermodynamics of DNA hybridization and extensions, are addressed, which enable the optimization of the processing and the prediction of the products as a function of DNA sequence, mixing protocol, time, temperature and concentration of species.

  14. Markov chains: computing limit existence and approximations with DNA.

    PubMed

    Cardona, M; Colomer, M A; Conde, J; Miret, J M; Miró, J; Zaragoza, A

    2005-09-01

    We present two algorithms to perform computations over Markov chains. The first one determines whether the sequence of powers of the transition matrix of a Markov chain converges or not to a limit matrix. If it does converge, the second algorithm enables us to estimate this limit. The combination of these algorithms allows the computation of a limit using DNA computing. In this sense, we have encoded the states and the transition probabilities using strands of DNA for generating paths of the Markov chain.
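
    The two tasks described in the record above, checking whether the powers of a transition matrix converge and estimating the limit from generated paths, can be sketched in conventional software. The Python below is an illustration under assumptions, not the DNA encoding from the paper: the function names, the numeric tolerance, and the example matrix are hypothetical, and the convergence test is a heuristic that compares successive powers up to a fixed cutoff.

```python
import random

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def powers_converge(P, max_power=200, tol=1e-9):
    """Heuristic check: do successive powers of P stop changing?"""
    M = P
    for _ in range(max_power):
        N = mat_mul(M, P)
        diff = max(abs(N[i][j] - M[i][j])
                   for i in range(len(P)) for j in range(len(P)))
        if diff < tol:
            return True
        M = N
    return False

def estimate_limit_row(P, start, steps, n_paths, seed=0):
    """Estimate one row of the limit matrix by sampling random paths."""
    rng = random.Random(seed)
    states = range(len(P))
    counts = [0] * len(P)
    for _ in range(n_paths):
        state = start
        for _ in range(steps):
            state = rng.choices(states, weights=P[state])[0]
        counts[state] += 1
    return [c / n_paths for c in counts]

P = [[0.9, 0.1], [0.5, 0.5]]  # hypothetical chain; limit rows are (5/6, 1/6)
converges = powers_converge(P)
row = estimate_limit_row(P, start=0, steps=20, n_paths=5000)
```

    A periodic chain such as [[0, 1], [1, 0]] fails the first check because its powers alternate, so only chains that pass it are worth sampling.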

  15. Computational Design of DNA-Binding Proteins.

    PubMed

    Thyme, Summer; Song, Yifan

    2016-01-01

    Predicting the outcome of engineered and naturally occurring sequence perturbations to protein-DNA interfaces requires accurate computational modeling technologies. It has been well established that computational design to accommodate small numbers of DNA target site substitutions is possible. This chapter details the basic method of design used in the Rosetta macromolecular modeling program that has been successfully used to modulate the specificity of DNA-binding proteins. More recently, combining computational design and directed evolution has become a common approach for increasing the success rate of protein engineering projects. The power of such high-throughput screening depends on computational methods producing multiple potential solutions. Therefore, this chapter describes several protocols for increasing the diversity of designed output. Lastly, we describe an approach for building comparative models of protein-DNA complexes in order to utilize information from homologous sequences. These models can be used to explore how nature modulates specificity of protein-DNA interfaces and potentially can even be used as starting templates for further engineering.

  16. Computer-Aided Drug Discovery: Molecular Docking of Diminazene Ligands to DNA Minor Groove

    ERIC Educational Resources Information Center

    Kholod, Yana; Hoag, Erin; Muratore, Katlynn; Kosenkov, Dmytro

    2018-01-01

    The reported project-based laboratory unit introduces upper-division undergraduate students to the basics of computer-aided drug discovery as a part of a computational chemistry laboratory course. The students learn to perform model binding of organic molecules (ligands) to the DNA minor groove with computer-aided drug discovery (CADD) tools. The…

  17. Solving satisfiability problems using a novel microarray-based DNA computer.

    PubMed

    Lin, Che-Hsin; Cheng, Hsiao-Ping; Yang, Chang-Biau; Yang, Chia-Ning

    2007-01-01

    An algorithm based on a modified sticker model, combined with an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Unlike conventional DNA computing algorithms, which need an initial data pool covering correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones, we built solutions in parts to satisfy one clause in one step, and eventually solved the entire Boolean formula step by step. No time-consuming sample preparation procedures or delicate sample-application equipment were required for the computing process. Moreover, experimental results show that the bound DNA sequences can withstand the chemical solutions during the computing processes, so the proposed method should be useful in dealing with large-scale problems.
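
    The idea in the record above of building solutions in parts, one clause per step, instead of filtering a pool of all assignments, can be sketched as follows. This Python illustration is an assumption, not the microarray protocol from the paper: the literal encoding as (variable, value) pairs and the function names are hypothetical. Each step extends every partial assignment in every way that satisfies the next clause without contradicting earlier choices.

```python
def extend(partials, clause):
    """Extend each partial assignment so that it satisfies `clause`.

    A clause is a list of literals; a literal is (variable, required_value).
    """
    result = set()
    for partial in partials:
        assignment = dict(partial)
        for var, value in clause:
            # Usable if the variable is unassigned or already has this value.
            if assignment.get(var, value) == value:
                extended = dict(assignment)
                extended[var] = value
                result.add(frozenset(extended.items()))
    return result

def solve_sat(clauses):
    """Return partial assignments that satisfy every clause (empty if none)."""
    partials = {frozenset()}  # start from the empty assignment
    for clause in clauses:
        partials = extend(partials, clause)
    return [dict(p) for p in partials]

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[("x1", True), ("x2", False)],
           [("x2", True), ("x3", True)],
           [("x1", False), ("x3", False)]]
solutions = solve_sat(formula)
```

    Variables left out of a returned assignment are free to take either value, and an empty result means the formula is unsatisfiable.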

  18. Computer program for the IBM personal computer which searches for approximate matches to short oligonucleotide sequences in long target DNA sequences.

    PubMed Central

    Myers, E W; Mount, D W

    1986-01-01

    We describe a program which may be used to find approximate matches to a short predefined DNA sequence in a larger target DNA sequence. The program predicts the usefulness of specific DNA probes and sequencing primers and finds nearly identical sequences that might represent the same regulatory signal. The program is written in the C programming language and will run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The program has been integrated into an existing software package for the IBM personal computer (see article by Mount and Conrad, this volume). Some examples of its use are given. PMID:3753785
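
    A simple way to picture the search described in the record above is a sliding-window comparison that counts mismatches. The Python sketch below is an assumption made for illustration: it handles substitutions only and is not the algorithm of the original C program, whose matching criteria are not given in the abstract. The function name and example sequences are hypothetical.

```python
def approximate_matches(probe, target, max_mismatches):
    """Find windows of `target` that differ from `probe` in few positions.

    Returns a list of (start_index, mismatch_count) pairs, considering
    substitutions only (no insertions or deletions).
    """
    hits = []
    width = len(probe)
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        mismatches = sum(1 for p, t in zip(probe, window) if p != t)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

# Hypothetical probe and target: one exact site and one single-mismatch site.
hits = approximate_matches("GATTACA", "CCGATTACAGGGATAACATT", max_mismatches=1)
```

    The scan costs time proportional to the probe length times the target length, which is acceptable for short oligonucleotides.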

  19. A detailed experimental study of a DNA computer with two endonucleases.

    PubMed

    Sakowski, Sebastian; Krasiński, Tadeusz; Sarnik, Joanna; Blasiak, Janusz; Waldmajer, Jacek; Poplawski, Tomasz

    2017-07-14

    Great advances in biotechnology have allowed the construction of a computer from DNA. One of the proposed solutions is a biomolecular finite automaton, a simple two-state DNA computer without memory, which was presented by Ehud Shapiro's group at the Weizmann Institute of Science. The main problem with this computer, in which biomolecules carry out logical operations, is its complexity - increasing the number of states of biomolecular automata. In this study, we constructed (in laboratory conditions) a six-state DNA computer that uses two endonucleases (e.g. AcuI and BbvI) and a ligase. We have presented a detailed experimental verification of its feasibility. We described the effect of the number of states, the length of input data, and the nondeterminism on the computing process. We also tested different automata (with three, four, and six states) running on various accepted input words of different lengths such as ab, aab, aaab, ababa, and of an unaccepted word ba. Moreover, this article presents the reaction optimization and the methods of eliminating certain biochemical problems occurring in the implementation of a biomolecular DNA automaton based on two endonucleases.

  20. Biomolecular computers with multiple restriction enzymes

    PubMed Central

    Sakowski, Sebastian; Krasinski, Tadeusz; Waldmajer, Jacek; Sarnik, Joanna; Blasiak, Janusz; Poplawski, Tomasz

    2017-01-01

    The development of conventional, silicon-based computers has several limitations, including some related to the Heisenberg uncertainty principle and the von Neumann “bottleneck”. Biomolecular computers based on DNA and proteins are largely free of these disadvantages and, along with quantum computers, are reasonable alternatives to their conventional counterparts in some applications. The idea of a DNA computer proposed by Ehud Shapiro’s group at the Weizmann Institute of Science was developed using one restriction enzyme as hardware and DNA fragments (the transition molecules) as software and input/output signals. This computer represented a two-state two-symbol finite automaton that was subsequently extended by using two restriction enzymes. In this paper, we propose the idea of a multistate biomolecular computer with multiple commercially available restriction enzymes as hardware. Additionally, an algorithmic method for the construction of transition molecules in the DNA computer based on the use of multiple restriction enzymes is presented. We use this method to construct multistate, biomolecular, nondeterministic finite automata with four commercially available restriction enzymes as hardware. We also describe an experimental application of this theoretical model to a biomolecular finite automaton made of four endonucleases. PMID:29064510

  1. A new parallel DNA algorithm to solve the task scheduling problem based on inspired computational model.

    PubMed

    Wang, Zhaocai; Ji, Zuwen; Wang, Xiaoming; Wu, Tunhua; Huang, Wei

    2017-12-01

    As a promising approach to solving computationally intractable problems, DNA computing is an emerging research area spanning mathematics, computer science and molecular biology. The task scheduling problem, a well-known NP-complete problem, assigns n jobs to m individuals and finds the minimum execution time of the last finished individual. In this paper, we use a biologically inspired computational model and describe a new parallel algorithm to solve the task scheduling problem by basic DNA molecular operations. In turn, we skillfully design flexible-length DNA strands to represent elements of the allocation matrix, take appropriate biological experiment operations and get solutions of the task scheduling problem in the proper length range with less than O(n^2) time complexity. Copyright © 2017. Published by Elsevier B.V.
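
    The task scheduling problem stated in the record above can be made concrete with an exhaustive search, which is the sequential counterpart of generating every allocation in parallel. The Python sketch below rests on assumptions: the `times[job][worker]` matrix is a hypothetical reading of the "allocation matrix", the example values are invented, and the brute-force loop is not the DNA algorithm from the paper.

```python
from itertools import product

def min_makespan(times, m):
    """Exhaustively assign jobs to m individuals and minimize the finish time.

    `times[job][worker]` is the (assumed) time worker needs for job.
    Returns (best_finish_time, assignment), where assignment[job] = worker.
    """
    best_time, best_assignment = None, None
    for assignment in product(range(m), repeat=len(times)):
        loads = [0] * m
        for job, worker in enumerate(assignment):
            loads[worker] += times[job][worker]
        makespan = max(loads)  # execution time of the last finished individual
        if best_time is None or makespan < best_time:
            best_time, best_assignment = makespan, assignment
    return best_time, best_assignment

# Hypothetical instance: four jobs, two individuals.
example_times = [[4, 6], [3, 2], [5, 5], [2, 4]]
best = min_makespan(example_times, m=2)
```

    The loop visits m^n allocations, which is the exponential cost that a parallel molecular search aims to spread across many strands.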

  2. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    PubMed

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  3. Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

    PubMed Central

    Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409

  4. A strand graph semantics for DNA-based computation

    PubMed Central

    Petersen, Rasmus L.; Lakin, Matthew R.; Phillips, Andrew

    2015-01-01

    DNA nanotechnology is a promising approach for engineering computation at the nanoscale, with potential applications in biofabrication and intelligent nanomedicine. DNA strand displacement is a general strategy for implementing a broad range of nanoscale computations, including any computation that can be expressed as a chemical reaction network. Modelling and analysis of DNA strand displacement systems is an important part of the design process, prior to experimental realisation. As experimental techniques improve, it is important for modelling languages to keep pace with the complexity of structures that can be realised experimentally. In this paper we present a process calculus for modelling DNA strand displacement computations involving rich secondary structures, including DNA branches and loops. We prove that our calculus is also sufficiently expressive to model previous work on non-branching structures, and propose a mapping from our calculus to a canonical strand graph representation, in which vertices represent DNA strands, ordered sites represent domains, and edges between sites represent bonds between domains. We define interactions between strands by means of strand graph rewriting, and prove the correspondence between the process calculus and strand graph behaviours. Finally, we propose a mapping from strand graphs to an efficient implementation, which we use to perform modelling and simulation of DNA strand displacement systems with rich secondary structure. PMID:27293306

  5. 21st International Conference on DNA Computing and Molecular Programming: 8.1 Biochemistry

    DTIC Science & Technology

    include information storage and biological applications of DNA systems, biomolecular chemical reaction networks, applications of self-assembled DNA...nanostructures, tile self-assembly and computation, principles and models of self-assembly, and strand displacement and biomolecular circuits. The fund

  6. Comprehensive restriction enzyme lists to update any DNA sequence computer program.

    PubMed

    Raschke, E

    1993-04-01

    Restriction enzyme lists are presented for the practical working geneticist to update any DNA computer program. These lists combine formerly scattered information and contain all presently known restriction enzymes with a unique recognition sequence, a cut site, or methylation (in)sensitivity. The lists are in the shortest possible form to also be functional with small DNA computer programs, and will produce clear restriction maps without any redundancy or loss of information. The lists discern between commercial and noncommercial enzymes, and prototype enzymes and different isoschizomers are cross-referenced. Differences in general methylation sensitivities and (in)sensitivities against Dam and Dcm methylases of Escherichia coli are indicated. Commercial methylases and intron-encoded endonucleases are included. An address list is presented to contact commercial suppliers. The lists are constantly updated and available in electronic form as pure US ASCII files, and in formats for the DNA computer programs DNA-Strider for Apple Macintosh, and DNAsis for IBM personal computers or compatibles via e-mail from the internet address: NETSERV@EMBL-HEIDELBERG.DE by sending only the message HELP RELIBRARY.

  7. Weighted Watson-Crick automata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tamrin, Mohd Izzuddin Mohd; Turaev, Sherzod; Sembok, Tengku Mohd Tengku

    There is a tremendous amount of work in biotechnology, especially in the area of DNA molecules. The computing community is attempting to develop smaller computing devices through computational models based on the operations performed on DNA molecules. A Watson-Crick automaton, a theoretical model for DNA-based computation, has two reading heads and works on double-stranded sequences of the input related by a complementarity relation similar to the Watson-Crick complementarity of DNA nucleotides. Over time, several variants of Watson-Crick automata have been introduced and investigated. However, they cannot be used as suitable DNA-based computational models for molecular stochastic processes and fuzzy processes that are related to important practical problems such as molecular parsing, gene disease detection, and food authentication. In this paper we define new variants of Watson-Crick automata, called weighted Watson-Crick automata, developing theoretical models for molecular stochastic and fuzzy processes. We define weighted Watson-Crick automata by adapting weight restriction mechanisms associated with formal grammars and automata. We also study the generative capacities of weighted Watson-Crick automata, including probabilistic and fuzzy variants. We show that weighted variants of Watson-Crick automata increase their generative power.

  8. Weighted Watson-Crick automata

    NASA Astrophysics Data System (ADS)

    Tamrin, Mohd Izzuddin Mohd; Turaev, Sherzod; Sembok, Tengku Mohd Tengku

    2014-07-01

    There are tremendous works in biotechnology especially in area of DNA molecules. The computer society is attempting to develop smaller computing devices through computational models which are based on the operations performed on the DNA molecules. A Watson-Crick automaton, a theoretical model for DNA based computation, has two reading heads, and works on double-stranded sequences of the input related by a complementarity relation similar with the Watson-Crick complementarity of DNA nucleotides. Over the time, several variants of Watson-Crick automata have been introduced and investigated. However, they cannot be used as suitable DNA based computational models for molecular stochastic processes and fuzzy processes that are related to important practical problems such as molecular parsing, gene disease detection, and food authentication. In this paper we define new variants of Watson-Crick automata, called weighted Watson-Crick automata, developing theoretical models for molecular stochastic and fuzzy processes. We define weighted Watson-Crick automata adapting weight restriction mechanisms associated with formal grammars and automata. We also study the generative capacities of weighted Watson-Crick automata, including probabilistic and fuzzy variants. We show that weighted variants of Watson-Crick automata increase their generative power.

  9. Simultaneous G-Quadruplex DNA Logic.

    PubMed

    Bader, Antoine; Cockroft, Scott L

    2018-04-03

    A fundamental principle of digital computer operation is Boolean logic, where inputs and outputs are described by binary integer voltages. Similarly, inputs and outputs may be processed on the molecular level as exemplified by synthetic circuits that exploit the programmability of DNA base-pairing. Unlike modern computers, which execute large numbers of logic gates in parallel, most implementations of molecular logic have been limited to single computing tasks, or sensing applications. This work reports three G-quadruplex-based logic gates that operate simultaneously in a single reaction vessel. The gates respond to unique Boolean DNA inputs by undergoing topological conversion from duplex to G-quadruplex states that were resolved using a thioflavin T dye and gel electrophoresis. The modular, addressable, and label-free approach could be incorporated into DNA-based sensors, or used for resolving and debugging parallel processes in DNA computing applications. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. A programming language for composable DNA circuits

    PubMed Central

    Phillips, Andrew; Cardelli, Luca

    2009-01-01

    Recently, a range of information-processing circuits have been implemented in DNA by using strand displacement as their main computational mechanism. Examples include digital logic circuits and catalytic signal amplification circuits that function as efficient molecular detectors. As new paradigms for DNA computation emerge, the development of corresponding languages and tools for these paradigms will help to facilitate the design of DNA circuits and their automatic compilation to nucleotide sequences. We present a programming language for designing and simulating DNA circuits in which strand displacement is the main computational mechanism. The language includes basic elements of sequence domains, toeholds and branch migration, and assumes that strands do not possess any secondary structure. The language is used to model and simulate a variety of circuits, including an entropy-driven catalytic gate, a simple gate motif for synthesizing large-scale circuits and a scheme for implementing an arbitrary system of chemical reactions. The language is a first step towards the design of modelling and simulation tools for DNA strand displacement, which complements the emergence of novel implementation strategies for DNA computing. PMID:19535415

  11. A programming language for composable DNA circuits.

    PubMed

    Phillips, Andrew; Cardelli, Luca

    2009-08-06

    Recently, a range of information-processing circuits have been implemented in DNA by using strand displacement as their main computational mechanism. Examples include digital logic circuits and catalytic signal amplification circuits that function as efficient molecular detectors. As new paradigms for DNA computation emerge, the development of corresponding languages and tools for these paradigms will help to facilitate the design of DNA circuits and their automatic compilation to nucleotide sequences. We present a programming language for designing and simulating DNA circuits in which strand displacement is the main computational mechanism. The language includes basic elements of sequence domains, toeholds and branch migration, and assumes that strands do not possess any secondary structure. The language is used to model and simulate a variety of circuits, including an entropy-driven catalytic gate, a simple gate motif for synthesizing large-scale circuits and a scheme for implementing an arbitrary system of chemical reactions. The language is a first step towards the design of modelling and simulation tools for DNA strand displacement, which complements the emergence of novel implementation strategies for DNA computing.

  12. Computing exponentially faster: implementing a non-deterministic universal Turing machine using DNA

    PubMed Central

    Currin, Andrew; Korovin, Konstantin; Ababi, Maria; Roper, Katherine; Kell, Douglas B.; Day, Philip J.

    2017-01-01

    The theory of computer science is based around universal Turing machines (UTMs): abstract machines able to execute all possible algorithms. Modern digital computers are physical embodiments of classical UTMs. For the most important class of problem in computer science, non-deterministic polynomial complete problems, non-deterministic UTMs (NUTMs) are theoretically exponentially faster than both classical UTMs and quantum mechanical UTMs (QUTMs). However, no attempt has previously been made to build an NUTM, and their construction has been regarded as impossible. Here, we demonstrate the first physical design of an NUTM. This design is based on Thue string rewriting systems, and thereby avoids the limitations of most previous DNA computing schemes: all the computation is local (simple edits to strings) so there is no need for communication, and there is no need to order operations. The design exploits DNA's ability to replicate to execute an exponential number of computational paths in P time. Each Thue rewriting step is embodied in a DNA edit implemented using a novel combination of polymerase chain reactions and site-directed mutagenesis. We demonstrate that the design works using both computational modelling and in vitro molecular biology experimentation: the design is thermodynamically favourable, microprogramming can be used to encode arbitrary Thue rules, all classes of Thue rule can be implemented, and non-deterministic rule implementation. In an NUTM, the resource limitation is space, which contrasts with classical UTMs and QUTMs where it is time. This fundamental difference enables an NUTM to trade space for time, which is significant for both theoretical computer science and physics. It is also of practical importance, for to quote Richard Feynman ‘there's plenty of room at the bottom’. This means that a desktop DNA NUTM could potentially utilize more processors than all the electronic computers in the world combined, and thereby outperform the world's current fastest supercomputer, while consuming a tiny fraction of its energy. PMID:28250099
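    The idea of following every rewriting path at once can be illustrated in software. The sketch below is only an illustration, not the DNA design described in the abstract: the rule set is hypothetical, and the rules are applied in one direction only (a semi-Thue simplification). It expands all strings reachable in one step from each string in the current frontier, so the frontier size shows how quickly the number of computational paths can grow.

```python
# Hypothetical one-directional rewriting rules (lhs -> rhs), chosen only for illustration.
RULES = [("ab", "ba"), ("a", "aa")]


def successors(s, rules):
    """Return every string obtained from s by applying one rule at one position."""
    out = set()
    for lhs, rhs in rules:
        start = s.find(lhs)
        while start != -1:
            out.add(s[:start] + rhs + s[start + len(lhs):])
            start = s.find(lhs, start + 1)
    return out


def explore(start, rules, steps):
    """Follow all rewriting paths in parallel for a fixed number of steps.

    Returns the final frontier and the frontier size after each step.
    """
    frontier = {start}
    sizes = [len(frontier)]
    for _ in range(steps):
        next_frontier = set()
        for s in frontier:
            next_frontier |= successors(s, rules)
        frontier = next_frontier
        sizes.append(len(frontier))
    return frontier, sizes


if __name__ == "__main__":
    final, sizes = explore("ab", RULES, 4)
    print(sizes)
```

    On a conventional computer the frontier must be held in memory and expanded string by string, which is where the space-for-time trade-off mentioned in the abstract becomes visible.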

  13. Programmable and autonomous computing machine made of biomolecules

    PubMed Central

    Benenson, Yaakov; Paz-Elizur, Tamar; Adar, Rivka; Keinan, Ehud; Livneh, Zvi; Shapiro, Ehud

    2013-01-01

    Devices that convert information from one form into another according to a definite procedure are known as automata. One such hypothetical device is the universal Turing machine [1], which stimulated work leading to the development of modern computers. The Turing machine and its special cases [2], including finite automata [3], operate by scanning a data tape, whose striking analogy to information-encoding biopolymers inspired several designs for molecular DNA computers [4–8]. Laboratory-scale computing using DNA and human-assisted protocols has been demonstrated [9–15], but the realization of computing devices operating autonomously on the molecular scale remains rare [16–20]. Here we describe a programmable finite automaton comprising DNA and DNA-manipulating enzymes that solves computational problems autonomously. The automaton’s hardware consists of a restriction nuclease and ligase, the software and input are encoded by double-stranded DNA, and programming amounts to choosing appropriate software molecules. Upon mixing solutions containing these components, the automaton processes the input molecule via a cascade of restriction, hybridization and ligation cycles, producing a detectable output molecule that encodes the automaton’s final state, and thus the computational result. In our implementation 10^12 automata sharing the same software run independently and in parallel on inputs (which could, in principle, be distinct) in 120 μl solution at room temperature at a combined rate of 10^9 transitions per second with a transition fidelity greater than 99.8%, consuming less than 10^-10 W. PMID:11719800
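    A finite automaton of the kind described above can be sketched in a few lines. The state names, the two-letter alphabet and the particular transition table below are assumptions made for illustration (this table tracks whether the number of 'b' symbols read so far is even); in the molecular device, the role of the table is played by the chosen software molecules and the final state is read from the output molecule.

```python
# Hypothetical "software": a transition table for a two-state, two-symbol automaton.
# S0 = even number of 'b' symbols seen so far, S1 = odd number.
TRANSITIONS = {
    ("S0", "a"): "S0",
    ("S0", "b"): "S1",
    ("S1", "a"): "S1",
    ("S1", "b"): "S0",
}


def run(input_symbols, start="S0", transitions=TRANSITIONS):
    """Scan the input left to right and return the final state."""
    state = start
    for symbol in input_symbols:
        state = transitions[(state, symbol)]
    return state


if __name__ == "__main__":
    for word in ["", "ab", "abba", "bab"]:
        print(repr(word), "->", run(word))
```

    Reprogramming this sketch means swapping the transition table, which mirrors the statement that programming amounts to choosing appropriate software molecules.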

  14. A Hybrid Computer Simulation to Generate the DNA Distribution of a Cell Population.

    ERIC Educational Resources Information Center

    Griebling, John L.; Adams, William S.

    1981-01-01

    Described is a method of simulating the formation of a DNA distribution, on which statistical results and experimentally measured parameters from DNA distribution and percent-labeled mitosis studies are combined. An EAI-680 and DECSystem-10 Hybrid Computer configuration are used. (Author/CS)

  15. Logical NAND and NOR Operations Using Algorithmic Self-assembly of DNA Molecules

    NASA Astrophysics Data System (ADS)

    Wang, Yanfeng; Cui, Guangzhao; Zhang, Xuncai; Zheng, Yan

    DNA self-assembly is the most advanced and versatile system that has been experimentally demonstrated for programmable construction of patterned systems on the molecular scale. It has been demonstrated that simple binary arithmetic and logical operations can be computed by the self-assembly of DNA tiles. Here we report a one-dimensional algorithmic self-assembly of DNA triple-crossover molecules that can be used to execute five steps of logical NAND and NOR operations on a string of binary bits. To achieve this, abstract tiles were translated into DNA tiles based on triple-crossover motifs. Serving as input for the computation, long single-stranded DNA molecules were used to nucleate growth of tiles into algorithmic crystals. Our method shows that engineered DNA self-assembly can be treated as a bottom-up design technique and is capable of designing DNA computer organization and architecture.

  16. The 'Biologically-Inspired Computing' Column

    NASA Technical Reports Server (NTRS)

    Hinchey, Mike

    2006-01-01

    The field of Biology changed dramatically in 1953, with the determination by Francis Crick and James Dewey Watson of the double helix structure of DNA. This discovery changed Biology for ever, allowing the sequencing of the human genome, and the emergence of a "new Biology" focused on DNA, genes, proteins, data, and search. Computational Biology and Bioinformatics heavily rely on computing to facilitate research into life and development. Simultaneously, an understanding of the biology of living organisms indicates a parallel with computing systems: molecules in living cells interact, grow, and transform according to the "program" dictated by DNA. Moreover, paradigms of Computing are emerging based on modelling and developing computer-based systems exploiting ideas that are observed in nature. This includes building into computer systems self-management and self-governance mechanisms that are inspired by the human body's autonomic nervous system, modelling evolutionary systems analogous to colonies of ants or other insects, and developing highly-efficient and highly-complex distributed systems from large numbers of (often quite simple) largely homogeneous components to reflect the behaviour of flocks of birds, swarms of bees, herds of animals, or schools of fish. This new field of "Biologically-Inspired Computing", often known in other incarnations by other names, such as: Autonomic Computing, Pervasive Computing, Organic Computing, Biomimetics, and Artificial Life, amongst others, is poised at the intersection of Computer Science, Engineering, Mathematics, and the Life Sciences. Successes have been reported in the fields of drug discovery, data communications, computer animation, control and command, exploration systems for space, undersea, and harsh environments, to name but a few, and augur much promise for future progress.

  17. Investigation of a Sybr-Green-Based Method to Validate DNA Sequences for DNA Computing

    DTIC Science & Technology

    2005-05-01

    OF A SYBR-GREEN-BASED METHOD TO VALIDATE DNA SEQUENCES FOR DNA COMPUTING 6. AUTHOR(S) Wendy Pogozelski, Salvatore Priore, Matthew Bernard ...simulated annealing. Biochemistry, 35, 14077-14089. 15 Pogozelski, W.K., Bernard, M.P. and Macula, A. (2004) DNA code validation using...and Clark, B.F.C. (eds) In RNA Biochemistry and Biotechnology, NATO ASI Series, Kluwer Academic Publishers. Zucker, M. and Stiegler, P. (1981

  18. Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

    DTIC Science & Technology

    2008-01-01

    complements of one another and the DNA duplex formed is a Watson-Crick (WC) duplex. However, there are many instances when the formation of non-WC...that the user’s requirements for probe selection are met based on the Watson-Crick probe locality within a target. The second type, called...AFRL-RI-RS-TR-2007-288 Final Technical Report January 2008 SUPERIMPOSED CODE THEORETIC ANALYSIS OF DNA CODES AND DNA COMPUTING

  19. Solving probability reasoning based on DNA strand displacement and probability modules.

    PubMed

    Zhang, Qiang; Wang, Xiaobiao; Wang, Xiaojun; Zhou, Changjun

    2017-12-01

    In computational biology, DNA strand displacement technology is used to simulate the computation process and has shown strong computing ability. Most researchers use it to solve logic problems, but it is only rarely used in probabilistic reasoning. To process probabilistic reasoning, a conditional probability derivation model and total probability model based on DNA strand displacement were established in this paper. The models were assessed through the game "read your mind." It has been shown to enable the application of probabilistic reasoning in genetic diagnosis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. 4P: fast computing of population genetics statistics from large DNA polymorphism panels

    PubMed Central

    Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio

    2015-01-01

    Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations. PMID:25628874

  1. Research on Image Encryption Based on DNA Sequence and Chaos Theory

    NASA Astrophysics Data System (ADS)

    Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin

    2018-04-01

    Nowadays encryption is a common technique to protect image data from unauthorized access. In recent years, many scientists have proposed various encryption algorithms based on DNA sequence to provide a new idea for the design of image encryption algorithm. Therefore, a new method of image encryption based on DNA computing technology is proposed in this paper, whose original image is encrypted by DNA coding and 1-D logistic chaotic mapping. First, the algorithm uses two modules as the encryption key. The first module uses the real DNA sequence, and the second module is made by one-dimensional logistic chaos mapping. Secondly, the algorithm uses DNA complementary rules to encode original image, and uses the key and DNA computing technology to compute each pixel value of the original image, so as to realize the encryption of the whole image. Simulation results show that the algorithm has good encryption effect and security.
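    The two ingredients named in the abstract, DNA coding of pixel values and a key stream from a one-dimensional logistic map, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the 2-bit-to-base mapping, the XOR combination step, the key parameters x0 and r, and all function names are hypothetical, and the real-DNA-sequence key module mentioned in the abstract is omitted.

```python
# Assumed coding rule: two bits per nucleotide (00->A, 01->C, 10->G, 11->T).
ENCODE = {0: "A", 1: "C", 2: "G", 3: "T"}
DECODE = {base: bits for bits, base in ENCODE.items()}


def byte_to_dna(value):
    """Encode one 8-bit pixel value as four nucleotides, most significant bits first."""
    return "".join(ENCODE[(value >> shift) & 0b11] for shift in (6, 4, 2, 0))


def dna_to_byte(bases):
    """Invert byte_to_dna."""
    value = 0
    for base in bases:
        value = (value << 2) | DECODE[base]
    return value


def logistic_keystream(x0, r, n):
    """Iterate x -> r * x * (1 - x) and quantise each value to a key byte."""
    x = x0
    stream = []
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)
    return stream


def encrypt(pixels, x0=0.3141, r=3.99):
    """XOR each pixel with a key byte (an assumed operation), then DNA-encode it."""
    keys = logistic_keystream(x0, r, len(pixels))
    return [byte_to_dna(p ^ k) for p, k in zip(pixels, keys)]


def decrypt(cipher, x0=0.3141, r=3.99):
    """Regenerate the same key stream and undo encrypt."""
    keys = logistic_keystream(x0, r, len(cipher))
    return [dna_to_byte(c) ^ k for c, k in zip(cipher, keys)]


if __name__ == "__main__":
    image_row = [0, 17, 128, 255, 42]
    cipher = encrypt(image_row)
    print(cipher)
    print(decrypt(cipher) == image_row)
```

    Decryption works only when the same x0 and r are supplied, so in this sketch those two values act as the key.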

  2. Abstractions for DNA circuit design.

    PubMed

    Lakin, Matthew R; Youssef, Simon; Cardelli, Luca; Phillips, Andrew

    2012-03-07

    DNA strand displacement techniques have been used to implement a broad range of information processing devices, from logic gates, to chemical reaction networks, to architectures for universal computation. Strand displacement techniques enable computational devices to be implemented in DNA without the need for additional components, allowing computation to be programmed solely in terms of nucleotide sequences. A major challenge in the design of strand displacement devices has been to enable rapid analysis of high-level designs while also supporting detailed simulations that include known forms of interference. Another challenge has been to design devices capable of sustaining precise reaction kinetics over long periods, without relying on complex experimental equipment to continually replenish depleted species over time. In this paper, we present a programming language for designing DNA strand displacement devices, which supports progressively increasing levels of molecular detail. The language allows device designs to be programmed using a common syntax and then analysed at varying levels of detail, with or without interference, without needing to modify the program. This allows a trade-off to be made between the level of molecular detail and the computational cost of analysis. We use the language to design a buffered architecture for DNA devices, capable of maintaining precise reaction kinetics for a potentially unbounded period. We test the effectiveness of buffered gates to support long-running computation by designing a DNA strand displacement system capable of sustained oscillations.

  3. Probabilistic simple sticker systems

    NASA Astrophysics Data System (ADS)

    Selvarajoo, Mathuri; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod

    2017-04-01

    A model for DNA computing using the recombination behavior of DNA molecules, known as a sticker system, was introduced by L. Kari, G. Paun, G. Rozenberg, A. Salomaa, and S. Yu in the paper "DNA computing, sticker systems and universality" (Acta Informatica, vol. 35, pp. 401-420, 1998). A sticker system uses the Watson-Crick complementary feature of DNA molecules: starting from the incomplete double stranded sequences, and iteratively using sticking operations until a complete double stranded sequence is obtained. It is known that sticker systems with finite sets of axioms and sticker rules generate only regular languages. Hence, different types of restrictions have been considered to increase the computational power of sticker systems. Recently, a variant of restricted sticker systems, called probabilistic sticker systems, has been introduced [4]. In this variant, the probabilities are initially associated with the axioms, and the probability of a generated string is computed by multiplying the probabilities of all occurrences of the initial strings in the computation of the string. Strings for the language are selected according to some probabilistic requirements. In this paper, we study fundamental properties of probabilistic simple sticker systems. We prove that the probabilistic enhancement increases the computational power of simple sticker systems.
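    The probability rule stated above, a product over all occurrences of the initial strings used in a derivation, can be written out directly. In the sketch below the axiom labels, their probabilities and the threshold-style selection are hypothetical; axioms are represented only by names, not by actual incomplete double-stranded sequences.

```python
from math import prod

# Hypothetical axioms with assumed probabilities.
AXIOM_PROBS = {"x1": 0.5, "x2": 0.25}


def derivation_probability(used_axioms, axiom_probs=AXIOM_PROBS):
    """Multiply the probabilities of every axiom occurrence used in one derivation."""
    return prod(axiom_probs[name] for name in used_axioms)


def select(derivations, threshold):
    """Keep the generated strings whose probability exceeds a threshold.

    This is one possible probabilistic requirement; other cut conditions
    (equality, intervals) could be substituted.
    """
    return {
        string
        for string, used in derivations.items()
        if derivation_probability(used) > threshold
    }


if __name__ == "__main__":
    # Each generated string is mapped to the axiom occurrences used to build it.
    derivations = {"w1": ["x1"], "w2": ["x1", "x1", "x2"], "w3": ["x2"]}
    print(select(derivations, 0.2))
```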

  4. The role of structural parameters in DNA cyclization

    DOE PAGES

    Alexandrov, Ludmil B.; Bishop, Alan R.; Rasmussen, Kim O.; ...

    2016-02-04

    The intrinsic bendability of DNA plays an important role in a myriad of essential cellular mechanisms. The flexibility of a DNA fragment can be experimentally and computationally examined by its propensity for cyclization, quantified by the Jacobson-Stockmayer J factor. In this paper, we use a well-established coarse-grained three-dimensional model of DNA and seven distinct sets of experimentally and computationally derived conformational parameters of the double helix to evaluate the role of structural parameters in calculating DNA cyclization.

  5. A comparative approach for the investigation of biological information processing: An examination of the structure and function of computer hard drives and DNA

    PubMed Central

    2010-01-01

    Background: The robust storage, updating and utilization of information are necessary for the maintenance and perpetuation of dynamic systems. These systems can exist as constructs of metal-oxide semiconductors and silicon, as in a digital computer, or in the "wetware" of organic compounds, proteins and nucleic acids that make up biological organisms. We propose that there are essential functional properties of centralized information-processing systems; for digital computers these properties reside in the computer's hard drive, and for eukaryotic cells they are manifest in the DNA and associated structures. Methods: Presented herein is a descriptive framework that compares DNA and its associated proteins and sub-nuclear structure with the structure and function of the computer hard drive. We identify four essential properties of information for a centralized storage and processing system: (1) orthogonal uniqueness, (2) low level formatting, (3) high level formatting and (4) translation of stored to usable form. The corresponding aspects of the DNA complex and a computer hard drive are categorized using this classification. This is intended to demonstrate a functional equivalence between the components of the two systems, and thus the systems themselves. Results: Both the DNA complex and the computer hard drive contain components that fulfill the essential properties of a centralized information storage and processing system. The functional equivalence of these components provides insight into both the design process of engineered systems and the evolved solutions addressing similar system requirements. However, there are points where the comparison breaks down, particularly when there are externally imposed information-organizing structures on the computer hard drive. A specific example of this is the imposition of the File Allocation Table (FAT) during high level formatting of the computer hard drive and the subsequent loading of an operating system (OS). Biological systems do not have an external source for a map of their stored information or for an operational instruction set; rather, they must contain an organizational template conserved within their intra-nuclear architecture that "manipulates" the laws of chemistry and physics into a highly robust instruction set. We propose that the epigenetic structure of the intra-nuclear environment and the non-coding RNA may play the roles of a Biological File Allocation Table (BFAT) and biological operating system (Bio-OS) in eukaryotic cells. Conclusions: The comparison of functional and structural characteristics of the DNA complex and the computer hard drive leads to a new descriptive paradigm that identifies the DNA as a dynamic storage system of biological information. This system is embodied in an autonomous operating system that inductively follows organizational structures, data hierarchy and executable operations that are well understood in the computer science industry. Characterizing the "DNA hard drive" in this fashion can lead to insights arising from discrepancies in the descriptive framework, particularly with respect to positing the role of epigenetic processes in an information-processing context. Further expansions arising from this comparison include the view of cells as parallel computing machines and a new approach towards characterizing cellular control systems. PMID:20092652

  6. A comparative approach for the investigation of biological information processing: an examination of the structure and function of computer hard drives and DNA.

    PubMed

    D'Onofrio, David J; An, Gary

    2010-01-21

    The robust storage, updating and utilization of information are necessary for the maintenance and perpetuation of dynamic systems. These systems can exist as constructs of metal-oxide semiconductors and silicon, as in a digital computer, or in the "wetware" of organic compounds, proteins and nucleic acids that make up biological organisms. We propose that there are essential functional properties of centralized information-processing systems; for digital computers these properties reside in the computer's hard drive, and for eukaryotic cells they are manifest in the DNA and associated structures. Presented herein is a descriptive framework that compares DNA and its associated proteins and sub-nuclear structure with the structure and function of the computer hard drive. We identify four essential properties of information for a centralized storage and processing system: (1) orthogonal uniqueness, (2) low level formatting, (3) high level formatting and (4) translation of stored to usable form. The corresponding aspects of the DNA complex and a computer hard drive are categorized using this classification. This is intended to demonstrate a functional equivalence between the components of the two systems, and thus the systems themselves. Both the DNA complex and the computer hard drive contain components that fulfill the essential properties of a centralized information storage and processing system. The functional equivalence of these components provides insight into both the design process of engineered systems and the evolved solutions addressing similar system requirements. However, there are points where the comparison breaks down, particularly when there are externally imposed information-organizing structures on the computer hard drive. A specific example of this is the imposition of the File Allocation Table (FAT) during high level formatting of the computer hard drive and the subsequent loading of an operating system (OS). Biological systems do not have an external source for a map of their stored information or for an operational instruction set; rather, they must contain an organizational template conserved within their intra-nuclear architecture that "manipulates" the laws of chemistry and physics into a highly robust instruction set. We propose that the epigenetic structure of the intra-nuclear environment and the non-coding RNA may play the roles of a Biological File Allocation Table (BFAT) and biological operating system (Bio-OS) in eukaryotic cells. The comparison of functional and structural characteristics of the DNA complex and the computer hard drive leads to a new descriptive paradigm that identifies the DNA as a dynamic storage system of biological information. This system is embodied in an autonomous operating system that inductively follows organizational structures, data hierarchy and executable operations that are well understood in the computer science industry. Characterizing the "DNA hard drive" in this fashion can lead to insights arising from discrepancies in the descriptive framework, particularly with respect to positing the role of epigenetic processes in an information-processing context. Further expansions arising from this comparison include the view of cells as parallel computing machines and a new approach towards characterizing cellular control systems.

  7. DENA: A Configurable Microarchitecture and Design Flow for Biomedical DNA-Based Logic Design.

    PubMed

    Beiki, Zohre; Jahanian, Ali

    2017-10-01

    DNA is known as the building block for storing the life codes and transferring the genetic features through the generations. However, it is found that DNA strands can be used for a new type of computation that opens fascinating horizons in computational medicine. Significant contributions are addressed on design of DNA-based logic gates for medical and computational applications but there are serious challenges for designing the medium and large-scale DNA circuits. In this paper, a new microarchitecture and corresponding design flow is proposed to facilitate the design of multistage large-scale DNA logic systems. Feasibility and efficiency of the proposed microarchitecture are evaluated by implementing a full adder and, then, its cascadability is determined by implementing a multistage 8-bit adder. Simulation results show the highlight features of the proposed design style and microarchitecture in terms of the scalability, implementation cost, and signal integrity of the DNA-based logic system compared to the traditional approaches.

  8. A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.

    PubMed

    Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan

    2017-11-01

    The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
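    The trade-off between client-side computation and available bandwidth can be made concrete with a toy cost model. Everything in the sketch below is an assumption made for illustration (the function names, the units, and the idea that compression and transmission overlap perfectly in a pipeline); it is not the architecture proposed in the paper.

```python
def raw_transfer_seconds(size_mb, bandwidth_mb_s):
    """Time to send the data uncompressed."""
    return size_mb / bandwidth_mb_s


def compressed_transfer_seconds(size_mb, bandwidth_mb_s, compress_mb_s, ratio):
    """Time to compress on the client and send the result.

    Assumes compression and sending are pipelined, so the slower of the two
    stages dominates. `ratio` is compressed size divided by original size.
    """
    compress_time = size_mb / compress_mb_s
    send_time = size_mb * ratio / bandwidth_mb_s
    return max(compress_time, send_time)


def choose_strategy(size_mb, bandwidth_mb_s, compress_mb_s, ratio):
    """Pick whichever option is estimated to finish first."""
    raw = raw_transfer_seconds(size_mb, bandwidth_mb_s)
    packed = compressed_transfer_seconds(size_mb, bandwidth_mb_s, compress_mb_s, ratio)
    return "compress" if packed < raw else "raw"


if __name__ == "__main__":
    # Slow link, fast client: compression pays off.
    print(choose_strategy(1000, 10, 50, 0.25))
    # Fast link, slow client: sending raw data is quicker.
    print(choose_strategy(1000, 100, 20, 0.25))
```

    Re-evaluating such a choice whenever the measured bandwidth changes is one simple way to picture the dynamic adaptation the abstract describes.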

  9. Combining H/D exchange mass spectroscopy and computational docking reveals extended DNA-binding surface on uracil-DNA glycosylase

    PubMed Central

    Roberts, Victoria A.; Pique, Michael E.; Hsu, Simon; Li, Sheng; Slupphaug, Geir; Rambo, Robert P.; Jamison, Jonathan W.; Liu, Tong; Lee, Jun H.; Tainer, John A.; Ten Eyck, Lynn F.; Woods, Virgil L.

    2012-01-01

    X-ray crystallography provides excellent structural data on protein–DNA interfaces, but crystallographic complexes typically contain only small fragments of large DNA molecules. We present a new approach that can use longer DNA substrates and reveal new protein–DNA interactions even in extensively studied systems. Our approach combines rigid-body computational docking with hydrogen/deuterium exchange mass spectrometry (DXMS). DXMS identifies solvent-exposed protein surfaces; docking is used to create a 3-dimensional model of the protein–DNA interaction. We investigated the enzyme uracil-DNA glycosylase (UNG), which detects and cleaves uracil from DNA. UNG was incubated with a 30 bp DNA fragment containing a single uracil, giving the complex with the abasic DNA product. Compared with free UNG, the UNG–DNA complex showed increased solvent protection at the UNG active site and at two regions outside the active site: residues 210–220 and 251–264. Computational docking also identified these two DNA-binding surfaces, but neither shows DNA contact in UNG–DNA crystallographic structures. Our results can be explained by separation of the two DNA strands on one side of the active site. These non-sequence-specific DNA-binding surfaces may aid local uracil search, contribute to binding the abasic DNA product and help present the DNA product to APE-1, the next enzyme on the DNA-repair pathway. PMID:22492624

  10. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

    PubMed

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
    This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
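
    The abstract mentions summarizing binding affinities with position weight matrices. As a rough illustration of what such a matrix is, the sketch below builds a log-odds matrix from a set of aligned binding sites. The pseudocount, the uniform background of 0.25, and the function names are assumptions made for this example and are not taken from the database.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds matrix from equal-length aligned sites (one dict per position)."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, sequence):
    """Sum of per-position log-odds for a sequence of the matrix length."""
    return sum(column[base] for column, base in zip(pwm, sequence))
```

    A sequence that matches the most frequent base at every position receives the highest score; the example sites used to exercise the code would be made up, not real helix-turn-helix data.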

  11. Petri-net-based 2D design of DNA walker circuits.

    PubMed

    Gilbert, David; Heiner, Monika; Rohr, Christian

    2018-01-01

    We consider localised DNA computation, where a DNA strand walks along a binary decision graph to compute a binary function. One of the challenges for the design of reliable walker circuits consists in leakage transitions, which occur when a walker jumps into another branch of the decision graph. We automatically identify leakage transitions, which allows for a detailed qualitative and quantitative assessment of circuit designs, design comparison, and design optimisation. The ability to identify leakage transitions is an important step in the process of optimising DNA circuit layouts where the aim is to minimise the computational error inherent in a circuit while minimising the area of the circuit. Our 2D modelling approach of DNA walker circuits relies on coloured stochastic Petri nets which enable functionality, topology and dimensionality all to be integrated in one two-dimensional model. Our modelling and analysis approach can be easily extended to 3-dimensional walker systems.

  12. Research Advances: DNA Computing Targets West Nile Virus, Other Deadly Diseases, and Tic-Tac-Toe; Marijuana Component May Offer Hope for Alzheimer's Disease Treatment; New Wound Dressing May Lead to Maggot Therapy--Without the Maggots

    ERIC Educational Resources Information Center

    King, Angela G.

    2007-01-01

    This article presents three reports of research advances. The first report describes a deoxyribonucleic acid (DNA)-based computer that could lead to faster, more accurate tests for diagnosing West Nile Virus and bird flu. Representing the first "medium-scale integrated molecular circuit," it is the most powerful computing device of its type to…

  13. Redesigning the specificity of protein-DNA interactions with Rosetta.

    PubMed

    Thyme, Summer; Baker, David

    2014-01-01

    Building protein tools that can selectively bind or cleave specific DNA sequences requires efficient technologies for modifying protein-DNA interactions. Computational design is one method for accomplishing this goal. In this chapter, we present the current state of protein-DNA interface design with the Rosetta macromolecular modeling program. The LAGLIDADG endonuclease family of DNA-cleaving enzymes, under study as potential gene therapy reagents, has been the main testing ground for these in silico protocols. At this time, the computational methods are most useful for designing endonuclease variants that can accommodate small numbers of target site substitutions. Attempts to engineer for more extensive interface changes will likely benefit from an approach that uses the computational design results in conjunction with a high-throughput directed evolution or screening procedure. The family of enzymes presents an engineering challenge because their interfaces are highly integrated and there is significant coordination between the binding and catalysis events. Future developments in the computational algorithms depend on experimental feedback to improve understanding and modeling of these complex enzymatic features. This chapter presents both the basic method of design that has been successfully used to modulate specificity and more advanced procedures that incorporate DNA flexibility and other properties that are likely necessary for reliable modeling of more extensive target site changes.

  14. An autonomous molecular computer for logical control of gene expression.

    PubMed

    Benenson, Yaakov; Gil, Binyamin; Ben-Dor, Uri; Adar, Rivka; Shapiro, Ehud

    2004-05-27

    Early biomolecular computer research focused on laboratory-scale, human-operated computers for complex computational problems. Recently, simple molecular-scale autonomous programmable computers were demonstrated allowing both input and output information to be in molecular form. Such computers, using biological molecules as input data and biologically active molecules as outputs, could produce a system for 'logical' control of biological processes. Here we describe an autonomous biomolecular computer that, at least in vitro, logically analyses the levels of messenger RNA species, and in response produces a molecule capable of affecting levels of gene expression. The computer operates at a concentration of close to a trillion computers per microlitre and consists of three programmable modules: a computation module, that is, a stochastic molecular automaton; an input module, by which specific mRNA levels or point mutations regulate software molecule concentrations, and hence automaton transition probabilities; and an output module, capable of controlled release of a short single-stranded DNA molecule. This approach might be applied in vivo to biochemical sensing, genetic engineering and even medical diagnosis and treatment. As a proof of principle we programmed the computer to identify and analyse mRNA of disease-related genes associated with models of small-cell lung cancer and prostate cancer, and to produce a single-stranded DNA molecule modelled after an anticancer drug.

  15. Manipulation of oligonucleotides immobilized on solid supports - DNA computations on surfaces

    NASA Astrophysics Data System (ADS)

    Liu, Qinghua

    The manipulation of DNA oligonucleotides immobilized on various solid supports has been studied intensively, especially in the area of surface hybridization. Recently, surface-based biotechnology has been applied to the area of molecular computing. These surface-based methods have advantages with regard to ease of handling, facile purification, and less interference when compared to solution methodologies. This dissertation describes the investigation of molecular approaches to DNA computing. The feasibility of encoding a bit (0 or 1) of information for DNA-based computations at the single nucleotide level was studied, particularly with regard to the efficiency and specificity of hybridization discrimination. Both gold and glass surfaces, with addressed arrays of 32 oligonucleotides, were employed with similar hybridization results. Although single-base discrimination may be achieved in the system, it is at the cost of a severe decrease in the efficiency of hybridization to perfectly matched sequences. This compromises the utility of single nucleotide encoding for DNA computing applications in the absence of some additional mechanism for increasing specificity. Several methods are suggested including a multiple-base encoding strategy. The multiple-base encoding strategy was employed to develop a prototype DNA computer. The approach was demonstrated by solving a small example of the Satisfiability (SAT) problem, an NP-complete problem in Boolean logic. 16 distinct DNA oligonucleotides, encoding all candidate solutions to the 4-variable-4-clause-3-SAT problem, were immobilized on a gold surface in the non-addressed format. Four cycles of MARK (hybridization), DESTROY (enzymatic destruction) and UNMARK (denaturation) were performed, which identified and eliminated members of the set which were not solutions to the problem. 
    Determination of the answer was accomplished in the READOUT (sequence identification) operation by PCR amplification of the remaining molecules and hybridization to an addressed array. Four answers were determined and the S/N ratio between correct and incorrect solutions ranged from 10 to 777, making discrimination between correct and incorrect solutions to the problem straightforward. Additionally, studies of enzymatic manipulations of DNA molecules on surfaces suggested the use of E. coli Exonuclease I (Exo I) and perhaps EarI in the DESTROY operation.
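
    The MARK, DESTROY, UNMARK and READOUT cycle described above can be mimicked in software. The sketch below assumes that, in each cycle, the strands satisfying the current clause are the ones marked and therefore protected from destruction; the abstract does not spell this out. The clause set is a hypothetical 4-variable, 4-clause instance, not the one solved in the dissertation.

```python
from itertools import product

# Hypothetical instance: each clause is a list of (variable index, required value).
CLAUSES = [
    [(0, 1), (1, 1), (2, 1)],
    [(0, 1), (2, 0), (3, 1)],
    [(1, 0), (2, 1), (3, 0)],
    [(0, 0), (2, 0), (3, 0)],
]


def satisfies(strand, clause):
    """True if the bit string sets at least one literal of the clause."""
    return any(strand[var] == value for var, value in clause)


def surface_sat(clauses, n_vars=4):
    # Start with every candidate assignment "immobilized" on the surface.
    surface = set(product((0, 1), repeat=n_vars))
    for clause in clauses:                    # one cycle per clause
        marked = {s for s in surface if satisfies(s, clause)}  # MARK
        surface = marked                      # DESTROY removes unmarked strands
        # UNMARK: marks are simply discarded before the next cycle
    return sorted(surface)                    # READOUT of the survivors
```

    With four variables the surface starts with 16 strands, matching the 16 oligonucleotides in the abstract; only assignments that survive every cycle satisfy all clauses.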

  16. Exploring the Feasibility of a DNA Computer: Design of an ALU Using Sticker-Based DNA Model.

    PubMed

    Sarkar, Mayukh; Ghosal, Prasun; Mohanty, Saraju P

    2017-09-01

    Since its inception, DNA computing has advanced to offer an extremely powerful, energy-efficient emerging technology for solving hard computational problems with its inherent massive parallelism and extremely high data density. It would be much more powerful and general purpose if combined, through a suitable ALU, with the well-known algorithmic solutions that exist for conventional computing architectures. Thus, a specifically designed DNA Arithmetic and Logic Unit (ALU) that can address operations suitable for both domains can narrow the gap between the two. An ALU must be able to perform all logic operations, including NOT, OR, AND, XOR, NOR, NAND, and XNOR; compare and shift operations; and integer and floating point arithmetic operations (addition, subtraction, multiplication, and division). In this paper, the design of an ALU is proposed using the sticker-based DNA model, with an experimental feasibility analysis. The novelties of this paper are manifold. First, the integer arithmetic operations performed here are 2s complement arithmetic, and the floating point operations follow the IEEE 754 floating point format, closely resembling a conventional ALU. Also, the output of each operation can be reused for any subsequent operation, so any algorithm or program logic that users can think of can be implemented directly on the DNA computer without modification. Second, once the basic operations of the sticker model can be automated, the implementations proposed in this paper become highly suitable for designing a fully automated ALU. Third, the proposed approaches are easy to implement. Finally, these approaches can work on sufficiently large binary numbers.

  17. Evaluating forensic DNA mixtures with contributors of different structured ethnic origins: a computer software.

    PubMed

    Hu, Yue-Qing; Fung, Wing K

    2003-08-01

    The effect of a structured population on the likelihood ratio of a DNA mixture has been studied by the current authors and others. In practice, contributors to a DNA mixture may belong to different ethnic/racial origins, a situation especially common in multi-racial countries such as the USA and Singapore. We have developed computer software, available on the web, for evaluating DNA mixtures in multi-structured populations. The software can deal with various DNA mixture problems that cannot be handled by the methods given in a recent article by Fung and Hu.

  18. Synthetic Biology: Knowledge Accessed by Everyone (Open Sources)

    ERIC Educational Resources Information Center

    Sánchez Reyes, Patricia Margarita

    2016-01-01

    Using the principles of biology, along with engineering and with the help of computers, scientists manage to copy DNA sequences from nature and use them to create new organisms. DNA is created through engineering and computer science, managing to create life inside a laboratory. We cannot dismiss the role that synthetic biology could lead in…

  19. Solving traveling salesman problems with DNA molecules encoding numerical values.

    PubMed

    Lee, Ji Youn; Shin, Soo-Yong; Park, Tai Hyun; Zhang, Byoung-Tak

    2004-12-01

    We introduce a DNA encoding method to represent numerical values and a biased molecular algorithm based on the thermodynamic properties of DNA. DNA strands are designed to encode real values by variation of their melting temperatures. The thermodynamic properties of DNA are used for effective local search of optimal solutions using biochemical techniques, such as denaturation temperature gradient polymerase chain reaction and temperature gradient gel electrophoresis. The proposed method was successfully applied to the traveling salesman problem, an instance of optimization problems on weighted graphs. This work extends the capability of DNA computing to solving numerical optimization problems, which is contrasted with other DNA computing methods focusing on logical problem solving.
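
    To illustrate how a melting temperature can carry a numerical value, the sketch below estimates the melting temperature of a short strand with the Wallace rule of thumb (2 °C per A/T base, 4 °C per G/C base). This rule is a common approximation for short oligonucleotides and is used here only as an assumption; the abstract does not state which thermodynamic model or weight-to-sequence mapping the authors used.

```python
def wallace_tm(sequence):
    """Approximate melting temperature (deg C) of a short oligonucleotide."""
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def rank_by_tm(sequences):
    """Order candidate strands from lowest to highest estimated Tm."""
    return sorted(sequences, key=wallace_tm)
```

    Strands of equal length but different GC content get different estimated melting temperatures, so a temperature gradient can in principle separate them by the value they encode.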

  20. Approaching mathematical model of the immune network based DNA Strand Displacement system.

    PubMed

    Mardian, Rizki; Sekiyama, Kosuke; Fukuda, Toshio

    2013-12-01

    One of the biggest obstacles in molecular programming is that there is still no direct method to compile an existing mathematical model into biochemical reactions in order to solve a computational problem. In this paper, the implementation of a DNA Strand Displacement system based on nature-inspired computation is examined. Using Immune Network Theory and Chemical Reaction Networks, the compilation of DNA-based operations is defined and the formulation of its mathematical model is derived. Furthermore, the implementation of this system is compared with a conventional implementation using silicon-based programming. The results obtained show a positive correlation between the two. One possible application of this DNA-based model is a decision-making scheme for an intelligent computer or molecular robot.

  1. Computational design of co-assembling protein-DNA nanowires

    NASA Astrophysics Data System (ADS)

    Mou, Yun; Yu, Jiun-Yann; Wannier, Timothy M.; Guo, Chin-Lin; Mayo, Stephen L.

    2015-09-01

    Biomolecular self-assemblies are of great interest to nanotechnologists because of their functional versatility and their biocompatibility. Over the past decade, sophisticated single-component nanostructures composed exclusively of nucleic acids, peptides and proteins have been reported, and these nanostructures have been used in a wide range of applications, from drug delivery to molecular computing. Despite these successes, the development of hybrid co-assemblies of nucleic acids and proteins has remained elusive. Here we use computational protein design to create a protein-DNA co-assembling nanomaterial whose assembly is driven via non-covalent interactions. To achieve this, a homodimerization interface is engineered onto the Drosophila Engrailed homeodomain (ENH), allowing the dimerized protein complex to bind to two double-stranded DNA (dsDNA) molecules. By varying the arrangement of protein-binding sites on the dsDNA, an irregular bulk nanoparticle or a nanowire with single-molecule width can be spontaneously formed by mixing the protein and dsDNA building blocks. We characterize the protein-DNA nanowire using fluorescence microscopy, atomic force microscopy and X-ray crystallography, confirming that the nanowire is formed via the proposed mechanism. This work lays the foundation for the development of new classes of protein-DNA hybrid materials. Further applications can be explored by incorporating DNA origami, DNA aptamers and/or peptide epitopes into the protein-DNA framework presented here.

  2. HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing

    PubMed Central

    Karimi, Ramin; Hajdu, Andras

    2016-01-01

    Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel with this effort, and in response to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is much needed. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline HTSFinder (high-throughput signature finder) with a corresponding k-mer generator GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in the arbitrarily selected target and nontarget databases. Hadoop and MapReduce as parallel and distributed computing tools with commodity hardware are used in this pipeline. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
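
    The k-mer based signature search described above can be sketched on a single machine. The definitions used here are assumptions for illustration: a "common" signature is a k-mer found in every target genome and in no nontarget genome, and a "unique" signature is a k-mer found in exactly one target genome and in no nontarget genome. The function names are hypothetical, and the sketch does not reproduce the Hadoop/MapReduce pipeline.

```python
from collections import Counter


def kmers(sequence, k):
    """Set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def signatures(targets, nontargets, k):
    """Return (unique, common) k-mer signature sets under the stated assumptions."""
    target_sets = [kmers(genome, k) for genome in targets]
    nontarget = set().union(*(kmers(genome, k) for genome in nontargets))

    # Count in how many target genomes each k-mer occurs.
    presence = Counter(kmer for kmer_set in target_sets for kmer in kmer_set)

    unique = {kmer for kmer, n in presence.items() if n == 1} - nontarget
    common = {kmer for kmer, n in presence.items() if n == len(targets)} - nontarget
    return unique, common
```

    In a distributed setting, the per-genome k-mer extraction corresponds naturally to a map step and the presence counting to a reduce step.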

  3. HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing.

    PubMed

    Karimi, Ramin; Hajdu, Andras

    2016-01-01

    Comprehensive efforts toward low-cost sequencing in the past few years have led to the growth of complete genome databases. In parallel with this effort, and in response to a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is much needed. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline HTSFinder (high-throughput signature finder) with a corresponding k-mer generator GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in the arbitrarily selected target and nontarget databases. Hadoop and MapReduce as parallel and distributed computing tools with commodity hardware are used in this pipeline. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis.

  4. OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid.

    PubMed

    Poehlman, William L; Rynge, Mats; Branton, Chris; Balamurugan, D; Feltus, Frank A

    2016-01-01

    High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments.

  5. OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid

    PubMed Central

    Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.

    2016-01-01

    High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617

  6. An autonomous molecular computer for logical control of gene expression

    PubMed Central

    Benenson, Yaakov; Gil, Binyamin; Ben-Dor, Uri; Adar, Rivka; Shapiro, Ehud

    2013-01-01

    Early biomolecular computer research focused on laboratory-scale, human-operated computers for complex computational problems [1–7]. Recently, simple molecular-scale autonomous programmable computers were demonstrated [8–15] allowing both input and output information to be in molecular form. Such computers, using biological molecules as input data and biologically active molecules as outputs, could produce a system for ‘logical’ control of biological processes. Here we describe an autonomous biomolecular computer that, at least in vitro, logically analyses the levels of messenger RNA species, and in response produces a molecule capable of affecting levels of gene expression. The computer operates at a concentration of close to a trillion computers per microlitre and consists of three programmable modules: a computation module, that is, a stochastic molecular automaton [12–17]; an input module, by which specific mRNA levels or point mutations regulate software molecule concentrations, and hence automaton transition probabilities; and an output module, capable of controlled release of a short single-stranded DNA molecule. This approach might be applied in vivo to biochemical sensing, genetic engineering and even medical diagnosis and treatment. As a proof of principle we programmed the computer to identify and analyse mRNA of disease-related genes [18–22] associated with models of small-cell lung cancer and prostate cancer, and to produce a single-stranded DNA molecule modelled after an anticancer drug. PMID:15116117

  7. Automatic image analysis and spot classification for detection of pathogenic Escherichia coli on glass slide DNA microarrays

    USDA-ARS's Scientific Manuscript database

    A computer algorithm was created to inspect scanned images from DNA microarray slides developed to rapidly detect and genotype E. coli O157 virulent strains. The algorithm computes centroid locations for signal and background pixels in RGB space and defines a plane perpendicular to the line connect...

  8. Making Ordered DNA and Protein Structures from Computer-Printed Transparency Film Cut-Outs

    ERIC Educational Resources Information Center

    Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo

    2009-01-01

    Instructions are given for building physical scale models of ordered structures of B-form DNA, protein [alpha]-helix, and parallel and antiparallel protein [beta]-pleated sheets made from colored computer printouts designed for transparency film sheets. Cut-outs from these sheets are easily assembled. Conventional color coding for atoms is used…

  9. WE-DE-202-00: Connecting Radiation Physics with Computational Biology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicists’ perspective. Learning Objectives: (1) Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels; (2) Learn about the mechanisms of action underlying the induction, repair and biological processing of damage to DNA and other constituents; (3) Understand how effects and processes at one biological scale impact biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630).

  10. Electron Nuclear Dynamics Simulations of Proton Cancer Therapy Reactions: Water Radiolysis and Proton- and Electron-Induced DNA Damage in Computational Prototypes.

    PubMed

    Teixeira, Erico S; Uppulury, Karthik; Privett, Austin J; Stopera, Christopher; McLaurin, Patrick M; Morales, Jorge A

    2018-05-06

    Proton cancer therapy (PCT) utilizes high-energy proton projectiles to obliterate cancerous tumors with low damage to healthy tissues and without the side effects of X-ray therapy. The healing action of the protons results from their damage on cancerous cell DNA. Despite established clinical use, the chemical mechanisms of PCT reactions at the molecular level remain elusive. This situation prevents a rational design of PCT that can maximize its therapeutic power and minimize its side effects. The incomplete characterization of PCT reactions is partially due to the health risks associated with experimental/clinical techniques applied to human subjects. To overcome this situation, we are conducting time-dependent and non-adiabatic computer simulations of PCT reactions with the electron nuclear dynamics (END) method. Herein, we present a review of our previous and new END research on three fundamental types of PCT reactions: water radiolysis reactions, proton-induced DNA damage and electron-induced DNA damage. These studies are performed on the computational prototypes: proton + H₂O clusters, proton + DNA/RNA bases and + cytosine nucleotide, and electron + cytosine nucleotide + H₂O. These simulations provide chemical mechanisms and dynamical properties of the selected PCT reactions in comparison with available experimental and alternative computational results.

  11. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
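
    The abstract above summarizes binding affinities with position weight matrices. As a rough illustration of that data structure only, the following Python sketch builds a log-odds matrix from a handful of aligned sites. The example sites, the pseudocount, the uniform background, and the function names are all assumptions made for illustration; they are not taken from the database or its software.

```python
import math
from collections import Counter

def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds score} dict per aligned column.

    Assumes equal-length sites over A/C/G/T and a uniform background.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for column in zip(*sites):
        counts = Counter(column)
        pwm.append({
            base: math.log2(((counts.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, sequence):
    """Sum the per-position log-odds scores of a candidate site."""
    return sum(column[base] for column, base in zip(pwm, sequence))

# Hypothetical aligned sites, invented for this sketch.
sites = ["TTGACA", "TTGACT", "TTTACA", "TAGACA"]
pwm = position_weight_matrix(sites)
```

    A sequence close to the consensus of the input sites scores higher than an unrelated one, which is the property that lets such a matrix serve as a compact summary of binding preference.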

  12. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE PAGES

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.

  13. A spatially localized architecture for fast and modular DNA computing

    NASA Astrophysics Data System (ADS)

    Chatterjee, Gourab; Dalchau, Neil; Muscat, Richard A.; Phillips, Andrew; Seelig, Georg

    2017-09-01

    Cells use spatial constraints to control and accelerate the flow of information in enzyme cascades and signalling networks. Synthetic silicon-based circuitry similarly relies on spatial constraints to process information. Here, we show that spatial organization can be a similarly powerful design principle for overcoming limitations of speed and modularity in engineered molecular circuits. We create logic gates and signal transmission lines by spatially arranging reactive DNA hairpins on a DNA origami. Signal propagation is demonstrated across transmission lines of different lengths and orientations and logic gates are modularly combined into circuits that establish the universality of our approach. Because reactions preferentially occur between neighbours, identical DNA hairpins can be reused across circuits. Co-localization of circuit elements decreases computation time from hours to minutes compared to circuits with diffusible components. Detailed computational models enable predictive circuit design. We anticipate our approach will motivate using spatial constraints for future molecular control circuit designs.

  14. DNA Compass: a secure, client-side site for navigating personal genetic information

    PubMed Central

    Curnin, Charles; Gordon, Assaf; Erlich, Yaniv

    2017-01-01

    Abstract Motivation: Millions of individuals have access to raw genomic data using direct-to-consumer companies. The advent of large-scale sequencing projects, such as the Precision Medicine Initiative, will further increase the number of individuals with access to their own genomic information. However, querying genomic data requires a computer terminal and computational skill to analyze the data—an impediment for the general public. Results: DNA Compass is a website designed to empower the public by enabling simple navigation of personal genomic data. Users can query the status of their genomic variants for over 1658 markers or tens of millions of documented single nucleotide polymorphisms (SNPs). DNA Compass presents the relevant genotypes of the user side-by-side with explanatory scientific resources. The genotype data never leaves the user’s computer, a feature that provides improved security and performance. More than 12 000 unique users, mainly from the general genetic genealogy community, have already used DNA Compass, demonstrating its utility. Availability and Implementation: DNA Compass is freely available on https://compass.dna.land. Contact: yaniv@cs.columbia.edu PMID:28334237

  15. A new method for enhancer prediction based on deep belief network.

    PubMed

    Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong

    2017-10-16

    Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers are independent of the orientation of, and distance to, their target genes, accurately predicting distal enhancers is challenging. In the past years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.
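
    The abstract names GC content as one relevant sequence feature. A minimal Python sketch of how such a feature could be computed is given below; the function names, window size, and step are assumptions chosen for illustration and are not taken from EnhancerDBN.

```python
def gc_content(sequence):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)

def windowed_gc(sequence, window=4, step=2):
    """GC content of each sliding window; window and step are illustrative defaults."""
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, step)
    ]

profile = windowed_gc("GGCCAATT", window=4, step=2)
```

    A per-window profile like this could be fed to a classifier alongside methylation and histone-modification signals, although the feature encoding actually used by the authors is not described in the abstract.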

  16. Computing Life

    ERIC Educational Resources Information Center

    National Institute of General Medical Sciences (NIGMS), 2009

    2009-01-01

    Computer advances now let researchers quickly search through DNA sequences to find gene variations that could lead to disease, simulate how flu might spread through one's school, and design three-dimensional animations of molecules that rival any video game. By teaming computers and biology, scientists can answer new and old questions that could…

  17. Molecular robots with sensors and intelligence.

    PubMed

    Hagiya, Masami; Konagaya, Akihiko; Kobayashi, Satoshi; Saito, Hirohide; Murata, Satoshi

    2014-06-17

    CONSPECTUS: What we can call a molecular robot is a set of molecular devices such as sensors, logic gates, and actuators integrated into a consistent system. The molecular robot is supposed to react autonomously to its environment by receiving molecular signals and making decisions by molecular computation. Building such a system has long been a dream of scientists; however, despite extensive efforts, systems having all three functions (sensing, computation, and actuation) have not been realized yet. This Account introduces an ongoing research project that focuses on the development of molecular robotics funded by MEXT (Ministry of Education, Culture, Sports, Science and Technology, Japan). This 5 year project started in July 2012 and is titled "Development of Molecular Robots Equipped with Sensors and Intelligence". The major issues in the field of molecular robotics all correspond to a feedback (i.e., plan-do-see) cycle of a robotic system. More specifically, these issues are (1) developing molecular sensors capable of handling a wide array of signals, (2) developing amplification methods of signals to drive molecular computing devices, (3) accelerating molecular computing, (4) developing actuators that are controllable by molecular computers, and (5) providing bodies of molecular robots encapsulating the above molecular devices, which implement the conformational changes and locomotion of the robots. In this Account, the latest contributions to the project are reported. There are four research teams in the project that specialize on sensing, intelligence, amoeba-like actuation, and slime-like actuation, respectively. The molecular sensor team is focusing on the development of molecular sensors that can handle a variety of signals. This team is also investigating methods to amplify signals from the molecular sensors. 
The molecular intelligence team is developing molecular computers and is currently focusing on a new photochemical technology for accelerating DNA-based computations. They also introduce novel computational models behind various kinds of molecular computers necessary for designing such computers. The amoeba robot team aims at constructing amoeba-like robots. The team is trying to incorporate motor proteins, including kinesin and microtubules (MTs), for use as actuators implemented in a liposomal compartment as a robot body. They are also developing a methodology to link DNA-based computation and molecular motor control. The slime robot team focuses on the development of slime-like robots. The team is evaluating various gels, including DNA gel and BZ gel, for use as actuators, as well as the body material to disperse various molecular devices in it. They also try to control the gel actuators by DNA signals coming from molecular computers.

  18. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, Earl A.; Lipshutz, Robert J.; Morris, Macdonald S.; Winkler, James L.

    1997-01-01

    An improved set of computer tools for forming arrays. According to one aspect of the invention, a computer system is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files to design and/or generate lithographic masks.

  19. Intrinsically bent DNA in replication origins and gene promoters.

    PubMed

    Gimenes, F; Takeda, K I; Fiorini, A; Gouveia, F S; Fernandez, M A

    2008-06-24

    Intrinsically bent DNA is an alternative conformation of the DNA molecule caused by the presence of dA/dT tracts, 2 to 6 bp long, in a helical turn phase DNA or with multiple intervals of 10 to 11 bp. Other than flexibility, intrinsic bending sites induce DNA curvature in particular chromosome regions such as replication origins and promoters. Intrinsically bent DNA sites are important in initiating DNA replication, and are sometimes found near to regions associated with the nuclear matrix. Many methods have been developed to localize bent sites, for example, circular permutation, computational analysis, and atomic force microscopy. This review discusses intrinsically bent DNA sites associated with replication origins and gene promoter regions in prokaryote and eukaryote cells. We also describe methods for identifying bent DNA sites for circular permutation and computational analysis.
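
    The abstract mentions computational analysis as one way to localize bent sites. The Python sketch below covers only the simplest ingredient, locating dA/dT tracts of 2 to 6 bp; it reads "dA/dT tracts" as uninterrupted runs of A or of T, which is a simplifying assumption, and it does not check the 10 to 11 bp helical phasing described above. The function name and example sequence are hypothetical.

```python
import re

def find_tracts(sequence, min_len=2, max_len=6):
    """Return (start, tract) pairs for maximal runs of A or of T within the length bounds."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        # The greedy pattern matches maximal runs; longer runs are filtered out here.
        if len(match.group()) <= max_len:
            hits.append((match.start(), match.group()))
    return hits

tracts = find_tracts("GCAAAAGCTTGCA")
```

    A fuller analysis would go on to test whether the reported tract positions recur at roughly one helical turn, which this sketch leaves out.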

  20. COMPUTATIONAL MODELING OF SIGNALING PATHWAYS MEDIATING CELL CYCLE AND APOPTOTIC RESPONSES TO IONIZING RADIATION MEDIATED DNA DAMAGE

    EPA Science Inventory

    Demonstrated the use of a computational systems biology approach to model dose response relationships. Also discussed how the biologically motivated dose response models have only limited reference to the underlying molecular level. Discussed the integration of Computational S...

  1. RNA nanotechnology for computer design and in vivo computation

    PubMed Central

    Qiu, Meikang; Khisamutdinov, Emil; Zhao, Zhengyi; Pan, Cheryl; Choi, Jeong-Woo; Leontis, Neocles B.; Guo, Peixuan

    2013-01-01

    Molecular-scale computing has been explored since 1989 owing to the foreseeable limitation of Moore's law for silicon-based computation devices. With the potential of massive parallelism, low energy consumption and capability of working in vivo, molecular-scale computing promises a new computational paradigm. Inspired by the concepts from the electronic computer, DNA computing has realized basic Boolean functions and has progressed into multi-layered circuits. Recently, RNA nanotechnology has emerged as an alternative approach. Owing to the newly discovered thermodynamic stability of a special RNA motif (Shu et al. 2011 Nat. Nanotechnol. 6, 658–667 (doi:10.1038/nnano.2011.105)), RNA nanoparticles are emerging as another promising medium for nanodevice and nanomedicine as well as molecular-scale computing. Like DNA, RNA sequences can be designed to form desired secondary structures in a straightforward manner, but RNA is structurally more versatile and more thermodynamically stable owing to its non-canonical base-pairing, tertiary interactions and base-stacking property. A 90-nucleotide RNA can exhibit 4⁹⁰ nanostructures, and its loops and tertiary architecture can serve as a mounting dovetail that eliminates the need for external linking dowels. Its enzymatic and fluorogenic activity creates diversity in computational design. Varieties of small RNA can work cooperatively, synergistically or antagonistically to carry out computational logic circuits. The riboswitch and enzymatic ribozyme activities and its special in vivo attributes offer a great potential for in vivo computation. Unique features in transcription, termination, self-assembly, self-processing and acid resistance enable in vivo production of RNA nanoparticles that harbour various regulators for intracellular manipulation. With all these advantages, RNA computation is promising, but it is still in its infancy. Many challenges still exist. 
Collaborations between RNA nanotechnologists and computer scientists are necessary to advance this nascent technology. PMID:24000362

  2. RNA nanotechnology for computer design and in vivo computation.

    PubMed

    Qiu, Meikang; Khisamutdinov, Emil; Zhao, Zhengyi; Pan, Cheryl; Choi, Jeong-Woo; Leontis, Neocles B; Guo, Peixuan

    2013-10-13

    Molecular-scale computing has been explored since 1989 owing to the foreseeable limitation of Moore's law for silicon-based computation devices. With the potential of massive parallelism, low energy consumption and capability of working in vivo, molecular-scale computing promises a new computational paradigm. Inspired by the concepts from the electronic computer, DNA computing has realized basic Boolean functions and has progressed into multi-layered circuits. Recently, RNA nanotechnology has emerged as an alternative approach. Owing to the newly discovered thermodynamic stability of a special RNA motif (Shu et al. 2011 Nat. Nanotechnol. 6, 658-667 (doi:10.1038/nnano.2011.105)), RNA nanoparticles are emerging as another promising medium for nanodevice and nanomedicine as well as molecular-scale computing. Like DNA, RNA sequences can be designed to form desired secondary structures in a straightforward manner, but RNA is structurally more versatile and more thermodynamically stable owing to its non-canonical base-pairing, tertiary interactions and base-stacking property. A 90-nucleotide RNA can exhibit 4⁹⁰ nanostructures, and its loops and tertiary architecture can serve as a mounting dovetail that eliminates the need for external linking dowels. Its enzymatic and fluorogenic activity creates diversity in computational design. Varieties of small RNA can work cooperatively, synergistically or antagonistically to carry out computational logic circuits. The riboswitch and enzymatic ribozyme activities and its special in vivo attributes offer a great potential for in vivo computation. Unique features in transcription, termination, self-assembly, self-processing and acid resistance enable in vivo production of RNA nanoparticles that harbour various regulators for intracellular manipulation. With all these advantages, RNA computation is promising, but it is still in its infancy. Many challenges still exist. 
Collaborations between RNA nanotechnologists and computer scientists are necessary to advance this nascent technology.
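
    The figure 4⁹⁰ quoted in the abstract equals the number of distinct 90-nucleotide sequences over a four-letter alphabet; reading it that way is an assumption, since the abstract speaks of nanostructures. The short Python sketch below only works out the size of that number.

```python
ALPHABET_SIZE = 4   # A, C, G, U
LENGTH = 90         # nucleotides, as stated in the abstract

count = ALPHABET_SIZE ** LENGTH      # exact integer arithmetic
digits = len(str(count))             # number of decimal digits

print(f"4^90 has {digits} decimal digits, roughly {count:.2e}")
```

    The value is the same as 2¹⁸⁰, on the order of 10⁵⁴, which gives a sense of the design space the authors refer to.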

  3. Programmable DNA-Mediated Multitasking Processor.

    PubMed

    Shu, Jian-Jun; Wang, Qi-Wen; Yong, Kian-Yan; Shao, Fangwei; Lee, Kee Jin

    2015-04-30

    Because of DNA's appealing features as a material, including minuscule size, defined structural repeat and rigidity, programmable DNA-mediated processing is a promising computing paradigm, which employs DNA as an information storing and processing substrate to tackle computational problems. The massive parallelism of DNA hybridization exhibits great potential to improve multitasking capabilities and yield a tremendous speed-up over conventional electronic processors with stepwise signal cascade. As an example of multitasking capability, we present an in vitro programmable DNA-mediated optimal route planning processor as a functional unit embedded in contemporary navigation systems. The novel programmable DNA-mediated processor has several advantages over the existing silicon-mediated methods, such as conducting massive data storage and simultaneous processing with much less material than conventional silicon devices.

  4. Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization

    DOE PAGES

    Alexandrov, Ludmil B.; Rasmussen, Kim Ø.; Bishop, Alan R.; ...

    2017-08-29

    The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded “flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with our extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.

  5. Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alexandrov, Ludmil B.; Rasmussen, Kim Ø.; Bishop, Alan R.

    The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded “flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with our extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.

  6. WE-DE-202-01: Connecting Nanoscale Physics to Initial DNA Damage Through Track Structure Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schuemann, J.

    Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? 
We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicists’ perspective. Learning Objectives: Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels; learn about mechanism of action underlying the induction, repair and biological processing of damage to DNA and other constituents; understand how effects and processes at one biological scale impact on biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)

  7. Experimental and computational studies on the effects of valganciclovir as an antiviral drug on calf thymus DNA.

    PubMed

    Shahabadi, Nahid; Pourfoulad, Mehdi; Moghadam, Neda Hosseinpour

    2017-01-02

    The DNA-binding properties of an antiviral drug, valganciclovir (valcyte), were studied by using emission, absorption, circular dichroism, viscosity, differential pulse voltammetry, fluorescence techniques, and computational studies. The drug bound to calf thymus DNA (ct-DNA) in a groove-binding mode. The binding constant calculated from UV-vis data, K_a, is comparable to that of groove-binding drugs. Competitive fluorimetric studies with Hoechst 33258 showed that valcyte could displace the DNA-bound Hoechst 33258. The drug could not displace intercalated methylene blue from the DNA double helix. Furthermore, the detectable changes induced in the CD spectrum of ct-DNA, as well as changes in its viscosity, confirm the groove-binding mode. In addition, an integrated molecular docking was employed to further investigate the binding interactions between valcyte and calf thymus DNA.

  8. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, Earl A.; Morris, MacDonald S.; Winkler, James L.

    1999-01-05

    An improved set of computer tools for forming arrays. According to one aspect of the invention, a computer system (100) is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files (104) to design and/or generate lithographic masks (110).

  9. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, Earl A.; Morris, MacDonald S.; Winkler, James L.

    1996-01-01

    An improved set of computer tools for forming arrays. According to one aspect of the invention, a computer system (100) is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files (104) to design and/or generate lithographic masks (110).

  10. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, E.A.; Morris, M.S.; Winkler, J.L.

    1999-01-05

    An improved set of computer tools for forming arrays is disclosed. According to one aspect of the invention, a computer system is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files to design and/or generate lithographic masks. 14 figs.

  11. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, E.A.; Lipshutz, R.J.; Morris, M.S.; Winkler, J.L.

    1997-01-14

    An improved set of computer tools for forming arrays is disclosed. According to one aspect of the invention, a computer system is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files to design and/or generate lithographic masks. 14 figs.

  12. Computer-aided engineering system for design of sequence arrays and lithographic masks

    DOEpatents

    Hubbell, E.A.; Morris, M.S.; Winkler, J.L.

    1996-11-05

    An improved set of computer tools for forming arrays is disclosed. According to one aspect of the invention, a computer system is used to select probes and design the layout of an array of DNA or other polymers with certain beneficial characteristics. According to another aspect of the invention, a computer system uses chip design files to design and/or generate lithographic masks. 14 figs.

  13. Logic integration of mRNA signals by an RNAi-based molecular computer.

    PubMed

    Xie, Zhen; Liu, Siyuan John; Bleris, Leonidas; Benenson, Yaakov

    2010-05-01

    Synthetic in vivo molecular 'computers' could rewire biological processes by establishing programmable, non-native pathways between molecular signals and biological responses. Multiple molecular computer prototypes have been shown to work in simple buffered solutions. Many of those prototypes were made of DNA strands and performed computations using cycles of annealing-digestion or strand displacement. We have previously introduced RNA interference (RNAi)-based computing as a way of implementing complex molecular logic in vivo. Because it also relies on nucleic acids for its operation, RNAi computing could benefit from the tools developed for DNA systems. However, these tools must be harnessed to produce bioactive components and be adapted for harsh operating environments that reflect in vivo conditions. In a step toward this goal, we report the construction and implementation of biosensors that 'transduce' mRNA levels into bioactive, small interfering RNA molecules via RNA strand exchange in a cell-free Drosophila embryo lysate, a step beyond simple buffered environments. We further integrate the sensors with our RNAi 'computational' module to evaluate two-input logic functions on mRNA concentrations. Our results show how RNA strand exchange can expand the utility of RNAi computing and point toward the possibility of using strand exchange in a native biological setting.

  14. GUI to Facilitate Research on Biological Damage from Radiation

    NASA Technical Reports Server (NTRS)

    Cucinotta, Frances A.; Ponomarev, Artem Lvovich

    2010-01-01

    A graphical-user-interface (GUI) computer program has been developed to facilitate research on the damage caused by highly energetic particles and photons impinging on living organisms. The program brings together, into one computational workspace, computer codes that have been developed over the years, plus codes that will be developed during the foreseeable future, to address diverse aspects of radiation damage. These include codes that implement radiation-track models, codes for biophysical models of breakage of deoxyribonucleic acid (DNA) by radiation, pattern-recognition programs for extracting quantitative information from biological assays, and image-processing programs that aid visualization of DNA breaks. The radiation-track models are based on transport models of interactions of radiation with matter and solution of the Boltzmann transport equation by use of both theoretical and numerical models. The biophysical models of breakage of DNA by radiation include biopolymer coarse-grained and atomistic models of DNA, stochastic-process models of deposition of energy, and Markov-based probabilistic models of placement of double-strand breaks in DNA. The program is designed for use in the NT, 95, 98, 2000, ME, and XP variants of the Windows operating system.

  15. The generative power of weighted one-sided and regular sticker systems

    NASA Astrophysics Data System (ADS)

    Siang, Gan Yee; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod

    2014-06-01

    Sticker systems were introduced in 1998 as a DNA computing model based on the recombination behavior of DNA molecules. Sticker systems use an abstraction of the Watson-Crick complementarity principle of DNA molecules to perform computation. In this paper, the generative power of weighted one-sided sticker systems and weighted regular sticker systems is investigated. Moreover, the relationship of the families of languages generated by these two variants of sticker systems to the Chomsky hierarchy is also presented.

  16. Wormlike Chain Theory and Bending of Short DNA

    NASA Astrophysics Data System (ADS)

    Mazur, Alexey K.

    2007-05-01

    The probability distributions for bending angles in double helical DNA obtained in all-atom molecular dynamics simulations are compared with theoretical predictions. The computed distributions remarkably agree with the wormlike chain theory and qualitatively differ from predictions of the subelastic chain model. The computed data exhibit only small anomalies in the apparent flexibility of short DNA and cannot account for the recently reported AFM data. It is possible that the current atomistic DNA models miss some essential mechanisms of DNA bending on intermediate length scales. Analysis of bent DNA structures reveals, however, that the bending motion is structurally heterogeneous and directionally anisotropic on the length scales where the experimental anomalies were detected. These effects are essential for interpretation of the experimental data and they also can be responsible for the apparent discrepancy.

  17. DNA-programmed dynamic assembly of quantum dots for molecular computation.

    PubMed

    He, Xuewen; Li, Zhi; Chen, Muzi; Ma, Nan

    2014-12-22

    Despite the widespread use of quantum dots (QDs) for biosensing and bioimaging, QD-based bio-interfaceable and reconfigurable molecular computing systems have not yet been realized. DNA-programmed dynamic assembly of multi-color QDs is presented for the construction of a new class of fluorescence resonance energy transfer (FRET)-based QD computing systems. A complete set of seven elementary logic gates (OR, AND, NOR, NAND, INH, XOR, XNOR) are realized using a series of binary and ternary QD complexes operated by strand displacement reactions. The integration of different logic gates into a half-adder circuit for molecular computation is also demonstrated. This strategy is quite versatile and straightforward for logical operations and would pave the way for QD-biocomputing-based intelligent molecular diagnostics. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
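
    The half-adder mentioned in this abstract combines two of the listed gates: XOR produces the sum bit and AND produces the carry bit. A minimal software sketch of that truth table follows; it only illustrates the logic and assumes nothing about the quantum-dot or strand-displacement implementation. The function name is hypothetical.

```python
from itertools import product


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum_bit, carry_bit) for two input bits.

    sum_bit is the XOR of the inputs, carry_bit is the AND.
    """
    return a ^ b, a & b


# Print the full truth table of the half-adder.
for a, b in product((0, 1), repeat=2):
    s, c = half_adder(a, b)
    print(f"{a} + {b} -> sum={s} carry={c}")
```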

  18. A Parallel Biological Optimization Algorithm to Solve the Unbalanced Assignment Problem Based on DNA Molecular Computing.

    PubMed

    Wang, Zhaocai; Pu, Jun; Cao, Liling; Tan, Jian

    2015-10-23

    The unbalanced assignment problem (UAP) is to optimally resolve the problem of assigning n jobs to m individuals (m < n), such that minimum cost or maximum profit is obtained. It is a vitally important Non-deterministic Polynomial (NP) complete problem in operation management and applied mathematics, having numerous real life applications. In this paper, we present a new parallel DNA algorithm for solving the unbalanced assignment problem using DNA molecular operations. We reasonably design flexible-length DNA strands representing different jobs and individuals, take appropriate steps, and get the solutions of the UAP in the proper length range and O(mn) time. We extend the application of DNA molecular operations and simultaneity to simplify the complexity of the computation.

  19. WE-DE-202-02: Are Track Structure Simulations Truly Needed for Radiobiology at the Cellular and Tissue Levels?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, R.

    Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicist's perspective. Learning Objectives: (1) Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels. (2) Learn about mechanism of action underlying the induction, repair and biological processing of damage to DNA and other constituents. (3) Understand how effects and processes at one biological scale impact on biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    McMahon, S.

    Radiation therapy for the treatment of cancer has been established as a highly precise and effective way to eradicate a localized region of diseased tissue. To achieve further significant gains in the therapeutic ratio, we need to move towards biologically optimized treatment planning. To achieve this goal, we need to understand how the radiation-type dependent patterns of induced energy depositions within the cell (physics) connect via molecular, cellular and tissue reactions to treatment outcome such as tumor control and undesirable effects on normal tissue. Several computational biology approaches have been developed connecting physics to biology. Monte Carlo simulations are the most accurate method to calculate physical dose distributions at the nanometer scale; however, simulations at the DNA scale are slow and repair processes are generally not simulated. Alternative models that rely on the random formation of individual DNA lesions within one or two turns of the DNA have been shown to reproduce the clusters of DNA lesions, including single strand breaks (SSBs) and double strand breaks (DSBs), without the need for detailed track structure simulations. Efficient computational simulations of initial DNA damage induction facilitate computational modeling of DNA repair and other molecular and cellular processes. Mechanistic, multiscale models provide a useful conceptual framework to test biological hypotheses and help connect fundamental information about track structure and dosimetry at the sub-cellular level to dose-response effects on larger scales. In this symposium we will learn about the current state of the art of computational approaches estimating radiation damage at the cellular and sub-cellular scale. How can understanding the physics interactions at the DNA level be used to predict biological outcome? We will discuss if and how such calculations are relevant to advance our understanding of radiation damage and its repair, or if the underlying biological processes are too complex for a mechanistic approach. Can computer simulations be used to guide future biological research? We will debate the feasibility of explaining biology from a physicist's perspective. Learning Objectives: (1) Understand the potential applications and limitations of computational methods for dose-response modeling at the molecular, cellular and tissue levels. (2) Learn about mechanism of action underlying the induction, repair and biological processing of damage to DNA and other constituents. (3) Understand how effects and processes at one biological scale impact on biological processes and outcomes on other scales. J. Schuemann, NCI/NIH grants; S. McMahon, Funding: European Commission FP7 (grant EC FP7 MC-IOF-623630)

  1. Mapping the Space of Genomic Signatures

    PubMed Central

    Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.

    2015-01-01

    We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
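
    The Chaos Game Representation named in this abstract can be sketched in a few lines. The version below is a minimal illustration, not the authors' code: the corner assignment (A, C, G, T on the unit square) is one common convention and is an assumption here, the function names are hypothetical, and the DSSIM image distance and MDS steps are not implemented. Each base moves the current point halfway toward its corner; binning the points on a 2^k x 2^k grid gives k-mer occurrence counts.

```python
# Corner assignment is an assumed (commonly used) convention.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos-game points for seq, skipping non-ACGT symbols."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Count CGR points in a 2^k x 2^k grid (one cell per k-mer)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


print(cgr_points("ACGT"))
print(cgr_grid("ACGTACGTTTGA", 2))
```

    Two sequences could then be compared by any image or vector distance between their grids; the choice of distance is what the abstract's DSSIM step supplies.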

  2. Covalently bound DNA on naked iron oxide nanoparticles: Intelligent colloidal nano-vector for cell transfection.

    PubMed

    Magro, Massimiliano; Martinello, Tiziana; Bonaiuto, Emanuela; Gomiero, Chiara; Baratella, Davide; Zoppellaro, Giorgio; Cozza, Giorgio; Patruno, Marco; Zboril, Radek; Vianello, Fabio

    2017-11-01

    In contrast to common coated iron oxide nanoparticles, novel naked surface active maghemite nanoparticles (SAMNs) can covalently bind DNA. Plasmid (pDNA) harboring the coding gene for GFP was directly chemisorbed onto SAMNs, leading to a novel DNA nanovector (SAMN@pDNA). The spontaneous internalization of SAMN@pDNA into cells was compared with an extensively studied fluorescent SAMN derivative (SAMN@RITC). Moreover, the transfection efficiency of SAMN@pDNA was evaluated and explained by a computational model. SAMN@pDNA was prepared and characterized by spectroscopic and computational methods, and molecular dynamics simulation. The size and hydrodynamic properties of SAMN@pDNA and SAMN@RITC were studied by transmission electron microscopy, light scattering and zeta-potential. The two nanomaterials were tested by confocal scanning microscopy on equine peripheral blood-derived mesenchymal stem cells (ePB-MSCs) and GFP expression by SAMN@pDNA was determined. Nanomaterials characterized by similar hydrodynamic properties were successfully internalized and stored into mesenchymal stem cells. Transfection by SAMN@pDNA occurred and GFP expression was higher than with the lipofectamine procedure, even in the absence of an external magnetic field. A computational model clarified that transfection efficiency can be ascribed to DNA availability inside cells. Direct covalent binding of DNA on naked magnetic nanoparticles led to an extremely robust gene delivery tool. Hydrodynamic and chemical-physical properties of SAMN@pDNA were responsible for the successful uptake by cells and for the efficiency of GFP gene transfection. SAMNs are characterized by colloidal stability, excellent cell uptake, persistence in the host cells, and low toxicity, and are proposed as novel intelligent DNA nanovectors for efficient cell transfection. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Modelling of DNA-protein recognition

    NASA Technical Reports Server (NTRS)

    Rein, R.; Garduno, R.; Colombano, S.; Nir, S.; Haydock, K.; Macelroy, R. D.

    1980-01-01

    Computer model-building procedures using stereochemical principles together with theoretical energy calculations appear to be, at this stage, the most promising route toward the elucidation of DNA-protein binding schemes and recognition principles. A review of models and bonding principles is conducted and approaches to modeling are considered, taking into account possible di-hydrogen-bonding schemes between a peptide and a base (or a base pair) of a double-stranded nucleic acid in the major groove, aspects of computer graphic modeling, and a search for isogeometric helices. The energetics of recognition complexes is discussed and several models for peptide DNA recognition are presented.

  4. Computation of marginal distributions of peak-heights in electropherograms for analysing single source and mixture STR DNA samples.

    PubMed

    Cowell, Robert G

    2018-05-04

    Current models for single source and mixture samples, and probabilistic genotyping software based on them used for analysing STR electropherogram data, assume simple probability distributions, such as the gamma distribution, to model the allelic peak height variability given the initial amount of DNA prior to PCR amplification. Here we illustrate how amplicon number distributions, for a model of the process of sample DNA collection and PCR amplification, may be efficiently computed by evaluating probability generating functions using discrete Fourier transforms. Copyright © 2018 Elsevier B.V. All rights reserved.
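
    The technique this abstract names, recovering a probability distribution by evaluating its probability generating function (PGF) at roots of unity and applying a discrete Fourier transform, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's model: the one-cycle PGF (each molecule is duplicated with a fixed efficiency) is a textbook branching-process simplification, and the function names are hypothetical. A plain O(N^2) transform is used to keep the sketch dependency-free.

```python
import cmath


def pmf_from_pgf(pgf, n_points):
    """Recover P(X = k) for k = 0..n_points-1 from a PGF via an inverse DFT.

    Assumes X is supported on 0..n_points-1; otherwise the tail aliases.
    """
    # Evaluate the PGF at the n_points-th roots of unity.
    roots = [cmath.exp(2j * cmath.pi * j / n_points) for j in range(n_points)]
    values = [pgf(s) for s in roots]
    pmf = []
    for k in range(n_points):
        total = sum(
            values[j] * cmath.exp(-2j * cmath.pi * j * k / n_points)
            for j in range(n_points)
        )
        pmf.append((total / n_points).real)
    return pmf


def pcr_cycle_pgf(efficiency):
    """PGF of the copies descending from one molecule after one cycle.

    Toy assumption: the molecule is duplicated with probability
    `efficiency`, otherwise it is left as a single copy.
    """
    return lambda s: (1.0 - efficiency) * s + efficiency * s * s


cycle = pcr_cycle_pgf(0.5)
two_cycles = lambda s: cycle(cycle(s))  # composing PGFs chains the cycles
pmf = pmf_from_pgf(two_cycles, 5)       # support is 0..4 copies
print([round(p, 3) for p in pmf])
```

    Composing the cycle PGF with itself models successive cycles, so the amplicon-number distribution after many cycles needs only function evaluations plus one transform rather than an explicit convolution per cycle.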

  5. The Ins and Outs of DNA Fingerprinting the Infectious Fungi

    PubMed Central

    Soll, David R.

    2000-01-01

    DNA fingerprinting methods have evolved as major tools in fungal epidemiology. However, no single method has emerged as the method of choice, and some methods perform better than others at different levels of resolution. In this review, requirements for an effective DNA fingerprinting method are proposed and procedures are described for testing the efficacy of a method. In light of the proposed requirements, the most common methods now being used to DNA fingerprint the infectious fungi are described and assessed. These methods include restriction fragment length polymorphisms (RFLP), RFLP with hybridization probes, randomly amplified polymorphic DNA and other PCR-based methods, electrophoretic karyotyping, and sequencing-based methods. Procedures for computing similarity coefficients, generating phylogenetic trees, and testing the stability of clusters are then described. To facilitate the analysis of DNA fingerprinting data, computer-assisted methods are described. Finally, the problems inherent in the collection of test and control isolates are considered, and DNA fingerprinting studies of strain maintenance during persistent or recurrent infections, microevolution in infecting strains, and the origin of nosocomial infections are assessed in light of the preceding discussion of the ins and outs of DNA fingerprinting. The intent of this review is to generate an awareness of the need to verify the efficacy of each DNA fingerprinting method for the level of genetic relatedness necessary to answer the epidemiological question posed, to use quantitative methods to analyze DNA fingerprint data, to use computer-assisted DNA fingerprint analysis systems to analyze data, and to file data in a form that can be used in the future for retrospective and comparative studies. PMID:10756003

  6. A new fast algorithm for solving the minimum spanning tree problem based on DNA molecules computation.

    PubMed

    Wang, Zhaocai; Huang, Dongmei; Meng, Huajun; Tang, Chengpei

    2013-10-01

    The minimum spanning tree (MST) problem is to find a minimum-weight connected subset of edges containing all the vertices of a given undirected graph. It is a vitally important NP-complete problem in graph theory and applied mathematics, having numerous real life applications. Moreover in previous studies, DNA molecular operations usually were used to solve NP-complete head-to-tail path search problems, rarely for NP-hard problems with multi-lateral path solutions, such as the minimum spanning tree problem. In this paper, we present a new fast DNA algorithm for solving the MST problem using DNA molecular operations. For an undirected graph with n vertices and m edges, we reasonably design flexible length DNA strands representing the vertices and edges, take appropriate steps and get the solutions of the MST problem in proper length range and O(3m+n) time complexity. We extend the application of DNA molecular operations and simultaneity to simplify the complexity of the computation. Results of computer simulative experiments show that the proposed method updates some of the best known values with very short time and that the proposed method provides a better performance with solution accuracy over existing algorithms. Copyright © 2013 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  7. Shuffle Optimizer: A Program to Optimize DNA Shuffling for Protein Engineering.

    PubMed

    Milligan, John N; Garry, Daniel J

    2017-01-01

    DNA shuffling is a powerful tool to develop libraries of variants for protein engineering. Here, we present a protocol to use our freely available and easy-to-use computer program, Shuffle Optimizer. Shuffle Optimizer is written in the Python computer language and increases the nucleotide homology between two pieces of DNA desired to be shuffled together without changing the amino acid sequence. In addition we also include sections on optimal primer design for DNA shuffling and library construction, a small-volume ultrasonicator method to create sheared DNA, and finally a method to reassemble the sheared fragments and recover and clone the library. The Shuffle Optimizer program and these protocols will be useful to anyone desiring to perform any of the nucleotide homology-dependent shuffling methods.

  8. The number of reduced alignments between two DNA sequences

    PubMed Central

    2014-01-01

    Background: In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results: We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions: A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification: Primary 92B05, 33C20, secondary 39A14, 65Q30. PMID:24684679
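
    For orientation, the classical count of total alignments between sequences of lengths m and n can be computed with a three-term recurrence: the last column of an alignment is either a pair of aligned letters, a gap in the first sequence, or a gap in the second. The resulting values are the Delannoy numbers. The sketch below shows only this standard recurrence; it does not reproduce the paper's explicit formulas or its count of reduced alignments, and the function name is hypothetical.

```python
def count_total_alignments(m: int, n: int) -> int:
    """Number of alignments of two sequences of lengths m and n.

    Uses f(i, j) = f(i-1, j) + f(i, j-1) + f(i-1, j-1), with
    f(i, 0) = f(0, j) = 1 (only gaps can be placed against an empty prefix).
    """
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]


for length in (1, 2, 3, 10):
    print(length, count_total_alignments(length, length))
```

    The count grows exponentially with sequence length, which is why closed-form or otherwise computable expressions are of interest.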

  9. The effect on cadaver blood DNA identification by the use of targeted and whole body post-mortem computed tomography angiography.

    PubMed

    Rutty, Guy N; Barber, Jade; Amoroso, Jasmin; Morgan, Bruno; Graham, Eleanor A M

    2013-12-01

    Post-mortem computed tomography angiography (PMCTA) involves the injection of contrast agents. This could have both a dilution effect on biological fluid samples and could affect subsequent post-contrast analytical laboratory processes. We undertook a small sample study of 10 targeted and 10 whole body PMCTA cases to consider whether or not these two methods of PMCTA could affect post-PMCTA cadaver blood based DNA identification. We used standard methodology to examine DNA from blood samples obtained before and after the PMCTA procedure. We illustrate that neither of these PMCTA methods had an effect on the alleles called following short tandem repeat based DNA profiling, and therefore the ability to undertake post-PMCTA blood based DNA identification.

  10. A Parallel Biological Optimization Algorithm to Solve the Unbalanced Assignment Problem Based on DNA Molecular Computing

    PubMed Central

    Wang, Zhaocai; Pu, Jun; Cao, Liling; Tan, Jian

    2015-01-01

    The unbalanced assignment problem (UAP) is to optimally resolve the problem of assigning n jobs to m individuals (m < n), such that minimum cost or maximum profit is obtained. It is a vitally important Non-deterministic Polynomial (NP) complete problem in operation management and applied mathematics, having numerous real life applications. In this paper, we present a new parallel DNA algorithm for solving the unbalanced assignment problem using DNA molecular operations. We reasonably design flexible-length DNA strands representing different jobs and individuals, take appropriate steps, and get the solutions of the UAP in the proper length range and O(mn) time. We extend the application of DNA molecular operations and simultaneity to simplify the complexity of the computation. PMID:26512650

  11. Harnessing vision for computation.

    PubMed

    Changizi, Mark

    2008-01-01

    Might it be possible to harness the visual system to carry out artificial computations, somewhat akin to how DNA has been harnessed to carry out computation? I provide the beginnings of a research programme attempting to do this. In particular, new techniques are described for building 'visual circuits' (or 'visual software') using wire, NOT, OR, and AND gates in a visual modality such that our visual system acts as 'visual hardware' computing the circuit, and generating a resultant perception which is the output.

  12. Searching for SNPs with cloud computing

    PubMed Central

    2009-01-01

    As DNA sequencing outpaces improvements in computer speed, there is a critical need to accelerate tasks like alignment and SNP calling. Crossbow is a cloud-computing software tool that combines the aligner Bowtie and the SNP caller SOAPsnp. Executing in parallel using Hadoop, Crossbow analyzes data comprising 38-fold coverage of the human genome in three hours using a 320-CPU cluster rented from a cloud computing service for about $85. Crossbow is available from http://bowtie-bio.sourceforge.net/crossbow/. PMID:19930550

  13. Computer-aided design of nano-filter construction using DNA self-assembly

    NASA Astrophysics Data System (ADS)

    Mohammadzadegan, Reza; Mohabatkar, Hassan

    2007-01-01

    Computer-aided design plays a fundamental role in both top-down and bottom-up nano-system fabrication. This paper presents a bottom-up nano-filter patterning process based on DNA self-assembly. In this study we designed a new method to construct fully designed nano-filters with the pores between 5 nm and 9 nm in diameter. Our calculations illustrated that by constructing such a nano-filter we would be able to separate many molecules.

  14. geneGIS: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

    DTIC Science & Technology

    2012-09-30

    computational tools provide the ability to display, browse, select, filter and summarize spatio-temporal relationships of these individual-based...her research assistant at Esri, Shaun Walbridge, and members of the Marine Mammal Institute (MMI), including Tomas Follet and Debbie Steel. This...Genomics Laboratory, MMI, OSU. 4 As part of the geneGIS initiative, these SPLASH photo-identification records and the geneSPLASH DNA profiles

  15. Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.

    PubMed

    Hua, Wei; Wang, Jiasong; Zhao, Jian

    2014-01-01

    Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper suggests the concepts of discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence, and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has an important feature of 3-base periodicity, which is widely used for DNA sequence analysis by the technique of discrete Fourier transform. It is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion for the analysis, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be only identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have analyzed theoretically and concluded that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that the technique of using the discrete Ramanujan spectrum for classifying different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
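
    Two ingredients of this abstract are easy to sketch: the Voss representation (four 0/1 indicator sequences, one per base) with its Fourier power at frequency index N/3, and the Ramanujan sum c_q(n). The code below is an illustrative sketch with hypothetical function names; it does not implement the paper's discrete Ramanujan transform or its signal-to-noise criterion, whose exact definitions are not given in the abstract.

```python
import cmath
from math import cos, gcd, pi


def voss(seq):
    """Voss representation: one 0/1 indicator sequence per base."""
    seq = seq.upper()
    return {b: [1 if s == b else 0 for s in seq] for b in "ACGT"}


def power_at(seq, k):
    """Summed DFT power of the four indicator sequences at index k."""
    n = len(seq)
    total = 0.0
    for indicator in voss(seq).values():
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(indicator)
        )
        total += abs(coeff) ** 2
    return total


def ramanujan_sum(q, n):
    """c_q(n): sum of cos(2*pi*a*n/q) over 1 <= a <= q with gcd(a, q) = 1."""
    value = sum(cos(2 * pi * a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(value)  # the sum is always an integer


seq = "ATGATGATGATG"  # artificial sequence with perfect period 3
n = len(seq)
print(power_at(seq, n // 3), power_at(seq, 1))
print([ramanujan_sum(3, t) for t in range(1, 7)])
```

    On the artificial period-3 sequence, all of the non-constant power sits at index N/3, which is the spike that coding-region detectors look for; c_3(n) takes the value 2 when 3 divides n and -1 otherwise.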

  16. DCJ-indel and DCJ-substitution distances with distinct operation costs

    PubMed Central

    2013-01-01

    Background: Classical approaches to compute the genomic distance are usually limited to genomes with the same content and take into consideration only rearrangements that change the organization of the genome (i.e. positions and orientation of pieces of DNA, number and type of chromosomes, etc.), such as inversions, translocations, fusions and fissions. These operations are generically represented by the double-cut and join (DCJ) operation. The distance between two genomes, in terms of number of DCJ operations, can be computed in linear time. In order to handle genomes with distinct contents, also insertions and deletions of fragments of DNA – named indels – must be allowed. More powerful than an indel is a substitution of a fragment of DNA by another fragment of DNA. Indels and substitutions are called content-modifying operations. It has been shown that both the DCJ-indel and the DCJ-substitution distances can also be computed in linear time, assuming that the same cost is assigned to any DCJ or content-modifying operation. Results: In the present study we extend the DCJ-indel and the DCJ-substitution models, considering that the content-modifying cost is distinct from and upper bounded by the DCJ cost, and show that the distance in both models can still be computed in linear time. Although the triangular inequality can be disrupted in both models, we also show how to efficiently fix this problem a posteriori. PMID:23879938

  17. Interaction of anthraquinone anti-cancer drugs with DNA: Experimental and computational quantum chemical study

    NASA Astrophysics Data System (ADS)

    Al-Otaibi, Jamelah S.; Teesdale Spittle, Paul; El Gogary, Tarek M.

    2017-01-01

    Anthraquinones form the basis of several anticancer drugs. Anthraquinone anticancer drugs carry out their cytotoxic activities through their interaction with DNA and inhibition of topoisomerase II activity. Anthraquinones (AQ4 and AQ4H) were synthesized and studied along with 1,4-DAAQ by computational and experimental tools. The purpose of this study is to shed more light on the mechanism of interaction between anthraquinone DNA affinic agents and different types of DNA. This study will yield information useful for drug design and development. Molecular structures were optimized using DFT B3LYP/6-31 + G(d). Depending on intramolecular hydrogen bonding interactions two conformers of AQ4 were detected and computed as 25.667 kcal/mol apart. Molecular reactivity of the anthraquinone compounds was explored using global and condensed descriptors (electrophilicity and Fukui functions). Molecular docking studies for the inhibition of CDK2 and DNA binding were carried out to explore the anti-cancer potency of these drugs. NMR and UV-VIS electronic absorption spectra of anthraquinones/DNA were investigated at the physiological pH. The interaction of the three anthraquinones (AQ4, AQ4H and 1,4-DAAQ) was studied with three DNA types (calf thymus DNA, Poly[dA].Poly[dT] and Poly[dG].Poly[dC]). NMR study shows a qualitative pattern of drug/DNA interaction in terms of band shift and broadening. UV-VIS electronic absorption spectra were employed to measure the affinity constants of drug/DNA binding using Scatchard analysis.

  18. Electrochemical sensor for multiplex screening of genetically modified DNA: identification of biotech crops by logic-based biomolecular analysis.

    PubMed

    Liao, Wei-Ching; Chuang, Min-Chieh; Ho, Ja-An Annie

    2013-12-15

    Genetic modification (GM), one of the modern biomolecular engineering technologies, has been deemed a profitable strategy to fight global starvation. Yet rapid and reliable analytical methods for evaluating the quality and potential risk of the resulting GM products are lacking. We herein present a biomolecular analytical system constructed with distinct biochemical activities to expedite the computational detection of genetically modified organisms (GMOs). The computational mechanism provides an alternative to the complex procedures commonly involved in the screening of GMOs. Given that the bioanalytical system is capable of processing promoter, coding and species genes, affirmative interpretations succeed in identifying a specified GM event both electrochemically and optically. The biomolecular computational assay detects genetically modified DNA below the sub-nanomolar level and is free of interference from abundant coexisting non-GM DNA. Furthermore, the bioanalytical system can be extended to an array format for multiplex screening against various GM events. Such a biomolecular computational assay and biosensor holds great promise for rapid, cost-effective, and high-fidelity screening of GMOs.

  19. Fast parallel molecular algorithms for DNA-based computation: solving the elliptic curve discrete logarithm problem over GF(2^n).

    PubMed

    Li, Kenli; Zou, Shuting; Xv, Jin

    2008-01-01

    Elliptic curve cryptographic algorithms convert input data to unrecognizable encryption and the unrecognizable data back again into its original decrypted form. The security of this form of encryption hinges on the enormous difficulty that is required to solve the elliptic curve discrete logarithm problem (ECDLP), especially over GF(2^n), n ∈ Z+. This paper describes an effective method to find solutions to the ECDLP by means of a molecular computer. We propose that this research accomplishment would represent a breakthrough for applied biological computation and this paper demonstrates that in principle this is possible. Three DNA-based algorithms: a parallel adder, a parallel multiplier, and a parallel inverse over GF(2^n) are described. The biological operation time of all of these algorithms is polynomial with respect to n. Considering this analysis, cryptography using a public key might be less secure. In this respect, a principal contribution of this paper is to provide enhanced evidence of the potential of molecular computing to tackle such ambitious computations.

  20. Fast Parallel Molecular Algorithms for DNA-Based Computation: Solving the Elliptic Curve Discrete Logarithm Problem over GF(2^n)

    PubMed Central

    Li, Kenli; Zou, Shuting; Xv, Jin

    2008-01-01

    Elliptic curve cryptographic algorithms convert input data to unrecognizable encryption and the unrecognizable data back again into its original decrypted form. The security of this form of encryption hinges on the enormous difficulty that is required to solve the elliptic curve discrete logarithm problem (ECDLP), especially over GF(2^n), n ∈ Z+. This paper describes an effective method to find solutions to the ECDLP by means of a molecular computer. We propose that this research accomplishment would represent a breakthrough for applied biological computation and this paper demonstrates that in principle this is possible. Three DNA-based algorithms: a parallel adder, a parallel multiplier, and a parallel inverse over GF(2^n) are described. The biological operation time of all of these algorithms is polynomial with respect to n. Considering this analysis, cryptography using a public key might be less secure. In this respect, a principal contribution of this paper is to provide enhanced evidence of the potential of molecular computing to tackle such ambitious computations. PMID:18431451
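
    The three operations named in the abstract (addition, multiplication and inversion over GF(2^n)) have simple conventional counterparts, sketched here for orientation. This is ordinary software arithmetic on bit-packed polynomials, not the DNA-based procedure of the paper. The function names are illustrative, and the modulus in the usage note is the AES field polynomial for GF(2^8), chosen only because its worked values are well known.

```python
def gf2n_add(a, b):
    """Addition in GF(2^n): coefficient-wise XOR of the two polynomials."""
    return a ^ b


def gf2n_mul(a, b, modulus, n):
    """Multiply a and b (both below 2^n) modulo an irreducible polynomial.

    `modulus` includes the leading x^n term, e.g. 0x11B for
    x^8 + x^4 + x^3 + x + 1.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a        # add the current shifted copy of a
        b >>= 1
        a <<= 1                # multiply a by x
        if (a >> n) & 1:
            a ^= modulus       # reduce once the degree reaches n
    return result


def gf2n_inv(a, modulus, n):
    """Inverse via a^(2^n - 2), valid for any nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    result, base, exponent = 1, a, (1 << n) - 2
    while exponent:
        if exponent & 1:
            result = gf2n_mul(result, base, modulus, n)
        base = gf2n_mul(base, base, modulus, n)
        exponent >>= 1
    return result
```

    For instance, `gf2n_mul(0x57, 0x83, 0x11B, 8)` evaluates to `0xC1`, the worked example given in the AES specification. Multiplication takes O(n) shift-and-XOR steps and inversion O(n) multiplications, which is the conventional analogue of the polynomial operation counts the abstract reports.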

  1. Fast algorithms for computing phylogenetic divergence time.

    PubMed

    Crosby, Ralph W; Williams, Tiffani L

    2017-12-06

    The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349-taxa dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.

  2. Solving a Hamiltonian Path Problem with a bacterial computer

    PubMed Central

    Baumgardner, Jordan; Acker, Karen; Adefuye, Oyinade; Crowley, Samuel Thomas; DeLoache, Will; Dickson, James O; Heard, Lane; Martens, Andrew T; Morton, Nickolaus; Ritter, Michelle; Shoecraft, Amber; Treece, Jessica; Unzicker, Matthew; Valencia, Amanda; Waters, Mike; Campbell, A Malcolm; Heyer, Laurie J; Poet, Jeffrey L; Eckdahl, Todd T

    2009-01-01

    Background The Hamiltonian Path Problem asks whether there is a route in a directed graph from a beginning node to an ending node, visiting each node exactly once. The Hamiltonian Path Problem is NP-complete, achieving surprising computational complexity with modest increases in size. This challenge has inspired researchers to broaden the definition of a computer. DNA computers have been developed that solve NP-complete problems. Bacterial computers can be programmed by constructing genetic circuits to execute an algorithm that is responsive to the environment and whose result can be observed. Each bacterium can examine a solution to a mathematical problem and billions of them can explore billions of possible solutions. Bacterial computers can be automated, made responsive to selection, and reproduce themselves so that more processing capacity is applied to problems over time. Results We programmed bacteria with a genetic circuit that enables them to evaluate all possible paths in a directed graph in order to find a Hamiltonian path. We encoded a three-node directed graph as DNA segments that were autonomously shuffled randomly inside bacteria by a Hin/hixC recombination system we previously adapted from Salmonella typhimurium for use in Escherichia coli. We represented nodes in the graph as linked halves of two different genes encoding red or green fluorescent proteins. Bacterial populations displayed phenotypes that reflected random ordering of edges in the graph. Individual bacterial clones that found a Hamiltonian path reported their success by fluorescing both red and green, resulting in yellow colonies. We used DNA sequencing to verify that the yellow phenotype resulted from genotypes that represented Hamiltonian path solutions, demonstrating that our bacterial computer functioned as expected. Conclusion We successfully designed, constructed, and tested a bacterial computer capable of finding a Hamiltonian path in a three-node directed graph. This proof-of-concept experiment demonstrates that bacterial computing is a new way to address NP-complete problems using the inherent advantages of genetic systems. The results of our experiments also validate synthetic biology as a valuable approach to biological engineering. We designed and constructed basic parts, devices, and systems using synthetic biology principles of standardization and abstraction. PMID:19630940
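
    The search that the bacterial population carries out in parallel can be mirrored in software by testing every ordering of the nodes, as in the minimal sketch below. The node labels and the edge list in the usage note are hypothetical; the abstract does not specify the edges of the three-node graph that was encoded.

```python
from itertools import permutations


def hamiltonian_paths(nodes, edges, start, end):
    """Return every ordering of `nodes` that is a directed Hamiltonian path.

    Brute force over all orderings: fine for three nodes, factorial in
    general, which is why the problem motivates massively parallel search.
    """
    edge_set = set(edges)
    found = []
    for order in permutations(nodes):
        if order[0] != start or order[-1] != end:
            continue
        if all((u, v) in edge_set for u, v in zip(order, order[1:])):
            found.append(order)
    return found
```

    Calling `hamiltonian_paths(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")], "A", "C")` keeps only the ordering that uses the edges A to B and B to C; the direct edge A to C skips node B and is rejected.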

  3. Computational Micromodel for Epigenetic Mechanisms

    PubMed Central

    Raghavan, Karthika; Ruskin, Heather J.; Perrin, Dimitri; Goasmat, Francois; Burns, John

    2010-01-01

    Characterization of the epigenetic profile of humans since the initial breakthrough on the human genome project has strongly established the key role of histone modifications and DNA methylation. These dynamic elements interact to determine the normal level of expression or methylation status of the constituent genes in the genome. Recently, considerable evidence has been put forward to demonstrate that environmental stress implicitly alters epigenetic patterns causing imbalance that can lead to cancer initiation. This chain of consequences has motivated attempts to computationally model the influence of histone modification and DNA methylation in gene expression and investigate their intrinsic interdependency. In this paper, we explore the relation between DNA methylation and transcription and characterize in detail the histone modifications for specific DNA methylation levels using a stochastic approach. PMID:21152421

  4. Computational and experimental analysis of DNA shuffling

    PubMed Central

    Maheshri, Narendra; Schaffer, David V.

    2003-01-01

    We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764

  5. Programmable energy landscapes for kinetic control of DNA strand displacement.

    PubMed

    Machinek, Robert R F; Ouldridge, Thomas E; Haley, Natalie E C; Bath, Jonathan; Turberfield, Andrew J

    2014-11-10

    DNA is used to construct synthetic systems that sense, actuate, move and compute. The operation of many dynamic DNA devices depends on toehold-mediated strand displacement, by which one DNA strand displaces another from a duplex. Kinetic control of strand displacement is particularly important in autonomous molecular machinery and molecular computation, in which non-equilibrium systems are controlled through rates of competing processes. Here, we introduce a new method based on the creation of mismatched base pairs as kinetic barriers to strand displacement. Reaction rate constants can be tuned across three orders of magnitude by altering the position of such a defect without significantly changing the stabilities of reactants or products. By modelling reaction free-energy landscapes, we explore the mechanistic basis of this control mechanism. We also demonstrate that oxDNA, a coarse-grained model of DNA, is capable of accurately predicting and explaining the impact of mismatches on displacement kinetics.

  6. Testing the Use of Implicit Solvent in the Molecular Dynamics Modelling of DNA Flexibility

    NASA Astrophysics Data System (ADS)

    Mitchell, J.; Harris, S.

    DNA flexibility controls packaging, looping and in some cases sequence specific protein binding. Molecular dynamics simulations carried out with a computationally efficient implicit solvent model are potentially a powerful tool for studying larger DNA molecules than can be currently simulated when water and counterions are represented explicitly. In this work we compare DNA flexibility at the base pair step level modelled using an implicit solvent model to that previously determined from explicit solvent simulations and database analysis. Although much of the sequence dependent behaviour is preserved in implicit solvent, the DNA is considerably more flexible when the approximate model is used. In addition we test the ability of the implicit solvent to model stress induced DNA disruptions by simulating a series of DNA minicircle topoisomers which vary in size and superhelical density. When compared with previously run explicit solvent simulations, we find that while the levels of DNA denaturation are similar using both computational methodologies, the specific structural form of the disruptions is different.

  7. Exercises in molecular computing.

    PubMed

    Stojanovic, Milan N; Stefanovic, Darko; Rudchenko, Sergei

    2014-06-17

    CONSPECTUS: The successes of electronic digital logic have transformed every aspect of human life over the last half-century. The word "computer" now signifies a ubiquitous electronic device, rather than a human occupation. Yet evidently humans, large assemblies of molecules, can compute, and it has been a thrilling challenge to develop smaller, simpler, synthetic assemblies of molecules that can do useful computation. When we say that molecules compute, what we usually mean is that such molecules respond to certain inputs, for example, the presence or absence of other molecules, in a precisely defined but potentially complex fashion. The simplest way for a chemist to think about computing molecules is as sensors that can integrate the presence or absence of multiple analytes into a change in a single reporting property. Here we review several forms of molecular computing developed in our laboratories. When we began our work, combinatorial approaches to using DNA for computing were used to search for solutions to constraint satisfaction problems. We chose to work instead on logic circuits, building bottom-up from units based on catalytic nucleic acids, focusing on DNA secondary structures in the design of individual circuit elements, and reserving the combinatorial opportunities of DNA for the representation of multiple signals propagating in a large circuit. Such circuit design directly corresponds to the intuition about sensors transforming the detection of analytes into reporting properties. While this approach was unusual at the time, it has been adopted since by other groups working on biomolecular computing with different nucleic acid chemistries. We created logic gates by modularly combining deoxyribozymes (DNA-based enzymes cleaving or combining other oligonucleotides), in the role of reporting elements, with stem-loops as input detection elements. For instance, a deoxyribozyme that normally exhibits an oligonucleotide substrate recognition region is modified such that a stem-loop closes onto the substrate recognition region, making it unavailable for the substrate and thus rendering the deoxyribozyme inactive. But a conformational change can then be induced by an input oligonucleotide, complementary to the loop, to open the stem, allow the substrate to bind, and allow its cleavage to proceed, which is eventually reported via fluorescence. In this Account, several designs of this form are reviewed, along with their application in the construction of large circuits that exhibited complex logical and temporal relationships between the inputs and the outputs. Intelligent (in the sense of being capable of nontrivial information processing) theranostic (therapy + diagnostic) applications have always been the ultimate motivation for developing computing (i.e., decision-making) circuits, and we review our experiments with logic-gate elements bound to cell surfaces that evaluate the proximal presence of multiple markers on lymphocytes.

  8. Optimization of the molecular dynamics method for simulations of DNA and ion transport through biological nanopores.

    PubMed

    Wells, David B; Bhattacharya, Swati; Carr, Rogan; Maffeo, Christopher; Ho, Anthony; Comer, Jeffrey; Aksimentiev, Aleksei

    2012-01-01

    Molecular dynamics (MD) simulations have become a standard method for the rational design and interpretation of experimental studies of DNA translocation through nanopores. The MD method, however, offers a multitude of algorithms, parameters, and other protocol choices that can affect the accuracy of the resulting data as well as computational efficiency. In this chapter, we examine the most popular choices offered by the MD method, seeking an optimal set of parameters that enable the most computationally efficient and accurate simulations of DNA and ion transport through biological nanopores. In particular, we examine the influence of short-range cutoff, integration timestep and force field parameters on the temperature and concentration dependence of bulk ion conductivity, ion pairing, ion solvation energy, DNA structure, DNA-ion interactions, and the ionic current through a nanopore.

  9. Stochastic modelling, Bayesian inference, and new in vivo measurements elucidate the debated mtDNA bottleneck mechanism

    PubMed Central

    Johnston, Iain G; Burgstaller, Joerg P; Havlicek, Vitezslav; Kolbe, Thomas; Rülicke, Thomas; Brem, Gottfried; Poulton, Jo; Jones, Nick S

    2015-01-01

    Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian computation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, efficacy of clinical sampling strategies, and effects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. DOI: http://dx.doi.org/10.7554/eLife.07464.001 PMID:26035426
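
    One ingredient the abstract finds support for, binomial partitioning of mtDNAs at cell division, can be sketched in a few lines. This is an illustration of that single step only, under the assumption that each molecule goes to either daughter with probability one half, independently of the others. It leaves out random turnover and every other part of the authors' model, and the copy numbers in the usage note are hypothetical.

```python
import random


def partition_cell(mutant, wild, rng):
    """Split one cell's mtDNA copies between two daughters.

    Each molecule independently goes to the first daughter with
    probability 0.5, so each count is binomially distributed.
    Returns ((mutant1, wild1), (mutant2, wild2)).
    """
    mutant_first = sum(1 for _ in range(mutant) if rng.random() < 0.5)
    wild_first = sum(1 for _ in range(wild) if rng.random() < 0.5)
    first = (mutant_first, wild_first)
    second = (mutant - mutant_first, wild - wild_first)
    return first, second


def heteroplasmy(mutant, wild):
    """Fraction of mtDNA copies that carry the mutant type."""
    total = mutant + wild
    if total == 0:
        raise ValueError("cell has no mtDNA copies")
    return mutant / total
```

    Running `partition_cell(50, 150, random.Random(0))` conserves the total copy number across the two daughters while their heteroplasmies drift away from the parental value of 0.25. The drift is larger when the copy number is smaller, which is the intuition behind a bottleneck.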

  10. Computational Nanoelectronics: Applications to DNA, Carbon Nanotubes and Nanotransistors

    NASA Technical Reports Server (NTRS)

    Anantram, M. P.; Svizhenko, Alexei; Govindan, T. R.; Walch, S.; Mehrez, H.

    2003-01-01

    The topics covered by the panels of this viewgraph presentation include phonon scattering, layered structures, DNA as a device, the influence of twist and rise in the DNA molecule, counter-ions, conductance versus length, and intrinsic resonant tunneling.

  11. Improved programs for DNA and protein sequence analysis on the IBM personal computer and other standard computer systems.

    PubMed Central

    Mount, D W; Conrad, B

    1986-01-01

    We have previously described programs for a variety of types of sequence analysis (1-4). These programs have now been integrated into a single package. They are written in the standard C programming language and run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The programs are widely distributed and may be obtained from the authors as described below. PMID:3753780

  12. Logic integration of mRNA signals by an RNAi-based molecular computer

    PubMed Central

    Xie, Zhen; Liu, Siyuan John; Bleris, Leonidas; Benenson, Yaakov

    2010-01-01

    Synthetic in vivo molecular ‘computers’ could rewire biological processes by establishing programmable, non-native pathways between molecular signals and biological responses. Multiple molecular computer prototypes have been shown to work in simple buffered solutions. Many of those prototypes were made of DNA strands and performed computations using cycles of annealing-digestion or strand displacement. We have previously introduced RNA interference (RNAi)-based computing as a way of implementing complex molecular logic in vivo. Because it also relies on nucleic acids for its operation, RNAi computing could benefit from the tools developed for DNA systems. However, these tools must be harnessed to produce bioactive components and be adapted for harsh operating environments that reflect in vivo conditions. In a step toward this goal, we report the construction and implementation of biosensors that ‘transduce’ mRNA levels into bioactive, small interfering RNA molecules via RNA strand exchange in a cell-free Drosophila embryo lysate, a step beyond simple buffered environments. We further integrate the sensors with our RNAi ‘computational’ module to evaluate two-input logic functions on mRNA concentrations. Our results show how RNA strand exchange can expand the utility of RNAi computing and point toward the possibility of using strand exchange in a native biological setting. PMID:20194121

  13. Gener: a minimal programming module for chemical controllers based on DNA strand displacement

    PubMed Central

    Kahramanoğulları, Ozan; Cardelli, Luca

    2015-01-01

    Summary: Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX. Availability and implementation: Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono. Contact: ozan@cosbi.eu PMID:25957353

  14. MIGS-GPU: Microarray Image Gridding and Segmentation on the GPU.

    PubMed

    Katsigiannis, Stamos; Zacharia, Eleni; Maroulis, Dimitris

    2017-05-01

    Complementary DNA (cDNA) microarray is a powerful tool for simultaneously studying the expression level of thousands of genes. Nevertheless, the analysis of microarray images remains an arduous and challenging task due to the poor quality of the images that often suffer from noise, artifacts, and uneven background. In this study, the MIGS-GPU [Microarray Image Gridding and Segmentation on Graphics Processing Unit (GPU)] software for gridding and segmenting microarray images is presented. MIGS-GPU's computations are performed on the GPU by means of the compute unified device architecture (CUDA) in order to achieve fast performance and increase the utilization of available system resources. Evaluation on both real and synthetic cDNA microarray images showed that MIGS-GPU provides better performance than state-of-the-art alternatives, while the proposed GPU implementation achieves significantly lower computational times compared to the respective CPU approaches. Consequently, MIGS-GPU can be an advantageous and useful tool for biomedical laboratories, offering a user-friendly interface that requires minimum input in order to run.

  15. Gener: a minimal programming module for chemical controllers based on DNA strand displacement.

    PubMed

    Kahramanoğulları, Ozan; Cardelli, Luca

    2015-09-01

    Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research's DSD tool as well as to LaTeX. Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono. Contact: ozan@cosbi.eu.

  16. Effective Design of Multifunctional Peptides by Combining Compatible Functions

    PubMed Central

    Diener, Christian; Garza Ramos Martínez, Georgina; Moreno Blas, Daniel; Castillo González, David A.; Corzo, Gerardo; Castro-Obregon, Susana; Del Rio, Gabriel

    2016-01-01

    Multifunctionality is a common trait of many natural proteins and peptides, yet the rules to generate such multifunctionality remain unclear. We propose that the rules defining some protein/peptide functions are compatible. To explore this hypothesis, we trained a computational method to predict cell-penetrating peptides at the sequence level and learned that antimicrobial peptides and DNA-binding proteins are compatible with the rules of our predictor. Based on this finding, we expected that designing peptides for CPP activity may render AMP and DNA-binding activities. To test this prediction, we designed peptides that embedded two independent functional domains (nuclear localization and yeast pheromone activity), linked by optimizing their composition to fit the rules characterizing cell-penetrating peptides. These peptides presented effective cell penetration, DNA-binding, pheromone and antimicrobial activities, thus confirming the effectiveness of our computational approach to design multifunctional peptides with potential therapeutic uses. Our computational implementation is available at http://bis.ifc.unam.mx/en/software/dcf. PMID:27096600

  17. Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains.

    PubMed

    Ron, Gil; Globerson, Yuval; Moran, Dror; Kaplan, Tommy

    2017-12-21

    Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As the Hi-C data accumulate, computational methods were developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing the published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in human and mouse. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.

  18. An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

    PubMed

    Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

    2013-01-01

    DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive when performed on traditional computational platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimizations over the previous FPGA implementation that improve both the overall speed-up achieved and the price of the platform on which the optimization was performed. The optimizations are: (1) the array of processing elements is made to run on a change in input value and not on the clock, eliminating the need for tight clock synchronization; (2) the implementation is unrestrained by the size of the sequences to be aligned; (3) the waiting time required for the sequences to load onto the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix that makes it possible to save the diagonal elements to be used in the next pass, in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this implementation achieved a 20-fold performance improvement in terms of CUPS over a GPP implementation.
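
    The quantity being accelerated is the fill of a dynamic-programming matrix, and CUPS (cell updates per second) counts how many of its cells are computed each second. The abstract does not name the alignment algorithm, so the sketch below assumes Smith-Waterman local alignment with a linear gap penalty. The scoring values are illustrative defaults, and this is a plain software reference, not the FPGA design.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b.

    Keeps only the previous row, so memory is O(len(b)); the number of
    cell updates is len(a) * len(b), the numerator of a CUPS figure.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + step, # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

    Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal are independent of each other. That independence is what an array of processing elements exploits, and it is consistent with the abstract's remark about saving diagonal elements for the next pass.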

  19. BarraCUDA - a fast short read sequence aligner using graphics processing units

    PubMed Central

    2012-01-01

    Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers an order-of-magnitude performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net PMID:22244497

  20. Flavin Charge Transfer Transitions Assist DNA Photolyase Electron Transfer

    NASA Astrophysics Data System (ADS)

    Skourtis, Spiros S.; Prytkova, Tatiana; Beratan, David N.

    2007-12-01

    This contribution describes molecular dynamics, semi-empirical and ab-initio studies of the primary photo-induced electron transfer reaction in DNA photolyase. DNA photolyases are FADH⁻-containing proteins that repair UV-damaged DNA by photo-induced electron transfer. A DNA photolyase recognizes and binds to cyclobutane pyrimidine dimer lesions of DNA. The protein repairs a bound lesion by transferring an electron to the lesion from FADH⁻, upon photo-excitation of FADH⁻ with 350-450 nm light. We compute the lowest singlet excited states of FADH⁻ in DNA photolyase using INDO/S configuration interaction, time-dependent density-functional, and time-dependent Hartree-Fock methods. The calculations identify the lowest singlet excited state of FADH⁻ that is populated after photo-excitation and that acts as the electron donor. For this donor state we compute conformationally-averaged tunneling matrix elements to empty electron-acceptor states of a thymine dimer bound to photolyase. The conformational averaging involves different FADH⁻-thymine dimer conformations obtained from molecular dynamics simulations of the solvated protein with a thymine dimer docked in its active site. The tunneling matrix element computations use INDO/S-level Green's function, energy splitting, and Generalized Mulliken-Hush methods. These calculations indicate that photo-excitation of FADH⁻ causes a π→π* charge-transfer transition that shifts electron density to the side of the flavin isoalloxazine ring that is adjacent to the docked thymine dimer. This shift in electron density enhances the FADH⁻-to-dimer electronic coupling, thus inducing rapid electron transfer.

  1. A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

    PubMed

    Baur, Brittany; Bozdag, Serdar

    2016-01-01

    DNA methylation is an important epigenetic event that affects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
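
    As a rough illustration of sequential forward selection wrapped around a K-Nearest Neighbors classifier, the sketch below greedily adds the probe that most improves leave-one-out accuracy and stops when no probe helps. It is not the authors' implementation: the function names, the leave-one-out scoring, and the toy methylation values are all assumptions made for this example.

```python
from math import dist

def loo_knn_accuracy(rows, labels, features, k=1):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, row in enumerate(rows):
        point = [row[f] for f in features]
        # Distances from sample i to every other sample, nearest first.
        neighbours = sorted(
            (dist(point, [other[f] for f in features]), labels[j])
            for j, other in enumerate(rows) if j != i
        )[:k]
        votes = [label for _, label in neighbours]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == labels[i]
    return correct / len(rows)

def sequential_forward_selection(rows, labels, max_features=None):
    """Greedily add the feature that raises the score most; stop when none does."""
    n_features = len(rows[0])
    selected, best_score = [], 0.0
    while len(selected) < (max_features or n_features):
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_knn_accuracy(rows, labels, selected + [f]), f)
                  for f in candidates]
        # Highest score wins; ties go to the lowest probe index.
        score, feature = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score

# Hypothetical data: 6 samples x 3 probes; only probe 1 tracks the label.
rows = [
    [0.50, 0.10, 0.9],
    [0.40, 0.15, 0.1],
    [0.60, 0.20, 0.5],
    [0.45, 0.80, 0.2],
    [0.55, 0.85, 0.8],
    [0.50, 0.90, 0.4],
]
labels = [0, 0, 0, 1, 1, 1]  # e.g. low / high expression of the gene
selected, score = sequential_forward_selection(rows, labels)
```

    On this toy input the procedure keeps only probe 1, because adding either of the other probes does not raise the leave-one-out accuracy any further.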

  2. DNA context represents transcription regulation of the gene in mouse embryonic stem cells

    NASA Astrophysics Data System (ADS)

    Ha, Misook; Hong, Soondo

    2016-04-01

    Understanding gene regulatory information in DNA remains a significant challenge in biomedical research. This study presents a computational approach to infer gene regulatory programs from primary DNA sequences. Using DNA around transcription start sites as attributes, our model predicts gene regulation in the gene. We find that H3K27ac around TSS is an informative descriptor of the transcription program in mouse embryonic stem cells. We build a computational model inferring the cell-type-specific H3K27ac signatures in the DNA around TSS. A comparison of embryonic stem cell and liver cell-specific H3K27ac signatures in DNA shows that the H3K27ac signatures in DNA around TSS efficiently distinguish the cell-type specific H3K27ac peaks and the gene regulation. The arrangement of the H3K27ac signatures inferred from the DNA represents the transcription regulation of the gene in mESC. We show that the DNA around transcription start sites is associated with the gene regulatory program by specific interaction with H3K27ac.

  3. DNA context represents transcription regulation of the gene in mouse embryonic stem cells.

    PubMed

    Ha, Misook; Hong, Soondo

    2016-04-14

    Understanding gene regulatory information in DNA remains a significant challenge in biomedical research. This study presents a computational approach to infer gene regulatory programs from primary DNA sequences. Using DNA around transcription start sites as attributes, our model predicts gene regulation in the gene. We find that H3K27ac around TSS is an informative descriptor of the transcription program in mouse embryonic stem cells. We build a computational model inferring the cell-type-specific H3K27ac signatures in the DNA around TSS. A comparison of embryonic stem cell and liver cell-specific H3K27ac signatures in DNA shows that the H3K27ac signatures in DNA around TSS efficiently distinguish the cell-type specific H3K27ac peaks and the gene regulation. The arrangement of the H3K27ac signatures inferred from the DNA represents the transcription regulation of the gene in mESC. We show that the DNA around transcription start sites is associated with the gene regulatory program by specific interaction with H3K27ac.

  4. Easy design of colorimetric logic gates based on nonnatural base pairing and controlled assembly of gold nanoparticles.

    PubMed

    Zhang, Li; Wang, Zhong-Xia; Liang, Ru-Ping; Qiu, Jian-Ding

    2013-07-16

    Utilizing the principles of metal-ion-mediated base pairs (C-Ag-C and T-Hg-T), the pH-sensitive conformational transition of C-rich DNA strand, and the ligand-exchange process triggered by DL-dithiothreitol (DTT), a system of colorimetric logic gates (YES, AND, INHIBIT, and XOR) can be rationally constructed based on the aggregation of the DNA-modified Au NPs. The proposed logic operation system is simple, which consists of only T-/C-rich DNA-modified Au NPs, and it is unnecessary to exquisitely design and alter the DNA sequence for different multiple molecular logic operations. The nonnatural base pairing combined with unique optical properties of Au NPs promises great potential in multiplexed ion sensing, molecular-scale computers, and other computational logic devices.

  5. A novel image encryption algorithm based on the chaotic system and DNA computing

    NASA Astrophysics Data System (ADS)

    Chai, Xiuli; Gan, Zhihua; Lu, Yang; Chen, Yiran; Han, Daojun

    A novel image encryption algorithm using the chaotic system and deoxyribonucleic acid (DNA) computing is presented. Different from the traditional encryption methods, the permutation and diffusion of our method are manipulated on the 3D DNA matrix. Firstly, a 3D DNA matrix is obtained through bit plane splitting, bit plane recombination, and DNA encoding of the plain image. Secondly, 3D DNA level permutation based on position sequence group (3DDNALPBPSG) is introduced, and chaotic sequences generated from the chaotic system are employed to permute the positions of the elements of the 3D DNA matrix. Thirdly, 3D DNA level diffusion (3DDNALD) is given, the confused 3D DNA matrix is split into sub-blocks, and XOR operation by block is manipulated to the sub-DNA matrix and the key DNA matrix from the chaotic system. At last, by decoding the diffused DNA matrix, we get the cipher image. SHA-256 hash of the plain image is employed to calculate the initial values of the chaotic system to avoid chosen-plaintext attacks. Experimental results and security analyses show that our scheme is secure against several known attacks, and it can effectively protect the security of the images.
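
    The DNA encoding and XOR diffusion steps can be sketched in a few lines. This is only an illustration under stated assumptions: the encoding rule (A=00, C=01, G=10, T=11) is one of several possible rules, the logistic map stands in for the unspecified chaotic system, and the 3D matrix layout and the permutation stage are omitted entirely. All function names are hypothetical.

```python
import hashlib

BASES = "ACGT"  # assumed encoding rule: A=00, C=01, G=10, T=11

def bytes_to_dna(data):
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))

def dna_to_bytes(strand):
    """Inverse of bytes_to_dna; strand length must be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def dna_xor(a, b):
    """Base-wise XOR, defined through the 2-bit codes of the bases."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))

def key_strand(plain, length, r=3.99):
    """Key bases from a logistic map (a stand-in chaotic system) whose
    initial value is derived from the SHA-256 hash of the plain data."""
    digest = hashlib.sha256(plain).digest()
    x = (int.from_bytes(digest[:6], "big") + 1) / (2 ** 48 + 2)  # in (0, 1)
    bases = []
    for _ in range(length):
        x = r * x * (1 - x)
        bases.append(BASES[int(x * 4) % 4])
    return "".join(bases)

plain = b"DNA"
strand = bytes_to_dna(plain)            # "CACACATGCAAC"
key = key_strand(plain, len(strand))
cipher = dna_xor(strand, key)           # diffusion step
recovered = dna_to_bytes(dna_xor(cipher, key))
```

    Because XOR is its own inverse, applying the same key strand to the cipher strand restores the encoded plain data, which is what makes the diffusion step reversible.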

  6. DEVELOPMENT OF DNA MICROARRAYS FOR ECOLOGICAL EXPOSURE ASSESSMENT

    EPA Science Inventory

    EPA/ORD is moving forward with a computational toxicology initiative in FY 04 which aims to integrate genomics and computational methods to provide a mechanistic basis for prediction of exposure and effects of chemical stressors in the environment.

    The goal of the presen...

  7. Standard atomic volumes in double-stranded DNA and packing in protein–DNA interfaces

    PubMed Central

    Nadassy, Katalin; Tomás-Oliveira, Isabel; Alberts, Ian; Janin, Joël; Wodak, Shoshana J.

    2001-01-01

    Standard volumes for atoms in double-stranded B-DNA are derived using high resolution crystal structures from the Nucleic Acid Database (NDB) and compared with corresponding values derived from crystal structures of small organic compounds in the Cambridge Structural Database (CSD). Two different methods are used to compute these volumes: the classical Voronoi method, which does not depend on the size of atoms, and the related Radical Planes method which does. Results show that atomic groups buried in the interior of double-stranded DNA are, on average, more tightly packed than in related small molecules in the CSD. The packing efficiency of DNA atoms at the interfaces of 25 high resolution protein–DNA complexes is determined by computing the ratios between the volumes of interfacial DNA atoms and the corresponding standard volumes. These ratios are found to be close to unity, indicating that the DNA atoms at protein–DNA interfaces are as closely packed as in crystals of B-DNA. Analogous volume ratios, computed for buried protein atoms, are also near unity, confirming our earlier conclusions that the packing efficiency of these atoms is similar to that in the protein interior. In addition, we examine the number, volume and solvent occupation of cavities located at the protein–DNA interfaces and compare them with those in the protein interior. Cavities are found to be ubiquitous in the interfaces as well as inside the protein moieties. The frequency of solvent occupation of cavities is however higher in the interfaces, indicating that those are more hydrated than protein interiors. Lastly, we compare our results with those obtained using two different measures of shape complementarity of the analysed interfaces, and find that the correlation between our volume ratios and these measures, as well as between the measures themselves, is weak. Our results indicate that a tightly packed environment made up of DNA, protein and solvent atoms plays a significant role in protein–DNA recognition. PMID:11504874

  8. Exercises in Molecular Computing

    PubMed Central

    2014-01-01

    Conspectus: The successes of electronic digital logic have transformed every aspect of human life over the last half-century. The word “computer” now signifies a ubiquitous electronic device, rather than a human occupation. Yet evidently humans, large assemblies of molecules, can compute, and it has been a thrilling challenge to develop smaller, simpler, synthetic assemblies of molecules that can do useful computation. When we say that molecules compute, what we usually mean is that such molecules respond to certain inputs, for example, the presence or absence of other molecules, in a precisely defined but potentially complex fashion. The simplest way for a chemist to think about computing molecules is as sensors that can integrate the presence or absence of multiple analytes into a change in a single reporting property. Here we review several forms of molecular computing developed in our laboratories. When we began our work, combinatorial approaches to using DNA for computing were used to search for solutions to constraint satisfaction problems. We chose to work instead on logic circuits, building bottom-up from units based on catalytic nucleic acids, focusing on DNA secondary structures in the design of individual circuit elements, and reserving the combinatorial opportunities of DNA for the representation of multiple signals propagating in a large circuit. Such circuit design directly corresponds to the intuition about sensors transforming the detection of analytes into reporting properties. While this approach was unusual at the time, it has been adopted since by other groups working on biomolecular computing with different nucleic acid chemistries. We created logic gates by modularly combining deoxyribozymes (DNA-based enzymes cleaving or combining other oligonucleotides), in the role of reporting elements, with stem–loops as input detection elements. For instance, a deoxyribozyme that normally exhibits an oligonucleotide substrate recognition region is modified such that a stem–loop closes onto the substrate recognition region, making it unavailable for the substrate and thus rendering the deoxyribozyme inactive. But a conformational change can then be induced by an input oligonucleotide, complementary to the loop, to open the stem, allow the substrate to bind, and allow its cleavage to proceed, which is eventually reported via fluorescence. In this Account, several designs of this form are reviewed, along with their application in the construction of large circuits that exhibited complex logical and temporal relationships between the inputs and the outputs. Intelligent (in the sense of being capable of nontrivial information processing) theranostic (therapy + diagnostic) applications have always been the ultimate motivation for developing computing (i.e., decision-making) circuits, and we review our experiments with logic-gate elements bound to cell surfaces that evaluate the proximal presence of multiple markers on lymphocytes. PMID:24873234

  9. New Trends of Digital Data Storage in DNA

    PubMed Central

    2016-01-01

    With the exponential growth in the capacity of information generated and the emerging need for data to be stored for prolonged periods of time, there emerges a need for a storage medium with high capacity, high storage density, and the ability to withstand extreme environmental conditions. DNA emerges as the prospective medium for data storage with its striking features. Diverse encoding models for reading and writing data onto DNA, codes for encrypting data which address issues of error generation, and approaches for developing codons and storage styles have been developed over the recent past. DNA has been identified as a potential medium for secret writing, which opens the way towards DNA cryptography and steganography. DNA utilized as an organic memory device along with big data storage and analytics in DNA has paved the way towards DNA computing for solving computational problems. This paper critically analyzes the various methods used for encoding and encrypting data onto DNA while identifying the advantages and capability of every scheme to overcome the drawbacks identified previously. Cryptography and steganography techniques have been analyzed in a critical approach while identifying the limitations of each method. This paper also identifies the advantages and limitations of DNA as a memory device and memory applications. PMID:27689089

  10. New Trends of Digital Data Storage in DNA.

    PubMed

    De Silva, Pavani Yashodha; Ganegoda, Gamage Upeksha

    With the exponential growth in the capacity of information generated and the emerging need for data to be stored for prolonged periods of time, there emerges a need for a storage medium with high capacity, high storage density, and the ability to withstand extreme environmental conditions. DNA emerges as the prospective medium for data storage with its striking features. Diverse encoding models for reading and writing data onto DNA, codes for encrypting data which address issues of error generation, and approaches for developing codons and storage styles have been developed over the recent past. DNA has been identified as a potential medium for secret writing, which opens the way towards DNA cryptography and steganography. DNA utilized as an organic memory device along with big data storage and analytics in DNA has paved the way towards DNA computing for solving computational problems. This paper critically analyzes the various methods used for encoding and encrypting data onto DNA while identifying the advantages and capability of every scheme to overcome the drawbacks identified previously. Cryptography and steganography techniques have been analyzed in a critical approach while identifying the limitations of each method. This paper also identifies the advantages and limitations of DNA as a memory device and memory applications.

  11. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  12. Unveiling Stability Criteria of DNA-Carbon Nanotubes Constructs by Scanning Tunneling Microscopy and Computational Modeling

    DOE PAGES

    Kilina, Svetlana; Yarotski, Dzmitry A.; Talin, A. Alec; ...

    2011-01-01

    We present a combined approach that relies on computational simulations and scanning tunneling microscopy (STM) measurements to reveal morphological properties and stability criteria of carbon nanotube-DNA (CNT-DNA) constructs. Application of STM allows direct observation of very stable CNT-DNA hybrid structures with the well-defined DNA wrapping angle of 63.4° and a coiling period of 3.3 nm. Using force field simulations, we determine how the DNA-CNT binding energy depends on the sequence and binding geometry of a single strand DNA. This dependence allows us to quantitatively characterize the stability of a hybrid structure with an optimal π-stacking between DNA nucleotides and the tube surface and better interpret STM data. Our simulations clearly demonstrate the existence of a very stable DNA binding geometry for (6,5) CNT as evidenced by the presence of a well-defined minimum in the binding energy as a function of an angle between DNA strand and the nanotube chiral vector. This novel approach demonstrates the feasibility of CNT-DNA geometry studies with subnanometer resolution and paves the way towards complete characterization of the structural and electronic properties of drug-delivering systems based on DNA-CNT hybrids as a function of DNA sequence and a nanotube chirality.

  13. Holliday Triangle Hunter (HolT Hunter): Efficient Software for Identifying Low Strain DNA Triangular Configurations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sherman, W.B.

    2012-04-16

    Synthetic DNA nanostructures are typically held together primarily by Holliday junctions. One of the most basic types of structures possible to assemble with only DNA and Holliday junctions is the triangle. To date, however, only equilateral triangles have been assembled in this manner - primarily because it is difficult to figure out what configurations of Holliday triangles have low strain. Early attempts at identifying such configurations relied upon calculations that followed the strained helical paths of DNA. Those methods, however, were computationally expensive, and failed to find many of the possible solutions. I have developed a new approach to identifying Holliday triangles that is computationally faster, and finds well over 95% of the possible solutions. The new approach is based on splitting the problem into two parts. The first part involves figuring out all the different ways that three featureless rods of the appropriate length and diameter can weave over and under one another to form a triangle. The second part of the computation entails seeing whether double helical DNA backbones can fit into the shape dictated by the rods in such a manner that the strands can cross over from one domain to the other at the appropriate spots. Structures with low strain (that is, good fit between the rods and the helices) on all three edges are recorded as promising for assembly.

  14. [Correlation of codon biases and potential secondary structures with mRNA translation efficiency in unicellular organisms].

    PubMed

    Vladimirov, N V; Likhoshvaĭ, V A; Matushkin, Iu G

    2007-01-01

    Gene expression is known to correlate with the degree of codon bias in many unicellular organisms. However, such correlation is absent in some organisms. Recently we demonstrated that inverted complementary repeats within a coding DNA sequence must be considered for proper estimation of translation efficiency, since they may form secondary structures that obstruct ribosome movement. We have developed a program for estimating the potential expression of a coding DNA sequence in a defined unicellular organism using its genome sequence. The program computes an elongation efficiency index. Computation is based on estimation of coding DNA sequence elongation efficiency, taking into account three key factors: codon bias, average number of inverted complementary repeats, and free energy of potential stem-loop structures formed by the repeats. The influence of these factors on translation is numerically estimated. An optimal proportion of these factors is computed for each organism individually. Quantitative translational characteristics of 384 unicellular organisms (351 bacteria, 28 archaea, 5 eukaryota) have been computed using their annotated genomes from NCBI GenBank. Five potential evolutionary strategies of translational optimization have been determined among the studied organisms. A considerable difference in preferred translational strategies between Bacteria and Archaea has been revealed. Significant correlations between elongation efficiency index and gene expression levels have been shown for two organisms (S. cerevisiae and H. pylori) using available microarray data. The proposed method allows numerical estimation of coding DNA sequence translation efficiency and optimization of the nucleotide composition of heterologous genes in unicellular organisms. http://www.mgs.bionet.nsc.ru/mgs/programs/eei-calculator/.
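
    One of the three factors named above, the number of inverted complementary repeats, can be counted with a simple scan. The sketch below is not the authors' program: the stem length, the loop-length window, and the function names are assumptions chosen for illustration, and no free energy is computed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def inverted_repeats(seq, stem=4, min_loop=3, max_loop=10):
    """Return (left_start, right_start, stem_sequence) for every pair of
    segments that could pair as the two arms of a stem-loop."""
    hits = []
    for i in range(len(seq) - stem + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            # A slice running past the end is shorter than `target`,
            # so it can never match.
            if seq[j:j + stem] == target:
                hits.append((i, j, left))
    return hits

# Hypothetical fragment: GGGC ... GCCC can pair around an AAAA loop.
hits = inverted_repeats("GGGCAAAAGCCC")
```

    Dividing the number of hits by the sequence length would give a crude per-gene repeat density; weighting each hit by a stem-loop free energy would require a thermodynamic model that this sketch does not attempt.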

  15. Fast parallel molecular algorithms for DNA-based computation: factoring integers.

    PubMed

    Chang, Weng-Long; Guo, Minyi; Ho, Michael Shan-Hui

    2005-06-01

    The RSA public-key cryptosystem is an algorithm that converts input data into an unrecognizable encrypted form and converts the unrecognizable data back into its original decrypted form. The security of the RSA public-key cryptosystem is based on the difficulty of factoring the product of two large prime numbers. This paper demonstrates how to factor the product of two large prime numbers, and is a breakthrough in basic biological operations using a molecular computer. In order to achieve this, we propose three DNA-based algorithms for parallel subtractor, parallel comparator, and parallel modular arithmetic that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that public-key cryptosystems are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations.
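
    The general idea behind molecular factoring, generating every candidate divisor at once and filtering out those that leave a remainder, can be mimicked sequentially in software. The sketch below is only an in-silico analogy under that assumption; it does not reproduce the paper's parallel subtractor, comparator, or modular arithmetic constructions, and the function name is hypothetical.

```python
def factor_by_exhaustive_filter(n):
    """Enumerate every candidate divisor (the 'test tube') and keep those
    that divide n. A molecular computer would apply the filter to all
    candidate strands at once; here the loop visits them one by one."""
    # The smaller factor of any nontrivial pair is at most sqrt(n),
    # so half of n's bit length is enough to represent it.
    bits = (n.bit_length() + 1) // 2
    tube = range(2, 2 ** bits)
    survivors = [q for q in tube if n % q == 0]
    # Report each factor pair once, smaller factor first.
    return [(q, n // q) for q in survivors if q <= n // q]

pairs = factor_by_exhaustive_filter(143)   # 143 = 11 * 13
```

    The candidate pool doubles with every extra bit of the smaller factor, which is why such approaches trade time for an exponentially growing amount of material.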

  16. POLLUX: a program for simulated cloning, mutagenesis and database searching of DNA constructs.

    PubMed

    Dayringer, H E; Sammons, S A

    1991-04-01

    Computer support for research in biotechnology has developed rapidly and has provided several tools to aid the researcher. This report describes the capabilities of new computer software developed in this laboratory to aid in the documentation and planning of experiments in molecular biology. The program, POLLUX, provides a graphical medium for the entry, editing and manipulation of DNA constructs and a textual format for the display and editing of construct descriptive data. Program operation and procedures are designed to mimic the actual laboratory experiments with respect to capability and the order in which they are performed. Flexible control over the content of the computer-generated displays and program facilities is provided by a mouse-driven menu interface. Programmed facilities for mutagenesis, simulated cloning and searching of the database from networked workstations are described.

  17. 'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers.

    PubMed Central

    Marck, C

    1988-01-01

    DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
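
    The abstract does not describe the hexamer look-ahead algorithm itself, so the sketch below shows only a plain hexamer lookup as a stand-in: the sequence is scanned once and each six-base window is checked against a table of recognition sites. The three enzymes and their recognition sequences are standard; the function name, the restriction to six-base sites, and the linear (non-circular) scan are assumptions of this example.

```python
# Recognition sequences of three common six-cutters.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def find_sites(seq, sites=SITES):
    """Return (1-based position, enzyme) for every recognition site found
    in a single left-to-right pass over a linear sequence."""
    # Index enzymes by hexamer so each window costs one dictionary lookup,
    # independent of how many enzymes are in the library.
    by_hexamer = {}
    for enzyme, site in sites.items():
        by_hexamer.setdefault(site, []).append(enzyme)
    hits = []
    for pos in range(len(seq) - 5):
        for enzyme in by_hexamer.get(seq[pos:pos + 6], ()):
            hits.append((pos + 1, enzyme))
    return hits

# Hypothetical test sequence containing one site for each enzyme.
hits = find_sites("TTGAATTCAAGGATCCAAGCTT")
```

    Indexing by hexamer keeps the per-base cost constant as the enzyme library grows, which is one plausible way to reach throughput of the kind quoted above; sites shorter or longer than six bases and degenerate sites would need extra handling.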

  18. A DNA network as an information processing system.

    PubMed

    Santini, Cristina Costa; Bath, Jonathan; Turberfield, Andrew J; Tyrrell, Andy M

    2012-01-01

    Biomolecular systems that can process information are sought for computational applications, because of their potential for parallelism and miniaturization and because their biocompatibility also makes them suitable for future biomedical applications. DNA has been used to design machines, motors, finite automata, logic gates, reaction networks and logic programs, amongst many other structures and dynamic behaviours. Here we design and program a synthetic DNA network to implement computational paradigms abstracted from cellular regulatory networks. These show information processing properties that are desirable in artificial, engineered molecular systems, including robustness of the output in relation to different sources of variation. We show the results of numerical simulations of the dynamic behaviour of the network and preliminary experimental analysis of its main components.

  19. Implementation of cascade logic gates and majority logic gate on a simple and universal molecular platform.

    PubMed

    Gao, Jinting; Liu, Yaqing; Lin, Xiaodong; Deng, Jiankang; Yin, Jinjin; Wang, Shuo

    2017-10-25

    Wiring a series of simple logic gates to process complex data is significantly important and a large challenge for untraditional molecular computing systems. The programmable property of DNA endows its powerful application in molecular computing. In our investigation, it was found that DNA exhibits excellent peroxidase-like activity in a colorimetric system of TMB/H₂O₂/Hemin (TMB, 3,3',5,5'-Tetramethylbenzidine) in the presence of K⁺ and Cu²⁺, which is significantly inhibited by the addition of an antioxidant. According to the modulated catalytic activity of this DNA-based catalyst, three cascade logic gates including AND-OR-INH (INHIBIT), AND-INH and OR-INH were successfully constructed. Interestingly, by only modulating the concentration of Cu²⁺, a majority logic gate with a single-vote veto function was realized following the same threshold value as that of the cascade logic gates. The strategy is quite straightforward and versatile and provides an instructive method for constructing multiple logic gates on a simple platform to implement complex molecular computing.

  20. DNA algorithms of implementing biomolecular databases on a biological computer.

    PubMed

    Chang, Weng-Long; Vasilakos, Athanasios V

    2015-01-01

    In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases.
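
    For readers unfamiliar with the eight operations, the sketch below gives a compact in-memory version of each. It says nothing about how the paper encodes tuples in DNA; representing a relation as a set of rows, each row a frozenset of (attribute, value) pairs, is an assumption made purely for this illustration, as are the function names and the toy data.

```python
def rel(*rows):
    """Build a relation from dictionaries mapping attribute -> value."""
    return {frozenset(r.items()) for r in rows}

def attributes(r):
    return {k for row in r for k, _ in row}

def union(r, s):        return r | s
def difference(r, s):   return r - s
def intersection(r, s): return r & s

def product(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return {a | b for a in r for b in s}

def select(r, predicate):
    return {row for row in r if predicate(dict(row))}

def project(r, attrs):
    return {frozenset((k, v) for k, v in row if k in attrs) for row in r}

def join(r, s):
    """Natural join: combine rows that agree on every shared attribute."""
    return {a | b for a in r for b in s
            if all(dict(b).get(k, v) == v for k, v in a)}

def divide(r, s):
    """Rows over attrs(r) - attrs(s) that pair with every row of s in r."""
    candidates = project(r, attributes(r) - attributes(s))
    return {t for t in candidates if all(t | u in r for u in s)}

# Hypothetical data.
enrolled = rel({"student": "ann", "course": "db"},
               {"student": "ann", "course": "os"},
               {"student": "bob", "course": "db"})
required = rel({"course": "db"}, {"course": "os"})
rooms = rel({"course": "db", "room": "101"})

took_everything = divide(enrolled, required)   # only ann took both courses
with_rooms = join(enrolled, rooms)             # the two db enrolments
```

    Division is the least familiar of the eight: it returns the students who are paired in `enrolled` with every course listed in `required`.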

  1. Comparing DNA damage-processing pathways by computer analysis of chromosome painting data.

    PubMed

    Levy, Dan; Vazquez, Mariel; Cornforth, Michael; Loucas, Bradford; Sachs, Rainer K; Arsuaga, Javier

    2004-01-01

    Chromosome aberrations are large-scale illegitimate rearrangements of the genome. They are indicative of DNA damage and informative about damage processing pathways. Despite extensive investigations over many years, the mechanisms underlying aberration formation remain controversial. New experimental assays such as multiplex fluorescent in situ hybridization (mFISH) allow combinatorial "painting" of chromosomes and are promising for elucidating aberration formation mechanisms. Recently observed mFISH aberration patterns are so complex that computer and graph-theoretical methods are needed for their full analysis. An important part of the analysis is decomposing a chromosome rearrangement process into "cycles." A cycle of order n, characterized formally by the cyclic graph with 2n vertices, indicates that n chromatin breaks take part in a single irreducible reaction. We here describe algorithms for computing cycle structures from experimentally observed or computer-simulated mFISH aberration patterns. We show that analyzing cycles quantitatively can distinguish between different aberration formation mechanisms. In particular, we show that homology-based mechanisms do not generate the large number of complex aberrations, involving higher-order cycles, observed in irradiated human lymphocytes.

  2. The fusion of biology, computer science, and engineering: towards efficient and successful synthetic biology.

    PubMed

    Linshiz, Gregory; Goldberg, Alex; Konry, Tania; Hillson, Nathan J

    2012-01-01

    Synthetic biology is a nascent field that emerged in earnest only around the turn of the millennium. It aims to engineer new biological systems and impart new biological functionality, often through genetic modifications. The design and construction of new biological systems is a complex, multistep process, requiring multidisciplinary collaborative efforts from "fusion" scientists who have formal training in computer science or engineering, as well as hands-on biological expertise. The public has high expectations for synthetic biology and eagerly anticipates the development of solutions to the major challenges facing humanity. This article discusses laboratory practices and the conduct of research in synthetic biology. It argues that the fusion science approach, which integrates biology with computer science and engineering best practices, including standardization, process optimization, computer-aided design and laboratory automation, miniaturization, and systematic management, will increase the predictability and reproducibility of experiments and lead to breakthroughs in the construction of new biological systems. The article also discusses several successful fusion projects, including the development of software tools for DNA construction design automation, recursive DNA construction, and the development of integrated microfluidics systems.

  3. A Review of Computational Intelligence Methods for Eukaryotic Promoter Prediction.

    PubMed

    Singh, Shailendra; Kaur, Sukhbir; Goel, Neelam

    2015-01-01

    In past decades, prediction of genes in DNA sequences has attracted the attention of many researchers, but because of the complex structure of DNA it is extremely difficult to locate genes correctly. A large number of regulatory regions that help in the transcription of a gene are present in DNA. The promoter is one such region, and finding its location is a challenging problem. Various computational methods for promoter prediction have been developed over the past few years. This paper reviews these promoter prediction methods. Several difficulties and pitfalls encountered by these methods are also detailed, along with future research directions.

  4. FY02 CBNP Annual Report Input: Bioinformatics Support for CBNP Research and Deployments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slezak, T; Wolinsky, M

    2002-10-31

    The events of FY01 dynamically reprogrammed the objectives of the CBNP bioinformatics support team, to meet rapidly-changing Homeland Defense needs and requests from other agencies for assistance: Use computational techniques to determine potential unique DNA signature candidates for microbial and viral pathogens of interest to CBNP researchers and to our collaborating partner agencies such as the Centers for Disease Control and Prevention (CDC), U.S. Department of Agriculture (USDA), Department of Defense (DOD), and Food and Drug Administration (FDA). Develop effective electronic screening measures for DNA signatures to reduce the cost and time of wet-bench screening. Build a comprehensive system for tracking the development and testing of DNA signatures. Build a chain-of-custody sample tracking system for field deployment of the DNA signatures as part of the BASIS project. Provide computational tools for use by CBNP Biological Foundations researchers.

  5. DNA strand displacement system running logic programs.

    PubMed

    Rodríguez-Patón, Alfonso; Sainz de Murieta, Iñaki; Sosík, Petr

    2014-01-01

    The paper presents a DNA-based computing model which is enzyme-free and autonomous, requiring no human intervention during the computation. The model is able to perform iterated resolution steps with logical formulae in conjunctive normal form. The implementation is based on the technique of DNA strand displacement, with each clause encoded in a separate DNA molecule. Propositions are encoded by assigning a strand to each proposition p, and its complementary strand to the proposition ¬p; clauses are encoded by combining different propositions in the same strand. The model can run logic programs composed of Horn clauses by cascading resolution steps. The potential of the model is also demonstrated by its theoretical capability of solving SAT. The resulting SAT algorithm has a linear time complexity in the number of resolution steps, whereas its spatial complexity is exponential in the number of variables of the formula. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
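
    The cascaded resolution steps described in this abstract can be illustrated in software. The following is a minimal Python sketch, not the authors' strand displacement implementation: the literal encoding (strings, with a leading "~" for negation) and the function names negate, resolve and refutes are assumptions made for illustration only.

```python
from itertools import combinations

def negate(lit):
    """Return the complementary literal: p <-> ~p."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of literals)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(clauses):
    """Saturate by resolution; True if the empty clause is derived."""
    known = set(frozenset(c) for c in clauses)
    while True:
        new = set()
        for a, b in combinations(known, 2):
            for r in resolve(a, b):
                if not r:          # empty clause: the clause set is unsatisfiable
                    return True
                new.add(r)
        if new <= known:           # nothing new can be derived
            return False
        known |= new

# Hypothetical Horn program: "p." and "q :- p."; the query q is proved by refuting ~q.
program = [{"p"}, {"~p", "q"}]
print(refutes(program + [{"~q"}]))
```

    Saturating the whole clause set is the simplest way to show the idea; it mirrors the abstract's remark that space, not the number of resolution steps, is the costly resource.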

  6. A DNA-based molecular motor that can navigate a network of tracks

    NASA Astrophysics Data System (ADS)

    Wickham, Shelley F. J.; Bath, Jonathan; Katsuda, Yousuke; Endo, Masayuki; Hidaka, Kumi; Sugiyama, Hiroshi; Turberfield, Andrew J.

    2012-03-01

    Synthetic molecular motors can be fuelled by the hydrolysis or hybridization of DNA. Such motors can move autonomously and programmably, and long-range transport has been observed on linear tracks. It has also been shown that DNA systems can compute. Here, we report a synthetic DNA-based system that integrates long-range transport and information processing. We show that the path of a motor through a network of tracks containing four possible routes can be programmed using instructions that are added externally or carried by the motor itself. When external control is used we find that 87% of the motors follow the correct path, and when internal control is used 71% of the motors follow the correct path. Programmable motion will allow the development of computing networks, molecular systems that can sort and process cargoes according to instructions that they carry, and assembly lines that can be reconfigured dynamically in response to changing demands.

  7. Introduction to the Natural Anticipator and the Artificial Anticipator

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    This short communication deals with the introduction of the concept of anticipator, which is one who anticipates, in the framework of computing anticipatory systems. The definition of anticipation deals with the concept of program. Indeed, the word program comes from "pro-gram", meaning "to write before" by anticipation, and means a plan for the programming of a mechanism, or a sequence of coded instructions that can be inserted into a mechanism, or a sequence of coded instructions, as genes or behavioural responses, that is part of an organism. Any natural or artificial programs are thus related to anticipatory rewriting systems, as shown in this paper. All the cells in the body, and the neurons in the brain, are programmed by the anticipatory genetic code, DNA, in a low-level language with four signs. The programs in computers are also computing anticipatory systems. It will be shown, on the one hand, that the genetic code DNA is a natural anticipator. As demonstrated by Nobel laureate McClintock [8], genomes are programmed. The fundamental program deals with the DNA genetic code. The properties of the DNA consist in self-replication and self-modification. The self-replicating process leads to reproduction of the species, while the self-modifying process leads to new species or evolution and adaptation in existing ones. The genetic code DNA keeps its instructions in memory in the DNA coding molecule. The genetic code DNA is a rewriting system, from DNA coding to DNA template molecule. The DNA template molecule is a rewriting system to the Messenger RNA molecule. The information is not destroyed during the execution of the rewriting program. On the other hand, it will be demonstrated that the Turing machine is an artificial anticipator. The Turing machine is a rewriting system. The head reads and writes, modifying the content of the tape. The information is destroyed during the execution of the program. This is an irreversible process. The input data are lost.

  8. Programmable chemical controllers made from DNA.

    PubMed

    Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg

    2013-10-01

    Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language' and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents.
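
    The idea of using a chemical reaction network as a "programming language" for consensus can be sketched numerically. The abstract does not list the reactions, so the three-reaction network below (X + Y -> 2B, B + X -> 2X, B + Y -> 2Y), the unit rate constant, and the forward Euler integration are all assumptions chosen for illustration; the function name simulate_consensus is hypothetical.

```python
def simulate_consensus(x0, y0, k=1.0, dt=0.01, steps=5000):
    """Integrate mass-action ODEs for X+Y->2B, B+X->2X, B+Y->2Y (forward Euler)."""
    x, y, b = x0, y0, 0.0
    for _ in range(steps):
        r1 = k * x * y   # X + Y -> 2B  (disagreeing species neutralise each other)
        r2 = k * b * x   # B + X -> 2X  (X recruits the undecided species B)
        r3 = k * b * y   # B + Y -> 2Y  (Y recruits the undecided species B)
        x += dt * (-r1 + r2)
        y += dt * (-r1 + r3)
        b += dt * (2 * r1 - r2 - r3)
    return x, y, b

# Start with a 60/40 split; the initial majority is expected to take over.
x, y, b = simulate_consensus(0.6, 0.4)
print(round(x, 3), round(y, 3), round(b, 3))
```

    In this assumed network the total concentration x + y + b is conserved, and the difference x - y grows whenever B is present, which is why the initial majority wins.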

  9. Holes influence the mutation spectrum of human mitochondrial DNA

    NASA Astrophysics Data System (ADS)

    Villagran, Martha; Miller, John

    Mutations drive evolution and disease, showing highly non-random patterns of variant frequency vs. nucleotide position. We use computational DNA hole spectroscopy [M.Y. Suarez-Villagran & J.H. Miller, Sci. Rep. 5, 13571 (2015)] to reveal sites of enhanced hole probability in selected regions of human mitochondrial DNA. A hole is a mobile site of positive charge created when an electron is removed, for example by radiation or contact with a mutagenic agent. The hole spectra are quantum mechanically computed using a two-stranded tight binding model of DNA. We observe significant correlation between spectra of hole probabilities and of genetic variation frequencies from the MITOMAP database. These results suggest that hole-enhanced mutation mechanisms exert a substantial, perhaps dominant, influence on mutation patterns in DNA. One example is where a trapped hole induces a hydrogen bond shift, known as tautomerization, which then triggers a base-pair mismatch during replication. Our results deepen overall understanding of sequence specific mutation rates, encompassing both hotspots and cold spots, which drive molecular evolution.

  10. Programmable chemical controllers made from DNA

    NASA Astrophysics Data System (ADS)

    Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg

    2013-10-01

    Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language' and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents.

  11. Programmable chemical controllers made from DNA

    PubMed Central

    Chen, Yuan-Jyue; Dalchau, Neil; Srinivas, Niranjan; Phillips, Andrew; Cardelli, Luca; Soloveichik, David; Seelig, Georg

    2014-01-01

    Biological organisms use complex molecular networks to navigate their environment and regulate their internal state. The development of synthetic systems with similar capabilities could lead to applications such as smart therapeutics or fabrication methods based on self-organization. To achieve this, molecular control circuits need to be engineered to perform integrated sensing, computation and actuation. Here we report a DNA-based technology for implementing the computational core of such controllers. We use the formalism of chemical reaction networks as a 'programming language', and our DNA architecture can, in principle, implement any behaviour that can be mathematically expressed as such. Unlike logic circuits, our formulation naturally allows complex signal processing of intrinsically analogue biological and chemical inputs. Controller components can be derived from biologically synthesized (plasmid) DNA, which reduces errors associated with chemically synthesized DNA. We implement several building-block reaction types and then combine them into a network that realizes, at the molecular level, an algorithm used in distributed control systems for achieving consensus between multiple agents. PMID:24077029

  12. A stochastic reaction-diffusion model for protein aggregation on DNA

    NASA Astrophysics Data System (ADS)

    Voulgarakis, Nikolaos K.

    Vital functions of DNA, such as transcription and packaging, depend on the proper clustering of proteins on the double strand. The present study investigates how the interplay between DNA allostery and electrostatic interactions affects protein clustering. The statistical analysis of a simple but transparent computational model reveals two major consequences of this interplay. First, depending on the protein and salt concentration, protein filaments exhibit a bimodal DNA stiffening and softening behavior. Second, within a certain domain of the control parameters, electrostatic interactions can cause energetic frustration that forces proteins to assemble in rigid spiral configurations. Such spiral filaments might trigger both positive and negative supercoiling, which can ultimately promote gene compaction and regulate the promoter. It has been experimentally shown that bacterial histone-like proteins assemble in similar spiral patterns and/or exhibit the same bimodal behavior. The proposed model can, thus, provide computational insights into the physical mechanisms used by proteins to control the mechanical properties of the DNA.

  13. DNA profiles, computer searches, and the Fourth Amendment.

    PubMed

    Kimel, Catherine W

    2013-01-01

    Pursuant to federal statutes and to laws in all fifty states, the United States government has assembled a database containing the DNA profiles of over eleven million citizens. Without judicial authorization, the government searches each of these profiles one-hundred thousand times every day, seeking to link database subjects to crimes they are not suspected of committing. Yet, courts and scholars that have addressed DNA databasing have focused their attention almost exclusively on the constitutionality of the government's seizure of the biological samples from which the profiles are generated. This Note fills a gap in the scholarship by examining the Fourth Amendment problems that arise when the government searches its vast DNA database. This Note argues that each attempt to match two DNA profiles constitutes a Fourth Amendment search because each attempted match infringes upon database subjects' expectations of privacy in their biological relationships and physical movements. The Note further argues that database searches are unreasonable as they are currently conducted, and it suggests an adaptation of computer-search procedures to remedy the constitutional deficiency.

  14. MICA: desktop software for comprehensive searching of DNA databases

    PubMed Central

    Stokes, William A; Glick, Benjamin S

    2006-01-01

    Background Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion MICA is suitable as a search engine for desktop DNA analysis software. PMID:17018144
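
    The general idea of k-mer indexing for exact-match search can be sketched as follows. This is a simplified illustration and not MICA's compact-array format: it uses an ordinary Python dictionary, handles only nondegenerate queries, and the names build_index and find_exact are hypothetical.

```python
from collections import defaultdict

def build_index(seq, k):
    """Map every k-mer in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def find_exact(seq, index, k, query):
    """Return start positions of exact matches of query (len(query) >= k)."""
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    # Only positions sharing the query's first k-mer need to be verified.
    candidates = index.get(query[:k], [])
    return [i for i in candidates if seq.startswith(query, i)]

genome = "ACGTACGTTAGACGTT"
idx = build_index(genome, 4)
print(find_exact(genome, idx, 4, "ACGTT"))  # prints [4, 11]
```

    Looking up the first k-mer narrows the search to a few candidate positions, which is the property that lets an index-based search read only a small fraction of the data.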

  15. CMG-biotools, a free workbench for basic comparative microbial genomics.

    PubMed

    Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David

    2013-01-01

    Today, there are more than a hundred times as many sequenced prokaryotic genomes as there were in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analyses without much training.

  16. An improved model for whole genome phylogenetic analysis by Fourier transform.

    PubMed

    Yin, Changchuan; Yau, Stephen S-T

    2015-10-07

    DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
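
    The pipeline outlined in this abstract (numerical mapping, DFT, power spectrum, Euclidean distance) can be sketched in a few lines of Python. Several details are assumptions rather than the authors' method: the 2D mapping below is a hypothetical choice stored as complex numbers, the DFT is a naive quadratic-time loop, and the even scaling step is omitted, so the sketch only accepts sequences of equal length.

```python
import cmath
import math

# Hypothetical 2D mapping (an assumption, not taken from the abstract):
# each nucleotide becomes a point (x, y), stored as the complex number x + iy.
MAPPING = {"A": complex(0, -1), "T": complex(0, 1),
           "C": complex(-1, 0), "G": complex(1, 0)}

def power_spectrum(seq):
    """Naive O(n^2) DFT of the mapped sequence; returns |X_k|^2 for each k."""
    signal = [MAPPING[base] for base in seq]
    n = len(signal)
    spectrum = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(total) ** 2)
    return spectrum

def spectral_distance(seq_a, seq_b):
    """Euclidean distance between power spectra of two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch omits the even-scaling step")
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))

print(spectral_distance("ACGTACGT", "ACGTTCGT"))
```

    Because the comparison uses power spectra only, it needs no alignment; a real implementation would use a fast Fourier transform and a scaling step for sequences of different lengths.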

  17. DNA as information.

    PubMed

    Wills, Peter R

    2016-03-13

    This article reviews contributions to this theme issue covering the topic 'DNA as information' in relation to the structure of DNA, the measure of its information content, the role and meaning of information in biology and the origin of genetic coding as a transition from uninformed to meaningful computational processes in physical systems. © 2016 The Author(s).

  18. Self-Directed Student Research through Analysis of Microarray Datasets: A Computer-Based Functional Genomics Practical Class for Masters-Level Students

    ERIC Educational Resources Information Center

    Grenville-Briggs, Laura J.; Stansfield, Ian

    2011-01-01

    This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate…

  19. Application of permanents of square matrices for DNA identification in multiple-fatality cases

    PubMed Central

    2013-01-01

    Background DNA profiling is essential for individual identification. In forensic medicine, the likelihood ratio (LR) is commonly used to identify individuals. The LR is calculated by comparing two hypotheses for the sample DNA: that the sample DNA is identical or related to a reference DNA, and that it is randomly sampled from a population. For multiple-fatality cases, however, identification should be considered as an assignment problem, and a particular sample and reference pair should therefore be compared with other possibilities conditional on the entire dataset. Results We developed a new method to compute the probability via permanents of square matrices of nonnegative entries. As the exact permanent is known as a #P-complete problem, we applied the Huber–Law algorithm to approximate the permanents. We performed a computer simulation to evaluate the performance of our method via receiver operating characteristic curve analysis compared with LR under the assumption of a closed incident. Differences between the two methods were well demonstrated when references provided neither obligate alleles nor impossible alleles. The new method exhibited higher sensitivity (0.188 vs. 0.055) at a threshold value of 0.999, at which specificity was 1, and it exhibited higher area under a receiver operating characteristic curve (0.990 vs. 0.959, P = 9.6E-15). Conclusions Our method therefore offers a solution for a computationally intensive assignment problem and may be a viable alternative to LR-based identification for closed-incident multiple-fatality cases. PMID:23962363
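
    Treating identification as an assignment problem can be illustrated with a small permanent calculation. The sketch below is an assumption-laden illustration, not the authors' code: it computes the permanent exactly by enumerating permutations (feasible only for small matrices, whereas the abstract reports using the Huber–Law approximation), and the matrix entries are hypothetical pairwise likelihood weights for sample i against reference j.

```python
from itertools import permutations
from math import prod

def permanent(m):
    """Exact permanent by summing over all permutations (small n only)."""
    n = len(m)
    return sum(prod(m[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def minor(m, i, j):
    """Matrix with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def assignment_probability(m, i, j):
    """Weight of all one-to-one assignments pairing sample i with reference j,
    divided by the weight of all one-to-one assignments."""
    return m[i][j] * permanent(minor(m, i, j)) / permanent(m)

# Hypothetical weights for two samples (rows) and two references (columns).
weights = [[0.9, 0.1],
           [0.2, 0.8]]
print(assignment_probability(weights, 0, 0))
```

    Conditioning on the whole matrix is what distinguishes this from a pairwise likelihood ratio: the probability for one pair depends on how well every other sample fits the remaining references.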

  20. Generating finite cyclic and dihedral groups using sequential insertion systems with interactions

    NASA Astrophysics Data System (ADS)

    Fong, Wan Heng; Sarmin, Nor Haniza; Turaev, Sherzod; Yosman, Ahmad Firdaus

    2017-04-01

    The operation of insertion has been studied extensively throughout the years for its impact in many areas of theoretical computer science such as DNA computing. First introduced as a generalization of the concatenation operation, many variants of insertion have been introduced, each with their own computational properties. In this paper, we introduce a new variant that enables the generation of some special types of groups called sequential insertion systems with interactions. We show that these new systems are able to generate all finite cyclic and dihedral groups.

  1. Computational DNA hole spectroscopy: A new tool to predict mutation hotspots, critical base pairs, and disease ‘driver’ mutations

    PubMed Central

    Villagrán, Martha Y. Suárez; Miller, John H.

    2015-01-01

    We report on a new technique, computational DNA hole spectroscopy, which creates spectra of electron hole probabilities vs. nucleotide position. A hole is a site of positive charge created when an electron is removed. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies of mitochondrial DNA reveal a correlation between L-strand hole spectrum peaks and spikes in the human mutation spectrum. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with disease-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential disease ‘driver’ mutations. Such integration of DNA hole and variance spectra could ultimately prove invaluable for pinpointing critical regions of the vast non-protein-coding genome. An observed asymmetry in correlations, between the spectrum of human mtDNA variations and the L- and H-strand hole spectra, is attributed to asymmetric DNA replication processes that occur for the leading and lagging strands. PMID:26310834

  2. Computational DNA hole spectroscopy: A new tool to predict mutation hotspots, critical base pairs, and disease 'driver' mutations.

    PubMed

    Villagrán, Martha Y Suárez; Miller, John H

    2015-08-27

    We report on a new technique, computational DNA hole spectroscopy, which creates spectra of electron hole probabilities vs. nucleotide position. A hole is a site of positive charge created when an electron is removed. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies of mitochondrial DNA reveal a correlation between L-strand hole spectrum peaks and spikes in the human mutation spectrum. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with disease-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential disease 'driver' mutations. Such integration of DNA hole and variance spectra could ultimately prove invaluable for pinpointing critical regions of the vast non-protein-coding genome. An observed asymmetry in correlations, between the spectrum of human mtDNA variations and the L- and H-strand hole spectra, is attributed to asymmetric DNA replication processes that occur for the leading and lagging strands.

  3. Model Checking Temporal Logic Formulas Using Sticker Automata

    PubMed Central

    Feng, Changwei; Wu, Huanmei

    2017-01-01

    As an important complex problem, the temporal logic model checking problem is still far from being fully resolved under the circumstance of DNA computing, especially Computation Tree Logic (CTL), Interval Temporal Logic (ITL), and Projection Temporal Logic (PTL), because there is still a lack of approaches for DNA model checking. To address this challenge, a model checking method is proposed for checking the basic formulas in the above three temporal logic types with DNA molecules. First, one-type single-stranded DNA molecules are employed to encode the Finite State Automaton (FSA) model of the given basic formula so that a sticker automaton is obtained. On the other hand, other single-stranded DNA molecules are employed to encode the given system model so that the input strings of the sticker automaton are obtained. Next, a series of biochemical reactions are conducted between the above two types of single-stranded DNA molecules. It can then be decided whether the system satisfies the formula or not. As a result, we have developed a DNA-based approach for checking all the basic formulas of CTL, ITL, and PTL. The simulated results demonstrate the effectiveness of the new method. PMID:29119114

  4. Biologically important conformational features of DNA as interpreted by quantum mechanics and molecular mechanics computations of its simple fragments.

    PubMed

    Poltev, V; Anisimov, V M; Dominguez, V; Gonzalez, E; Deriabina, A; Garcia, D; Rivas, F; Polteva, N A

    2018-02-01

    Deciphering the mechanism of functioning of DNA as the carrier of genetic information requires identifying inherent factors determining its structure and function. Following this path, our previous DFT studies attributed the origin of unique conformational characteristics of right-handed Watson-Crick duplexes (WCDs) to the conformational profile of deoxydinucleoside monophosphates (dDMPs) serving as the minimal repeating units of DNA strand. According to those findings, the directionality of the sugar-phosphate chain and the characteristic ranges of dihedral angles of energy minima combined with the geometric differences between purines and pyrimidines determine the dependence on base sequence of the three-dimensional (3D) structure of WCDs. This work extends our computational study to complementary deoxydinucleotide-monophosphates (cdDMPs) of non-standard conformation, including those of Z-family, Hoogsteen duplexes, parallel-stranded structures, and duplexes with mispaired bases. For most of these systems, except Z-conformation, computations closely reproduce experimental data within the tolerance of characteristic limits of dihedral parameters for each conformation family. Computation of cdDMPs with Z-conformation reveals that their experimental structures do not correspond to the internal energy minimum. This finding establishes the leading role of external factors in formation of the Z-conformation. Energy minima of cdDMPs of non-Watson-Crick duplexes demonstrate different sequence-dependence features than those known for WCDs. The obtained results provide evidence that the biologically important regularities of 3D structure distinguish WCDs from duplexes having non-Watson-Crick nucleotide pairing.

  5. Simulation and display of macromolecular complexes

    NASA Technical Reports Server (NTRS)

    Nir, S.; Garduno, R.; Rein, R.; Macelroy, R. D.

    1977-01-01

    In association with an investigation of the interaction of proteins with DNA and RNA, an interactive computer program for building, manipulating, and displaying macromolecular complexes has been designed. The system provides perspective, planar, and stereoscopic views on the computer terminal display, as well as views for standard and nonstandard observer locations. The molecule or its parts may be rotated and/or translated in any direction; bond connections may be added or removed by the viewer. Molecular fragments may be juxtaposed in such a way that given bonds are aligned, and given planes and points coincide. Another subroutine provides for the duplication of a given unit such as a DNA or amino-acid base.

  6. Physicist's simple access to protein structures: the computer program WHAT IF

    NASA Astrophysics Data System (ADS)

    Altenberg-Greulich, Brigitte; Zech, Stephan G.; Stehlik, Dietmar; Vriend, Gert

    2001-06-01

    We describe the computer program WHAT IF and its application to two physical examples. For the DNA binding protein OCT-1 (POU domain), the location of amino acids with a sidechain amino group is shown. Such knowledge is required when staining this molecule with a fluorescence dye, which binds chemically to the amino terminus as well as amino groups in sidechains. The program shows that most sidechain amino groups are protected when DNA is bound to OCT-1, allowing selective staining of the amino terminal NH2 group. A protein stained this way can be used in fluorescence spectroscopic studies on functional aspects of OCT-1.

  7. Non-linear molecular pattern classification using molecular beacons with multiple targets.

    PubMed

    Lee, In-Hee; Lee, Seung Hwan; Park, Tai Hyun; Zhang, Byoung-Tak

    2013-12-01

    In vitro pattern classification has been highlighted as an important future application of DNA computing. Previous work has demonstrated the feasibility of linear classifiers using DNA-based molecular computing. However, complex tasks require non-linear classification capability. Here we design a molecular beacon that can interact with multiple targets and experimentally show that its fluorescent signals form a complex radial-basis function, enabling it to be used as a building block for non-linear molecular classification in vitro. The proposed method was successfully applied to solving artificial and real-world classification problems: XOR and microRNA expression patterns. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  8. PopSc: Computing Toolkit for Basic Statistics of Molecular Population Genetics Simultaneously Implemented in Web-Based Calculator, Python and R

    PubMed Central

    Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia

    2016-01-01

    Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis. PMID:27792763
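
    Working from intermediate data such as allele frequencies, rather than raw sequences, can be illustrated with two textbook statistics. The sketch below does not use or reproduce the PopSc API; the function names are hypothetical, and the F_ST calculation assumes equally sized subpopulations and a polymorphic locus.

```python
def expected_heterozygosity(freqs):
    """Gene diversity H = 1 - sum(p_i^2) from allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)

def fst(subpop_freqs):
    """Wright's F_ST = (H_T - H_S) / H_T, assuming equally sized subpopulations."""
    n = len(subpop_freqs)
    # H_S: mean within-subpopulation diversity.
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / n
    # H_T: diversity of the pooled population, from mean allele frequencies.
    n_alleles = len(subpop_freqs[0])
    mean = [sum(f[a] for f in subpop_freqs) / n for a in range(n_alleles)]
    h_t = expected_heterozygosity(mean)
    return (h_t - h_s) / h_t

print(expected_heterozygosity([0.5, 0.5]))  # prints 0.5
print(fst([[0.7, 0.3], [0.3, 0.7]]))
```

    The first function belongs to the genetic diversity class of statistics mentioned in the abstract and the second to the differentiation class; both need only frequencies as input.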

  9. PopSc: Computing Toolkit for Basic Statistics of Molecular Population Genetics Simultaneously Implemented in Web-Based Calculator, Python and R.

    PubMed

    Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia

    2016-01-01

    Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis.

  10. Identification of DNA-binding proteins by combining auto-cross covariance transformation and ensemble learning.

    PubMed

    Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan

    2016-04-20

    DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus it is necessary to develop computational methods to identify the DNA-binding proteins based only on the protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into the SVM to discriminate the DNA-binding proteins from the non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 by a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
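
The step that turns variable-length sequences into fixed-length vectors can be sketched as follows. This is a minimal illustration of a generic auto-cross covariance (ACC) transform over a per-residue numeric profile, assuming the commonly used definition (covariance between column j1 at position i and column j2 at position i + lag, averaged over i). It is not the iDNA-KACC implementation: the Kmer composition step and the profile construction are not shown, and the function name is hypothetical.

```python
def acc_features(profile, max_lag):
    """Auto-cross covariance of a per-residue profile.

    profile: list of rows, one per residue, each holding the same number
             of numeric columns (for example, profile scores).
    max_lag: largest separation between residues to consider.
    Returns a flat list of max_lag * n_cols * n_cols values, so the
    output length does not depend on the sequence length.
    """
    length = len(profile)
    n_cols = len(profile[0])
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    # Column means over the whole sequence.
    means = [sum(row[j] for row in profile) / length for j in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        for j1 in range(n_cols):
            for j2 in range(n_cols):
                # j1 == j2 gives auto covariance, j1 != j2 cross covariance.
                total = sum(
                    (profile[i][j1] - means[j1])
                    * (profile[i + lag][j2] - means[j2])
                    for i in range(length - lag)
                )
                features.append(total / (length - lag))
    return features


short = [[1, 0], [0, 1], [1, 1], [0, 0]]
longer = short + [[1, 0], [1, 1], [0, 1]]
print(len(acc_features(short, 2)), len(acc_features(longer, 2)))  # → 8 8
```

Because both sequences map to vectors of the same length, they can be passed to a classifier such as an SVM without padding or truncation.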

  11. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    PubMed

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remain a labor-intensive process. Given the complexity, computer-assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult-to-synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready-to-order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  12. Twisting short dsDNA with applied tension

    NASA Astrophysics Data System (ADS)

    Zoli, Marco

    2018-02-01

    The twisting deformation of mechanically stretched DNA molecules is studied by a coarse grained Hamiltonian model incorporating the fundamental interactions that stabilize the double helix and accounting for the radial and angular base pair fluctuations. The latter are all the more important at short length scales in which DNA fragments maintain an intrinsic flexibility. The presented computational method simulates a broad ensemble of possible molecule conformations characterized by a specific average twist and determines the energetically most convenient helical twist by free energy minimization. As this is done for any external load, the method yields the characteristic twist-stretch profile of the molecule and also computes the changes in the macroscopic helix parameters i.e. average diameter and rise distance. It is predicted that short molecules under stretching should first over-twist and then untwist by increasing the external load. Moreover, applying a constant load and simulating a torsional strain which over-twists the helix, it is found that the average helix diameter shrinks while the molecule elongates, in agreement with the experimental trend observed in kilo-base long sequences. The quantitative relation between percent relative elongation and superhelical density at fixed load is derived. The proposed theoretical model and computational method offer a general approach to characterize specific DNA fragments and predict their macroscopic elastic response as a function of the effective potential parameters of the mesoscopic Hamiltonian.

  13. A 21st Century Science, Technology, and Innovation Strategy for America's National Security

    DTIC Science & Technology

    2016-05-01

    areas. Advanced Computing and Communications The exponential growth of the digital economy, driven by ubiquitous computing and communication...weapons-focused R&D, many of the capabilities being developed have significant dual-use potential. Digital connectivity, for instance, brings...scale than traditional recombinant DNA techniques, and to share these designs digitally. Nanotechnology promises the ability to engineer entirely

  14. Student Conceptions about the DNA Structure within a Hierarchical Organizational Level: Improvement by Experiment- and Computer-Based Outreach Learning

    ERIC Educational Resources Information Center

    Langheinrich, Jessica; Bogner, Franz X.

    2015-01-01

    As non-scientific conceptions interfere with learning processes, teachers need both to know about them and to address them in their classrooms. For our study, based on 182 eleventh graders, we analyzed the level of conceptual understanding by implementing the "draw and write" technique during a computer-supported gene technology module.…

  15. Advances in PCR technology.

    PubMed

    Lauerman, Lloyd H

    2004-12-01

    Since the discovery of the polymerase chain reaction (PCR) 20 years ago, an avalanche of scientific publications has reported major developments and changes in specialized equipment, reagents, sample preparation, computer programs and techniques, generated through business, government and university research. The requirement for genetic sequences for primer selection and validation has been greatly facilitated by the development of new sequencing techniques, machines and computer programs. Genetic libraries, such as GenBank, EMBL and DDBJ, continue to accumulate a wealth of genetic sequence information for the development and validation of molecular-based diagnostic procedures concerning human and veterinary disease agents. The mechanization of various aspects of the PCR assay, such as robotics, microfluidics and nanotechnology, has made possible the rapid advancement of new procedures. Real-time PCR, DNA microarrays and DNA chips utilize these newer techniques in conjunction with computers and computer programs. Instruments for hand-held PCR assays are being developed. The PCR and reverse transcription-PCR (RT-PCR) assays have greatly accelerated the speed and accuracy of diagnoses of human and animal disease, especially of the infectious agents that are difficult to isolate or demonstrate. The PCR has made it possible to genetically characterize a microbial isolate inexpensively and rapidly for identification, typing and epidemiological comparison.

  16. COMPETITIVE METAGENOMIC DNA HYBRIDIZATION IDENTIFIES HOST-SPECIFIC GENETIC MARKERS IN HUMAN FECAL MICROBIAL COMMUNITIES

    EPA Science Inventory

    Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, the use of these approaches to discern key genomic differences between natural microbial communities remains prohibitively expensive for mo...

  17. Identification of Bacterial DNA Markers for the Detection of Human and Cattle Fecal Pollution - SLIDES

    EPA Science Inventory

    Technological advances in DNA sequencing and computational biology allow scientists to compare entire microbial genomes. However, the use of these approaches to discern key genomic differences between natural microbial communities remains prohibitively expensive for most laborato...

  18. IDENTIFICATION OF BACTERIAL DNA MARKERS FOR THE DETECTION OF HUMAN AND CATTLE FECAL POLLUTION

    EPA Science Inventory

    Technological advances in DNA sequencing and computational biology allow scientists to compare entire microbial genomes. However, the use of these approaches to discern key genomic differences between natural microbial communities remains prohibitively expensive for most laborato...

  19. Tumor purity and differential methylation in cancer epigenomics.

    PubMed

    Wang, Fayou; Zhang, Naiqian; Wang, Jun; Wu, Hao; Zheng, Xiaoqi

    2016-11-01

    DNA methylation is an epigenetic modification of the DNA molecule that plays a vital role in gene expression regulation. It is not only involved in many basic biological processes, but also considered an important factor for tumorigenesis and other human diseases. Study of DNA methylation has been an active field in cancer epigenomics research. With the advances of high-throughput technologies and the accumulation of enormous amounts of data, method development for analyzing these data has gained tremendous interest in the fields of computational biology and bioinformatics. In this review, we systematically summarize the recent developments of computational methods and software tools in high-throughput methylation data analysis with focus on two aspects: differential methylation analysis and tumor purity estimation in cancer studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  20. DNA-binding specificity prediction with FoldX.

    PubMed

    Nadra, Alejandro D; Serrano, Luis; Alibés, Andreu

    2011-01-01

    With the advent of Synthetic Biology, a field between basic science and applied engineering, new computational tools are needed to help scientists reach their goal, their design, optimizing resources. In this chapter, we present a simple and powerful method to either know the DNA specificity of a wild-type protein or design new specificities by using the protein design algorithm FoldX. The only basic requirement is having a good resolution structure of the complex. Protein-DNA interaction design may aid the development of new parts designed to be orthogonal, decoupled, and precise in its target. Further, it could help to fine-tune the systems in terms of specificity, discrimination, and binding constants. In the age of newly developed devices and invented systems, computer-aided engineering promises to be an invaluable tool. Copyright © 2011 Elsevier Inc. All rights reserved.

  1. A simple method for the computation of first neighbour frequencies of DNAs from CD spectra

    PubMed Central

    Marck, Christian; Guschlbauer, Wilhelm

    1978-01-01

    A procedure for the computation of the first neighbour frequencies of DNAs is presented. This procedure is based on the first neighbour approximation of Gray and Tinoco. We show that the knowledge of all the ten elementary CD signals attached to the ten double stranded first neighbour configurations is not necessary. One can obtain the ten frequencies of an unknown DNA with the use of eight elementary CD signals corresponding to eight linearly independent polymer sequences. These signals can be extracted very simply from any eight or more CD spectra of double stranded DNAs of known frequencies. The ten frequencies of a DNA are obtained by a least-squares fit of its CD spectrum with these elementary signals. One advantage of this procedure is that it does not require linear programming; it can be used with CD data digitized at a large number of wavelengths, thus permitting an accurate resolution of the CD spectra. In favorable cases, the ten frequencies of a DNA (not used as input data) can be determined with an average absolute error < 2%. We have also observed that certain satellite DNAs, those of Drosophila virilis and Callinectes sapidus, have CD spectra compatible with those of DNAs of quasi random sequence; these satellite DNAs should also adopt the B-form in solution. PMID:673843

  2. USE OF COMPETITIVE DNA HYBRIDIZATION TO IDENTIFY DIFFERENCES IN THE GENOMES OF TWO CLOSELY RELATED FECAL INDICATOR BACTERIA

    EPA Science Inventory

    Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, comparisons of closely related bacterial species and individual isolates by whole-genome sequencing approaches remains prohibitively expens...

  3. Computer Center: 2 HyperCard Stacks for Biology.

    ERIC Educational Resources Information Center

    Duhrkopf, Richard, Ed.

    1989-01-01

    Two Hypercard stacks are reviewed including "Amino Acids," created to help students associate amino acid names with their structures, and "DNA Teacher," a tutorial on the structure and function of DNA. Availability, functions, hardware requirements, and general comments on these stacks are provided. (CW)

  4. A Theoretical Study of Phosphoryl Transfers of Tyrosyl-DNA Phosphodiesterase I (Tdp1) and the Possibility of a "Dead-End" Phosphohistidine Intermediate.

    PubMed

    DeYonker, Nathan J; Webster, Charles Edwin

    2015-07-14

    Tyrosyl-DNA phosphodiesterase I (Tdp1) is a DNA repair enzyme conserved across eukaryotes that catalyzes the hydrolysis of the phosphodiester bond between the tyrosine residue of topoisomerase I and the 3'-phosphate of DNA. Atomic level details of the mechanism of Tdp1 are proposed and analyzed using a fully quantum mechanical, geometrically constrained model. The structural basis for the computational model is the vanadate-inhibited crystal structure of human Tdp1 (hTdp1, Protein Data Bank entry 1RFF). Density functional theory computations are used to acquire thermodynamic and kinetic data along the catalytic pathway, including the phosphoryl transfer and subsequent hydrolysis. Located transition states and intermediates along the reaction coordinate suggest an associative phosphoryl transfer mechanism with five-coordinate phosphorane intermediates. Similar to both theoretical and experimental results for phospholipase D, the proposed mechanism for hTdp1 also includes the thermodynamically favorable possibility of a four-coordinate phosphohistidine "dead-end" product.

  5. Modeling Structure-Function Relationships in Synthetic DNA Sequences using Attribute Grammars

    PubMed Central

    Cai, Yizhi; Lux, Matthew W.; Adam, Laura; Peccoud, Jean

    2009-01-01

    Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a method of formally relating how a set of parts relates to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program source code into the computational operations it represents. By associating attributes with parts, modifying the value of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates are dependent upon single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer assisted design applications for synthetic biology. PMID:19816554

  6. Quantum-mechanical analysis of the energetic contributions to π stacking in nucleic acids versus rise, twist, and slide.

    PubMed

    Parker, Trent M; Hohenstein, Edward G; Parrish, Robert M; Hud, Nicholas V; Sherrill, C David

    2013-01-30

    Symmetry-adapted perturbation theory (SAPT) is applied to pairs of hydrogen-bonded nucleobases to obtain the energetic components of base stacking (electrostatic, exchange-repulsion, induction/polarization, and London dispersion interactions) and how they vary as a function of the helical parameters Rise, Twist, and Slide. Computed average values of Rise and Twist agree well with experimental data for B-form DNA from the Nucleic Acids Database, even though the model computations omitted the backbone atoms (suggesting that the backbone in B-form DNA is compatible with having the bases adopt their ideal stacking geometries). London dispersion forces are the most important attractive component in base stacking, followed by electrostatic interactions. At values of Rise typical of those in DNA (3.36 Å), the electrostatic contribution is nearly always attractive, providing further evidence for the importance of charge-penetration effects in π-π interactions (a term neglected in classical force fields). Comparison of the computed stacking energies with those from model complexes made of the "parent" nucleobases purine and 2-pyrimidone indicates that chemical substituents in DNA and RNA account for 20-40% of the base-stacking energy. A lack of correspondence between the SAPT results and experiment for Slide in RNA base-pair steps suggests that the backbone plays a larger role in determining stacking geometries in RNA than in B-form DNA. In comparisons of base-pair steps with thymine versus uracil, the thymine methyl group tends to enhance the strength of the stacking interaction through a combination of dispersion and electrostatic interactions.

  7. Molecular dynamics simulations and applications in computational toxicology and nanotoxicology.

    PubMed

    Selvaraj, Chandrabose; Sakkiah, Sugunadevi; Tong, Weida; Hong, Huixiao

    2018-02-01

    Nanotoxicology studies the toxicity of nanomaterials and has been widely applied in biomedical research to explore toxicity of various biological systems. Investigating biological systems through in vivo and in vitro methods is expensive and time-consuming. Therefore, computational toxicology, a multidisciplinary field that utilizes computational power and algorithms to examine the toxicology of biological systems, has attracted the attention of scientists. Molecular dynamics (MD) simulations of biomolecules such as proteins and DNA are popular for understanding interactions between biological systems and chemicals in computational toxicology. In this paper, we review MD simulation methods, protocols for running MD simulations and their applications in studies of toxicity and nanotechnology. We also briefly summarize some popular software tools for execution of MD simulations. Published by Elsevier Ltd.

  8. Chromatin Computation

    PubMed Central

    Bryant, Barbara

    2012-01-01

    In living cells, DNA is packaged along with protein and RNA into chromatin. Chemical modifications to nucleotides and histone proteins are added, removed and recognized by multi-functional molecular complexes. Here I define a new computational model, in which chromatin modifications are information units that can be written onto a one-dimensional string of nucleosomes, analogous to the symbols written onto cells of a Turing machine tape, and chromatin-modifying complexes are modeled as read-write rules that operate on a finite set of adjacent nucleosomes. I illustrate the use of this “chromatin computer” to solve an instance of the Hamiltonian path problem. I prove that chromatin computers are computationally universal – and therefore more powerful than the logic circuits often used to model transcription factor control of gene expression. Features of biological chromatin provide a rich instruction set for efficient computation of nontrivial algorithms in biological time scales. Modeling chromatin as a computer shifts how we think about chromatin function, suggests new approaches to medical intervention, and lays the groundwork for the engineering of a new class of biological computing machines. PMID:22567109
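
The idea of read-write rules acting on a window of adjacent nucleosomes can be pictured with a small string-rewriting sketch. Everything below is an illustrative assumption rather than the paper's formal model: the mark symbols ("A" for a modified nucleosome, "-" for an unmodified one), the single example rule, and the leftmost-first scheduling are all hypothetical.

```python
def step(tape, rules):
    """Fire one rule at the leftmost position where any rule matches.

    tape:  list of marks, one per nucleosome (hypothetical symbols).
    rules: list of (pattern, replacement) tuples of equal length, each
           reading and rewriting a window of adjacent nucleosomes.
    Returns True if a rule fired, False if no rule matches anywhere.
    """
    for pos in range(len(tape)):
        for pattern, replacement in rules:
            window = tuple(tape[pos:pos + len(pattern)])
            if window == pattern:
                tape[pos:pos + len(pattern)] = replacement
                return True
    return False


def run(tape, rules, max_steps=1000):
    """Apply rules until the tape is stable or max_steps is reached."""
    tape = list(tape)
    for _ in range(max_steps):
        if not step(tape, rules):
            break
    return tape


# Hypothetical rule: a mark "A" spreads onto an unmarked right neighbour.
rules = [(("A", "-"), ("A", "A"))]
print("".join(run("-A---", rules)))  # → -AAAA
```

A richer rule set over more mark types would be needed to encode a problem such as Hamiltonian path; this sketch only shows the local read-then-write mechanic on a one-dimensional tape.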

  9. Second-generation DNA-templated macrocycle libraries for the discovery of bioactive small molecules.

    PubMed

    Usanov, Dmitry L; Chan, Alix I; Maianti, Juan Pablo; Liu, David R

    2018-07-01

    DNA-encoded libraries have emerged as a widely used resource for the discovery of bioactive small molecules, and offer substantial advantages compared with conventional small-molecule libraries. Here, we have developed and streamlined multiple fundamental aspects of DNA-encoded and DNA-templated library synthesis methodology, including computational identification and experimental validation of a 20 × 20 × 20 × 80 set of orthogonal codons, chemical and computational tools for enhancing the structural diversity and drug-likeness of library members, a highly efficient polymerase-mediated template library assembly strategy, and library isolation and purification methods. We have integrated these improved methods to produce a second-generation DNA-templated library of 256,000 small-molecule macrocycles with improved drug-like physical properties. In vitro selection of this library for insulin-degrading enzyme affinity resulted in novel insulin-degrading enzyme inhibitors, including one of unusual potency and novel macrocycle stereochemistry (IC50 = 40 nM). Collectively, these developments enable DNA-templated small-molecule libraries to serve as more powerful, accessible, streamlined and cost-effective tools for bioactive small-molecule discovery.

  10. Assessing the potential of surface-immobilized molecular logic machines for integration with solid state technology.

    PubMed

    Dunn, Katherine E; Trefzer, Martin A; Johnson, Steven; Tyrrell, Andy M

    2016-08-01

    Molecular computation with DNA has great potential for low power, highly parallel information processing in a biological or biochemical context. However, significant challenges remain for the field of DNA computation. New technology is needed to allow multiplexed label-free readout and to enable regulation of molecular state without addition of new DNA strands. These capabilities could be provided by hybrid bioelectronic systems in which biomolecular computing is integrated with conventional electronics through immobilization of DNA machines on the surface of electronic circuitry. Here we present a quantitative experimental analysis of a surface-immobilized OR gate made from DNA and driven by strand displacement. The purpose of our work is to examine the performance of a simple representative surface-immobilized DNA logic machine, to provide valuable information for future work on hybrid bioelectronic systems involving DNA devices. We used a quartz crystal microbalance to examine a DNA monolayer containing approximately 5×10^11 gates cm^-2, with an inter-gate separation of approximately 14 nm, and we found that the ensemble of gates took approximately 6 min to switch. The gates could be switched repeatedly, but the switching efficiency was significantly degraded on the second and subsequent cycles when the binding site for the input was near to the surface. Otherwise, the switching efficiency could be 80% or better, and the power dissipated by the ensemble of gates during switching was approximately 0.1 nW cm^-2, which is orders of magnitude less than the power dissipated during switching of an equivalent array of transistors. We propose an architecture for hybrid DNA-electronic systems in which information can be stored and processed, either in series or in parallel, by a combination of molecular machines and conventional electronics. In this architecture, information can flow freely and in both directions between the solution-phase and the underlying electronics via surface-immobilized DNA machines that provide the interface between the molecular and electronic domains. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  11. Analog synthetic biology.

    PubMed

    Sarpeshkar, R

    2014-03-28

    We analyse the pros and cons of analog versus digital computation in living cells. Our analysis is based on fundamental laws of noise in gene and protein expression, which set limits on the energy, time, space, molecular count and part-count resources needed to compute at a given level of precision. We conclude that analog computation is significantly more efficient in its use of resources than deterministic digital computation even at relatively high levels of precision in the cell. Based on this analysis, we conclude that synthetic biology must use analog, collective analog, probabilistic and hybrid analog-digital computational approaches; otherwise, even relatively simple synthetic computations in cells such as addition will exceed energy and molecular-count budgets. We present schematics for efficiently representing analog DNA-protein computation in cells. Analog electronic flow in subthreshold transistors and analog molecular flux in chemical reactions obey Boltzmann exponential laws of thermodynamics and are described by astoundingly similar logarithmic electrochemical potentials. Therefore, cytomorphic circuits can help to map circuit designs between electronic and biochemical domains. We review recent work that uses positive-feedback linearization circuits to architect wide-dynamic-range logarithmic analog computation in Escherichia coli using three transcription factors, nearly two orders of magnitude more efficient in parts than prior digital implementations.

  12. Parallel computation with molecular-motor-propelled agents in nanofabricated networks.

    PubMed

    Nicolau, Dan V; Lard, Mercy; Korten, Till; van Delft, Falco C M J M; Persson, Malin; Bengtsson, Elina; Månsson, Alf; Diez, Stefan; Linke, Heiner; Nicolau, Dan V

    2016-03-08

    The combinatorial nature of many important mathematical problems, including nondeterministic-polynomial-time (NP)-complete problems, places a severe limitation on the problem size that can be solved with conventional, sequentially operating electronic computers. There have been significant efforts in conceiving parallel-computation approaches in the past, for example: DNA computation, quantum computation, and microfluidics-based computation. However, these approaches have not proven, so far, to be scalable and practical from a fabrication and operational perspective. Here, we report the foundations of an alternative parallel-computation system in which a given combinatorial problem is encoded into a graphical, modular network that is embedded in a nanofabricated planar device. Exploring the network in a parallel fashion using a large number of independent, molecular-motor-propelled agents then solves the mathematical problem. This approach uses orders of magnitude less energy than conventional computers, thus addressing issues related to power consumption and heat dissipation. We provide a proof-of-concept demonstration of such a device by solving, in a parallel fashion, the small instance {2, 5, 9} of the subset sum problem, which is a benchmark NP-complete problem. Finally, we discuss the technical advances necessary to make our system scalable with presently available technology.
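
For reference, the subset sum instance {2, 5, 9} mentioned above is small enough to enumerate directly on a conventional computer. The sketch below is an ordinary sequential enumeration, not a model of the nanofabricated device; it simply lists every total that some subset of the values can produce, which is the set of answers a parallel exploration of the instance would be expected to reveal.

```python
def reachable_sums(values):
    """Return, in sorted order, every total obtainable from a subset."""
    sums = {0}  # the empty subset
    for v in values:
        # Each value is either skipped (keep s) or included (s + v).
        sums = sums | {s + v for s in sums}
    return sorted(sums)


print(reachable_sums([2, 5, 9]))
# → [0, 2, 5, 7, 9, 11, 14, 16]
```

With three values there are 2^3 = 8 subsets, and here all eight totals are distinct. The number of subsets doubles with each added value, which is the combinatorial growth the abstract refers to.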

  13. Coalescence computations for large samples drawn from populations of time-varying sizes

    PubMed Central

    Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek

    2017-01-01

    We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404

  14. MS-CASPT2 study of hole transfer in guanine-indole complexes using the generalized Mulliken-Hush method: effective two-state treatment.

    PubMed

    Butchosa, C; Simon, S; Blancafort, L; Voityuk, A

    2012-07-12

    Because hole transfer from nucleobases to amino acid residues in DNA-protein complexes can prevent oxidative damage of DNA in living cells, computational modeling of the process is of high interest. We performed MS-CASPT2 calculations of several model structures of π-stacked guanine and indole and derived electron-transfer (ET) parameters for these systems using the generalized Mulliken-Hush (GMH) method. We show that the two-state model commonly applied to treat thermal ET between adjacent donor and acceptor is of limited use for the considered systems because of the small gap between the ground and first excited states in the indole radical cation. The ET parameters obtained within the two-state GMH scheme can deviate significantly from the corresponding matrix elements of the two-state effective Hamiltonian based on the GMH treatment of three adiabatic states. The computed values of diabatic energies and electronic couplings provide benchmarks to assess the performance of less sophisticated computational methods.

  15. Student conceptions about the DNA structure within a hierarchical organizational level: Improvement by experiment- and computer-based outreach learning.

    PubMed

    Langheinrich, Jessica; Bogner, Franz X

    2015-01-01

    As non-scientific conceptions interfere with learning processes, teachers need both to know about them and to address them in their classrooms. For our study, based on 182 eleventh graders, we analyzed the level of conceptual understanding by implementing the "draw and write" technique during a computer-supported gene technology module. A specific feature of our study was that participants were given the hierarchical organizational level that they had to draw. We introduced two objective category systems for analyzing drawings and inscriptions. Our results indicated a long- as well as a short-term increase in the level of conceptual understanding and in the number of drawn elements and their grades concerning the DNA structure. Consequently, we regard the "draw and write" technique as a tool for a teacher to get to know students' alternative conceptions. Furthermore, our study points to the modification potential of hands-on and computer-supported learning modules. © 2015 The International Union of Biochemistry and Molecular Biology.

  16. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update

    PubMed Central

    Afgan, Enis; Baker, Dannon; van den Beek, Marius; Blankenberg, Daniel; Bouvier, Dave; Čech, Martin; Chilton, John; Clements, Dave; Coraor, Nate; Eberhard, Carl; Grüning, Björn; Guerler, Aysam; Hillman-Jackson, Jennifer; Von Kuster, Greg; Rasche, Eric; Soranzo, Nicola; Turaga, Nitesh; Taylor, James; Nekrutenko, Anton; Goecks, Jeremy

    2016-01-01

    High-throughput data production technologies, particularly ‘next-generation’ DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non-experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report we highlight recently added features enabling biomedical analyses on a large scale. PMID:27137889

  17. Modelling and Holographic Visualization of Space Radiation-Induced DNA Damage

    NASA Technical Reports Server (NTRS)

    Plante, Ianik

    2017-01-01

    Space radiation is composed of a mixture of ions of different energies. Among these, heavy ions are of particular importance because their health effects are poorly understood. In recent years, software named RITRACKS (Relativistic Ion Tracks) was developed to simulate the detailed radiation track structure, several DNA models and DNA damage. As the DNA structure is complex due to packing, it is difficult to visualize the damage using a regular computer screen.

  18. Hexagonally packed DNA within bacteriophage T7 stabilized by curvature stress.

    PubMed Central

    Odijk, T

    1998-01-01

    A continuum computation is proposed for the bending stress stabilizing DNA that is hexagonally packed within bacteriophage T7. Because the inner radius of the DNA spool is rather small, the stress of the curved DNA genome is strong enough to balance its electrostatic self-repulsion so as to form a stable hexagonal phase. The theory is in accord with the microscopically determined structure of bacteriophage T7 filled with DNA within the experimental margin of error. PMID:9726924

  19. Normal-Mode Analysis of Circular DNA at the Base-Pair Level. 2. Large-Scale Configurational Transformation of a Naturally Curved Molecule.

    PubMed

    Matsumoto, Atsushi; Tobias, Irwin; Olson, Wilma K

    2005-01-01

    Fine structural and energetic details embedded in the DNA base sequence, such as intrinsic curvature, are important to the packaging and processing of the genetic material. Here we investigate the internal dynamics of a 200 bp closed circular molecule with natural curvature using a newly developed normal-mode treatment of DNA in terms of neighboring base-pair "step" parameters. The intrinsic curvature of the DNA is described by a 10 bp repeating pattern of bending distortions at successive base-pair steps. We vary the degree of intrinsic curvature and the superhelical stress on the molecule and consider the normal-mode fluctuations of both the circle and the stable figure-8 configuration under conditions where the energies of the two states are similar. To extract the properties due solely to curvature, we ignore other important features of the double helix, such as the extensibility of the chain, the anisotropy of local bending, and the coupling of step parameters. We compare the computed normal modes of the curved DNA model with the corresponding dynamical features of a covalently closed duplex of the same chain length constructed from naturally straight DNA and with the theoretically predicted dynamical properties of a naturally circular, inextensible elastic rod, i.e., an O-ring. The cyclic molecules with intrinsic curvature are found to be more deformable under superhelical stress than rings formed from naturally straight DNA. As superhelical stress is accumulated in the DNA, the frequency, i.e., energy, of the dominant bending mode decreases in value, and if the imposed stress is sufficiently large, a global configurational rearrangement of the circle to the figure-8 form takes place. We combine energy minimization with normal-mode calculations of the two states to decipher the configurational pathway between the two states. 
    We also describe and make use of a general analytical treatment of the thermal fluctuations of an elastic rod to characterize the motions of the minicircle as a whole from knowledge of the full set of normal modes. The remarkable agreement between computed and theoretically predicted values of the average deviation and dispersion of the writhe of the circular configuration adds to the reliability in the computational approach. Application of the new formalism to the computed modes of the figure-8 provides insights into macromolecular motions which are beyond the scope of current theoretical treatments.

  20. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  1. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications

    PubMed Central

    Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner. PMID:28531174
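
    The multi-level idea described above (a genome-scale design split into segments, each split into synthesizable blocks) can be sketched as follows. This is only an illustration under simple assumptions: fixed-size, non-overlapping pieces and hypothetical function names. It does not reproduce the Genome Partitioner's actual algorithm, which the abstract does not detail.

```python
def partition(length, size):
    """Split the half-open range [0, length) into consecutive chunks of at most `size`."""
    return [(start, min(start + size, length)) for start in range(0, length, size)]

def two_level_partition(length, segment_size=20_000, block_size=1_000):
    """Partition a design into segments, then each segment into blocks.

    The default sizes mirror the 20 kb segment and 1 kb subblocks mentioned
    in the abstract; coordinates are 0-based, half-open base-pair positions.
    """
    plan = []
    for seg_start, seg_end in partition(length, segment_size):
        blocks = [
            (seg_start + b_start, seg_start + b_end)
            for b_start, b_end in partition(seg_end - seg_start, block_size)
        ]
        plan.append({"segment": (seg_start, seg_end), "blocks": blocks})
    return plan

if __name__ == "__main__":
    plan = two_level_partition(783_000)  # a 783 kb design, as in the abstract
    print(len(plan), "segments,", sum(len(p["blocks"]) for p in plan), "blocks")
```

    A real assembly plan would also need overlaps between neighbouring blocks and checks for sequences that are difficult to synthesize; both are left out here.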

  2. An efficient variational method to study the denaturation of DNA induced by superhelical stress

    NASA Astrophysics Data System (ADS)

    Jost, Daniel; Everaers, Ralf

    2010-03-01

    Many fundamental biological processes, like transcription or replication, need the opening of the double-stranded DNA. One common way to control the local denaturation is to impose superhelical stress on the DNA using protein machineries. To describe the superhelical effect for circular molecules, Benham introduced a model where the standard thermodynamic description of base-pairing is coupled with torsional stress energetics. Here, we introduce an efficient mean-field approximation of the Benham model. Our self-consistent solution is reliable and computationally fast, compared to the full treatment of the model. In particular, our formulation allows one to compute the probability of bubble formation for a given length and position along the sequence. Evolution of this probability as a function of the superhelical stress could inform us about the ability of organisms to control the strength of superhelicity acting on their genomes.

  3. Improved Force Fields for Peptide Nucleic Acids with Optimized Backbone Torsion Parameters.

    PubMed

    Jasiński, Maciej; Feig, Michael; Trylska, Joanna

    2018-06-06

    Peptide nucleic acids are promising nucleic acid analogs for antisense therapies as they can form stable duplex and triplex structures with DNA and RNA. Computational studies of PNA-containing duplexes and triplexes are an important component for guiding their design, yet existing force fields have not been well validated and parametrized with modern computational capabilities. We present updated CHARMM and Amber force fields for PNA that greatly improve the stability of simulated PNA-containing duplexes and triplexes in comparison with experimental structures and allow such systems to be studied on microsecond time scales. The force field modifications focus on reparametrized PNA backbone torsion angles to match high-level quantum mechanics reference energies for a model compound. The microsecond simulations of PNA-PNA, PNA-DNA, PNA-RNA, and PNA-DNA-PNA complexes also allowed a comprehensive analysis of hydration and ion interactions with such systems.

  4. CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics

    PubMed Central

    Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David

    2013-01-01

    Background: Today, there are more than a hundred times as many sequenced prokaryotic genomes as were present in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. Results: The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. Conclusion: This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analyses without much training. PMID:23577086

  5. Large-scale molecular dynamics simulation of DNA: implementation and validation of the AMBER98 force field in LAMMPS.

    PubMed

    Grindon, Christina; Harris, Sarah; Evans, Tom; Novik, Keir; Coveney, Peter; Laughton, Charles

    2004-07-15

    Molecular modelling played a central role in the discovery of the structure of DNA by Watson and Crick. Today, such modelling is done on computers: the more powerful these computers are, the more detailed and extensive can be the study of the dynamics of such biological macromolecules. To fully harness the power of modern massively parallel computers, however, we need to develop and deploy algorithms which can exploit the structure of such hardware. The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a scalable molecular dynamics code including long-range Coulomb interactions, which has been specifically designed to function efficiently on parallel platforms. Here we describe the implementation of the AMBER98 force field in LAMMPS and its validation for molecular dynamics investigations of DNA structure and flexibility against the benchmark of results obtained with the long-established code AMBER6 (Assisted Model Building with Energy Refinement, version 6). Extended molecular dynamics simulations on the hydrated DNA dodecamer d(CTTTTGCAAAAG)(2), which has previously been the subject of extensive dynamical analysis using AMBER6, show that it is possible to obtain excellent agreement in terms of static, dynamic and thermodynamic parameters between AMBER6 and LAMMPS. In comparison with AMBER6, LAMMPS shows greatly improved scalability in massively parallel environments, opening up the possibility of efficient simulations of order-of-magnitude larger systems and/or for order-of-magnitude greater simulation times.

  6. Transformation of personal computers and mobile phones into genetic diagnostic systems.

    PubMed

    Walker, Faye M; Ahmad, Kareem M; Eisenstein, Michael; Soh, H Tom

    2014-09-16

    Molecular diagnostics based on the polymerase chain reaction (PCR) offer rapid and sensitive means for detecting infectious disease, but prohibitive costs have impeded their use in resource-limited settings where such diseases are endemic. In this work, we report an innovative method for transforming a desktop computer and a mobile camera phone--devices that have become readily accessible in developing countries--into a highly sensitive DNA detection system. This transformation was achieved by converting a desktop computer into a de facto thermal cycler with software that controls the temperature of the central processing unit (CPU), allowing for highly efficient PCR. Next, we reconfigured the mobile phone into a fluorescence imager by adding a low-cost filter, which enabled us to quantitatively measure the resulting PCR amplicons. Our system is highly sensitive, achieving quantitative detection of as little as 9.6 attograms of target DNA, and we show that its performance is comparable to advanced laboratory instruments at approximately 1/500th of the cost. Finally, in order to demonstrate clinical utility, we have used our platform for the successful detection of genomic DNA from the parasite that causes Chagas disease, Trypanosoma cruzi, directly in whole, unprocessed human blood at concentrations 4-fold below the clinical titer of the parasite.

  7. Transformation of Personal Computers and Mobile Phones into Genetic Diagnostic Systems

    PubMed Central

    2014-01-01

    Molecular diagnostics based on the polymerase chain reaction (PCR) offer rapid and sensitive means for detecting infectious disease, but prohibitive costs have impeded their use in resource-limited settings where such diseases are endemic. In this work, we report an innovative method for transforming a desktop computer and a mobile camera phone—devices that have become readily accessible in developing countries—into a highly sensitive DNA detection system. This transformation was achieved by converting a desktop computer into a de facto thermal cycler with software that controls the temperature of the central processing unit (CPU), allowing for highly efficient PCR. Next, we reconfigured the mobile phone into a fluorescence imager by adding a low-cost filter, which enabled us to quantitatively measure the resulting PCR amplicons. Our system is highly sensitive, achieving quantitative detection of as little as 9.6 attograms of target DNA, and we show that its performance is comparable to advanced laboratory instruments at approximately 1/500th of the cost. Finally, in order to demonstrate clinical utility, we have used our platform for the successful detection of genomic DNA from the parasite that causes Chagas disease, Trypanosoma cruzi, directly in whole, unprocessed human blood at concentrations 4-fold below the clinical titer of the parasite. PMID:25223929

  8. Boolean Logic Tree of Label-Free Dual-Signal Electrochemical Aptasensor System for Biosensing, Three-State Logic Computation, and Keypad Lock Security Operation.

    PubMed

    Lu, Jiao Yang; Zhang, Xin Xing; Huang, Wei Tao; Zhu, Qiu Yan; Ding, Xue Zhi; Xia, Li Qiu; Luo, Hong Qun; Li, Nian Bing

    2017-09-19

    The most serious and yet unsolved problems of molecular logic computing consist in how to connect molecular events in complex systems into a usable device with specific functions and how to selectively control branchy logic processes from the cascading logic systems. This report demonstrates that a Boolean logic tree is utilized to organize and connect "plug and play" chemical events (DNA, nanomaterials, organic dye, biomolecule, and denaturant) for developing the dual-signal electrochemical evolution aptasensor system with good resettability for amplification detection of thrombin, controllable and selectable three-state logic computation, and keypad lock security operation. The aptasensor system combines the merits of DNA-functionalized nanoamplification architecture and simple dual-signal electroactive dye brilliant cresyl blue for sensitive and selective detection of thrombin with a wide linear response range of 0.02-100 nM and a detection limit of 1.92 pM. By using these aforementioned chemical events as inputs and the differential pulse voltammetry current changes at different voltages as dual outputs, a resettable three-input biomolecular keypad lock based on sequential logic is established. Moreover, the first example of controllable and selectable three-state molecular logic computation with active-high and active-low logic functions can be implemented and allows the output ports to assume a high-impedance or nothing (Z) state in addition to the 0 and 1 logic levels, effectively controlling subsequent branchy logic computation processes. Our approach is helpful in developing the advanced controllable and selectable logic computing and sensing system in large-scale integration circuits for application in biomedical engineering, intelligent sensing, and control.
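
    The three-state logic described above, with a Z state in addition to 0 and 1 and with active-high or active-low control, behaves like a conventional tri-state buffer. The sketch below shows that behaviour in ordinary software terms. It is a generic illustration with hypothetical names, not a model of the electrochemical aptasensor itself.

```python
Z = "Z"  # placeholder for the "nothing" (disconnected) output state

def tristate_buffer(data, enable, active_high=True):
    """Return the data bit when the gate is enabled, otherwise Z.

    active_high=True  : the gate passes data when enable is 1.
    active_high=False : the gate passes data when enable is 0.
    """
    enabled = bool(enable) if active_high else not bool(enable)
    if not enabled:
        return Z
    return 1 if data else 0

if __name__ == "__main__":
    for data in (0, 1):
        for enable in (0, 1):
            print(data, enable,
                  tristate_buffer(data, enable),
                  tristate_buffer(data, enable, active_high=False))
```

    Because a disabled gate contributes nothing downstream, the enable input can select which branch of a cascaded logic tree is active.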

  9. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
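
    The original program was written in BASIC and is not reproduced in the abstract. The following Python sketch shows what such an exercise might look like: generating a random sequence, then deriving the complementary strand, base composition and triplet codon counts. All function names are hypothetical.

```python
import random
from collections import Counter

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def random_sequence(length, alphabet="ACGT", seed=None):
    """Return a random sequence; pass alphabet='ACGU' for RNA."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))

def complement(dna):
    """Base-by-base complement of a DNA strand (not reversed)."""
    return "".join(DNA_COMPLEMENT[base] for base in dna)

def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}

def codon_counts(seq):
    """Count non-overlapping triplets in reading frame 0, ignoring leftover bases."""
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))

if __name__ == "__main__":
    dna = random_sequence(30, seed=1)
    print(dna)
    print(complement(dna))
    print(base_composition(dna))
    print(codon_counts(dna).most_common(3))
```

    Fixing the seed makes a generated sequence reproducible, which is convenient when every student should work on the same example.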

  10. Transitional circuitry for studying the properties of DNA

    NASA Astrophysics Data System (ADS)

    Trubochkina, N.

    2018-01-01

    The article is devoted to a new view of the structure of DNA as an intellectual scheme possessing the properties of logic and memory. The theory of transient circuitry, developed by the author for optimal computer circuits, revealed an amazing structural similarity between mathematical models of transition silicon elements and logic and memory circuits of solid state transient circuitry and atomic models of parts of DNA.

  11. 1-Amino-4-hydroxy-9,10-anthraquinone - An analogue of anthracycline anticancer drugs, interacts with DNA and induces apoptosis in human MDA-MB-231 breast adinocarcinoma cells: Evaluation of structure-activity relationship using computational, spectroscopic and biochemical studies.

    PubMed

    Mondal, Palash; Roy, Sanjay; Loganathan, Gayathri; Mandal, Bitapi; Dharumadurai, Dhanasekaran; Akbarsha, Mohammad A; Sengupta, Partha Sarathi; Chattopadhyay, Shouvik; Guin, Partha Sarathi

    2015-12-01

    The X-ray diffraction and spectroscopic properties of 1-amino-4-hydroxy-9,10-anthraquinone (1-AHAQ), a simple analogue of anthracycline chemotherapeutic drugs, were studied by adopting experimental and computational methods. The optimized geometrical parameters obtained from computational methods were compared with the results of X-ray diffraction analysis and the two were found to be in reasonably good agreement. X-ray diffraction study, Density Functional Theory (DFT) and natural bond orbital (NBO) analysis indicated two types of hydrogen bonds in the molecule. The IR spectra of 1-AHAQ were studied by Vibrational Energy Distribution Analysis (VEDA) using potential energy distribution (PED) analysis. The electronic spectra were studied by TDDFT computation and compared with the experimental results. Experimental and theoretical results corroborated each other to a fair extent. To understand the biological efficacy of 1-AHAQ, it was allowed to interact with calf thymus DNA and human breast adenocarcinoma MDA-MB-231 cells. It was found that the molecule induces apoptosis in these adenocarcinoma cells, with little, if any, cytotoxic effect in HBL-100 normal breast epithelial cells.

  12. Biomedical Requirements for High Productivity Computing Systems

    DTIC Science & Technology

    2005-04-01

    server at http://www.ncbi.nlm.nih.gov/BLAST/. There are many variants of BLAST, including: 1. BLASTN - Compares a DNA query to a DNA database. Searches ...database (3 reading frames from each strand of the DNA) searching. 4. TBLASTN - Compares a protein query to a DNA database, in the 6 possible...the molecular during this phase. After eliminating molecules that could not match the query, an atom-by-atom search for the molecules is conducted

  13. DNA Cryptography and Deep Learning using Genetic Algorithm with NW algorithm for Key Generation.

    PubMed

    Kalsi, Shruti; Kaur, Harleen; Chang, Victor

    2017-12-05

    Cryptography is not only a science of applying complex mathematics and logic to design strong methods to hide data, called encryption, but also to retrieve the original data, called decryption. The purpose of cryptography is to transmit a message between a sender and receiver such that an eavesdropper is unable to comprehend it. To accomplish this, we need not only a strong algorithm, but also a strong key and a strong concept for the encryption and decryption process. We have introduced a concept of DNA Deep Learning Cryptography, which is defined as a technique of concealing data in terms of DNA sequence and deep learning. In the cryptographic technique, each letter of the alphabet is converted into a different combination of the four bases, namely Adenine (A), Cytosine (C), Guanine (G) and Thymine (T), which make up the human deoxyribonucleic acid (DNA). Actual implementations with DNA do not go beyond the laboratory level and are expensive. To bring DNA computing to a digital level, easy and effective algorithms are proposed in this paper. In the proposed work we have introduced, firstly, a method and its implementation for key generation based on the theory of natural selection using a Genetic Algorithm with the Needleman-Wunsch (NW) algorithm and, secondly, a method for implementation of encryption and decryption based on DNA computing using the biological operations Transcription, Translation, DNA Sequencing and Deep Learning.
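
    The step in which each letter is converted into a combination of the four bases can be illustrated with a simple two-bits-per-base mapping. The mapping below (00=A, 01=C, 10=G, 11=T) and the function names are assumptions chosen for illustration; the abstract does not specify the paper's actual table, and this encoding alone provides no secrecy without a key.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T

def text_to_dna(text):
    """Encode UTF-8 text as a base string, four bases per byte."""
    out = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_text(dna):
    """Decode a base string produced by text_to_dna."""
    if len(dna) % 4:
        raise ValueError("length must be a multiple of 4 bases")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")

if __name__ == "__main__":
    encoded = text_to_dna("DNA")
    print(encoded, dna_to_text(encoded))
```

    In a scheme like the one the abstract outlines, a generated key would determine or permute this mapping, so that the base string cannot be decoded without it.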

  14. The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic genomes.

    PubMed

    Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele

    2016-03-15

    Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to closing this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.
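
    The abstract does not state its three formulas. As a generic example of an information-theoretic measure of how complex a sequence is, the sketch below computes the empirical Shannon entropy of overlapping k-mers. The choice of measure and the function name are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter

def kmer_entropy(seq, k=2):
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq."""
    n = len(seq) - k + 1
    if n <= 0:
        raise ValueError("sequence shorter than k")
    counts = Counter(seq[i:i + k] for i in range(n))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

if __name__ == "__main__":
    for s in ("AAAAAAAAAAAA", "ACACACACACAC", "ACGTTGCAAGCT"):
        print(s, round(kmer_entropy(s, k=2), 3))
```

    Low values indicate repetitive subsequences and high values indicate diverse ones; computing the measure in a sliding window gives a per-position profile that can be compared with a nucleosome occupancy track.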

  15. Large-scale symmetry-adapted perturbation theory computations via density fitting and Laplace transformation techniques: investigating the fundamental forces of DNA-intercalator interactions.

    PubMed

    Hohenstein, Edward G; Parrish, Robert M; Sherrill, C David; Turney, Justin M; Schaefer, Henry F

    2011-11-07

    Symmetry-adapted perturbation theory (SAPT) provides a means of probing the fundamental nature of intermolecular interactions. Low-orders of SAPT (here, SAPT0) are especially attractive since they provide qualitative (sometimes quantitative) results while remaining tractable for large systems. The application of density fitting and Laplace transformation techniques to SAPT0 can significantly reduce the expense associated with these computations and make even larger systems accessible. We present new factorizations of the SAPT0 equations with density-fitted two-electron integrals and the first application of Laplace transformations of energy denominators to SAPT. The improved scalability of the DF-SAPT0 implementation allows it to be applied to systems with more than 200 atoms and 2800 basis functions. The Laplace-transformed energy denominators are compared to analogous partial Cholesky decompositions of the energy denominator tensor. Application of our new DF-SAPT0 program to the intercalation of DNA by proflavine has allowed us to determine the nature of the proflavine-DNA interaction. Overall, the proflavine-DNA interaction contains important contributions from both electrostatics and dispersion. The energetics of the intercalator interaction are dominated by the stacking interactions (two-thirds of the total), but contain important contributions from the intercalator-backbone interactions. It is hypothesized that the geometry of the complex will be determined by the interactions of the intercalator with the backbone, because by shifting toward one side of the backbone, the intercalator can form two long hydrogen-bonding type interactions. The long-range interactions between the intercalator and the next-nearest base pairs appear to be negligible, justifying the use of truncated DNA models in computational studies of intercalation interaction energies.

  16. Large-scale symmetry-adapted perturbation theory computations via density fitting and Laplace transformation techniques: Investigating the fundamental forces of DNA-intercalator interactions

    NASA Astrophysics Data System (ADS)

    Hohenstein, Edward G.; Parrish, Robert M.; Sherrill, C. David; Turney, Justin M.; Schaefer, Henry F.

    2011-11-01

    Symmetry-adapted perturbation theory (SAPT) provides a means of probing the fundamental nature of intermolecular interactions. Low-orders of SAPT (here, SAPT0) are especially attractive since they provide qualitative (sometimes quantitative) results while remaining tractable for large systems. The application of density fitting and Laplace transformation techniques to SAPT0 can significantly reduce the expense associated with these computations and make even larger systems accessible. We present new factorizations of the SAPT0 equations with density-fitted two-electron integrals and the first application of Laplace transformations of energy denominators to SAPT. The improved scalability of the DF-SAPT0 implementation allows it to be applied to systems with more than 200 atoms and 2800 basis functions. The Laplace-transformed energy denominators are compared to analogous partial Cholesky decompositions of the energy denominator tensor. Application of our new DF-SAPT0 program to the intercalation of DNA by proflavine has allowed us to determine the nature of the proflavine-DNA interaction. Overall, the proflavine-DNA interaction contains important contributions from both electrostatics and dispersion. The energetics of the intercalator interaction are dominated by the stacking interactions (two-thirds of the total), but contain important contributions from the intercalator-backbone interactions. It is hypothesized that the geometry of the complex will be determined by the interactions of the intercalator with the backbone, because by shifting toward one side of the backbone, the intercalator can form two long hydrogen-bonding type interactions. The long-range interactions between the intercalator and the next-nearest base pairs appear to be negligible, justifying the use of truncated DNA models in computational studies of intercalation interaction energies.

  17. Comparison of computational methods to model DNA minor groove binders.

    PubMed

    Srivastava, Hemant Kumar; Chourasia, Mukesh; Kumar, Devesh; Sastry, G Narahari

    2011-03-28

    There has been a profound interest in designing small molecules that interact in sequence-selective fashion with DNA minor grooves. However, most in silico approaches have not been parametrized for DNA ligand interaction. In this regard, a systematic computational analysis of 57 available PDB structures of noncovalent DNA minor groove binders has been undertaken. The study starts with a rigorous benchmarking of GOLD, GLIDE, CDOCKER, and AUTODOCK docking protocols followed by developing QSSR models and finally molecular dynamics simulations. In GOLD and GLIDE, the orientation of the best score pose is closer to the lowest rmsd pose, and the deviation in the conformation of various poses is also smaller compared to other docking protocols. Efficient QSSR models were developed with constitutional, topological, and quantum chemical descriptors on the basis of B3LYP/6-31G* optimized geometries, and with this ΔT(m) values of 46 ligands were predicted. Molecular dynamics simulations of the 14 DNA-ligand complexes with Amber 8.0 show that the complexes are stable in aqueous conditions and do not undergo noticeable fluctuations during the 5 ns production run, with respect to their initial placement in the minor groove region.

  18. Universal computing by DNA origami robots in a living animal

    PubMed Central

    Levner, Daniel; Ittah, Shmulik; Abu-Horowitz, Almogit; Bachelet, Ido

    2014-01-01

    Biological systems are collections of discrete molecular objects that move around and collide with each other. Cells carry out elaborate processes by precisely controlling these collisions, but developing artificial machines that can interface with and control such interactions remains a significant challenge. DNA is a natural substrate for computing and has been used to implement a diverse set of mathematical problems [1-3], logic circuits [4-6] and robotics [7-9]. The molecule also naturally interfaces with living systems, and different forms of DNA-based biocomputing have previously been demonstrated [10-13]. Here we show that DNA origami [14-16] can be used to fabricate nanoscale robots that are capable of dynamically interacting with each other [17-18] in a living animal. The interactions generate logical outputs, which are relayed to switch molecular payloads on or off. As a proof-of-principle, we use the system to create architectures that emulate various logic gates (AND, OR, XOR, NAND, NOT, CNOT, and a half adder). Following an ex vivo prototyping phase, we successfully employed the DNA origami robots in living cockroaches (Blaberus discoidalis) to control a molecule that targets the cells of the animal. PMID:24705510

  19. Electron interaction with phosphate cytidine oligomer dCpdC: base-centered radical anions and their electronic spectra.

    PubMed

    Gu, Jiande; Wang, Jing; Leszczynski, Jerzy

    2014-01-30

    Computational chemistry approach was applied to explore the nature of electron attachment to cytosine-rich DNA single strands. An oligomer dinucleoside phosphate deoxycytidylyl-3',5'-deoxycytidine (dCpdC) was selected as a model system for investigations by density functional theory. Electron distribution patterns for the radical anions of dCpdC in aqueous solution were explored. The excess electron may reside on the nucleobase at the 5' position (dC(•-)pdC) or at the 3' position (dCpdC(•-)). From comparison with electron attachment to the cytosine related DNA fragments, the electron affinity for the formation of the cytosine-centered radical anion in DNA is estimated to be around 2.2 eV. Electron attachment to cytosine sites in DNA single strands might cause perturbations of local structural characteristics. Visible absorption spectroscopy may be applied to validate computational results and determine experimentally the existence of the base-centered radical anion. The time-dependent DFT study shows the absorption around 550-600 nm for the cytosine-centered radical anions of DNA oligomers. This indicates that if such species are detected experimentally they would be characterized by a distinctive color.

  20. Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.

    PubMed

    Memon, Farhat N; Owen, Anne M; Sanchez-Graillet, Olivia; Upton, Graham J G; Harrison, Andrew P

    2010-01-15

    A tetramer quadruplex structure is formed by four parallel strands of DNA/RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to only pay a cloud computing provider for what is used. Moreover, in addition to its financial efficiency, cloud computing is an ecologically friendly technology that enables efficient data-sharing, and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays.

  1. A 3D puzzle approach to building protein-DNA structures.

    PubMed

    Hinton, Deborah M

    2017-03-15

    Despite recent advances in structural analysis, it is still challenging to obtain a high-resolution structure for a complex of RNA polymerase, transcriptional factors, and DNA. However, using biochemical constraints, 3D printed models of available structures, and computer modeling, one can build biologically relevant models of such supramolecular complexes.

  2. DMINDA: an integrated web server for DNA motif identification and analyses

    PubMed Central

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-01-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419

  3. Subacute Low Dose Nerve Agent Exposure Causes DNA Fragmentation in Guinea Pig Leukocytes

    DTIC Science & Technology

    2005-10-01

    1 SUBACUTE LOW DOSE NERVE AGENT EXPOSURE CAUSES DNA FRAGMENTATION IN GUINEA PIG LEUKOCYTES. Jitendra R. Dave1, John R. Moffett1, Sally M...DNA fragmentation in blood leukocytes from guinea pigs by ‘Comet’ assay after exposure to soman at doses ranging from 0.1LD50 to 0.4 LD50, once per...computer. Data obtained for exposure to soman demonstrated significant increases in DNA fragmentation in circulating leukocytes in CWNA treated guinea pigs as

  4. Decoding the conformation-linked functional properties of nucleic acids by the use of computational tools.

    PubMed

    Iacovelli, Federico; Falconi, Mattia

    2015-09-01

    DNA and RNA are large and flexible polymers selected by nature to transmit information. The most common DNA three-dimensional structure is represented by the double helix, but this biopolymer is extremely flexible and polymorphic, and can easily change its conformation to adapt to different interactions and purposes. DNA can also adopt singular topologies, giving rise, for instance, to supercoils, formed because of the limited free rotation of the DNA domain flanking a replication or transcription complex. Our understanding of the importance of these unusual or transient structures is growing, as recent studies of DNA topology, supercoiling, knotting and linking have shown that the geometric changes can drive, or strongly influence, the interactions between protein and DNA, so altering its own metabolism. On the other hand, the unique self-recognition properties of DNA, determined by the strict Watson-Crick rules of base pairing, make this material ideal for the creation of self-assembling, predesigned nanostructures. The construction of such structures is one of the main focuses of the thriving area of DNA nanotechnology, where several assembly strategies have been employed to build increasingly complex DNA nanostructures. DNA nanodevices can have direct applications in biomedicine, but also in the materials science field, requiring the immersion of DNA in an environment far from the physiological one. Crucial help in the understanding and planning of natural and artificial nanostructures is given by modern computer simulation techniques, which are able to provide a reliable structural and dynamic description of nucleic acids. © 2015 FEBS.

  5. A Guide to Computational Tools and Design Strategies for Genome Editing Experiments in Zebrafish Using CRISPR/Cas9.

    PubMed

    Prykhozhij, Sergey V; Rajan, Vinothkumar; Berman, Jason N

    2016-02-01

    The development of clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 technology for mainstream biotechnological use based on its discovery as an adaptive immune mechanism in bacteria has dramatically improved the ability of molecular biologists to modify genomes of model organisms. The zebrafish is highly amenable to applications of CRISPR/Cas9 for mutation generation and a variety of DNA insertions. Cas9 protein in complex with a guide RNA molecule recognizes where to cut the homologous DNA based on a short stretch of DNA termed the protospacer-adjacent motif (PAM). Rapid and efficient identification of target sites immediately preceding PAM sites, quantification of genomic occurrences of similar (off target) sites and predictions of cutting efficiency are some of the features where computational tools play critical roles in CRISPR/Cas9 applications. Given the rapid advent and development of this technology, it can be a challenge for researchers to remain up to date with all of the important technological developments in this field. We have contributed to the armamentarium of CRISPR/Cas9 bioinformatics tools and trained other researchers in the use of appropriate computational programs to develop suitable experimental strategies. Here we provide an in-depth guide on how to use CRISPR/Cas9 and other relevant computational tools at each step of a host of genome editing experimental strategies. We also provide detailed conceptual outlines of the steps involved in the design and execution of CRISPR/Cas9-based experimental strategies, such as generation of frameshift mutations, larger chromosomal deletions and inversions, homology-independent insertion of gene cassettes and homology-based knock-in of defined point mutations and larger gene constructs.

  6. Computational Identification and Functional Predictions of Long Noncoding RNA in Zea mays

    PubMed Central

    Boerner, Susan; McGinnis, Karen M.

    2012-01-01

    Background Computational analysis of cDNA sequences from multiple organisms suggests that a large portion of transcribed DNA does not code for a functional protein. In mammals, noncoding transcription is abundant, and often results in functional RNA molecules that do not appear to encode proteins. Many long noncoding RNAs (lncRNAs) appear to have epigenetic regulatory function in humans, including HOTAIR and XIST. While epigenetic gene regulation is clearly an essential mechanism in plants, relatively little is known about the presence or function of lncRNAs in plants. Methodology/Principal Findings To explore the connection between lncRNA and epigenetic regulation of gene expression in plants, a computational pipeline using the programming language Python has been developed and applied to maize full length cDNA sequences to identify, classify, and localize potential lncRNAs. The pipeline was used in parallel with an SVM tool for identifying ncRNAs to identify the maximal number of ncRNAs in the dataset. Although the available library of sequences was small and potentially biased toward protein coding transcripts, 15% of the sequences were predicted to be noncoding. Approximately 60% of these sequences appear to act as precursors for small RNA molecules and may function to regulate gene expression via a small RNA dependent mechanism. ncRNAs were predicted to originate from both genic and intergenic loci. Of the lncRNAs that originated from genic loci, ∼20% were antisense to the host gene loci. Conclusions/Significance Consistent with similar studies in other organisms, noncoding transcription appears to be widespread in the maize genome. Computational predictions indicate that maize lncRNAs may function to regulate expression of other genes through multiple RNA mediated mechanisms. PMID:22916204

  7. A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences

    NASA Technical Reports Server (NTRS)

    Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.

    1986-01-01

    The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.

  8. Encoding of low-quality DNA profiles as genotype probability matrices for improved profile comparisons, relatedness evaluation and database searches.

    PubMed

    Ryan, K; Williams, D Gareth; Balding, David J

    2016-11-01

    Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source licence, to calculate LRs using the method presented in this paper. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  9. Private genome analysis through homomorphic encryption

    PubMed Central

    2015-01-01

    Background The rapid development of genome sequencing technology allows researchers to access large genome datasets. However, outsourcing the data processing to the cloud poses high risks for personal privacy. The aim of this paper is to give a practical solution for this problem using homomorphic encryption. In our approach, all the computations can be performed in an untrusted cloud without requiring the decryption key or any interaction with the data owner, which preserves the privacy of genome data. Methods We present evaluation algorithms for secure computation of the minor allele frequencies and χ² statistic in a genome-wide association studies setting. We also describe how to privately compute the Hamming distance and approximate Edit distance between encrypted DNA sequences. Finally, we compare performance details of using two practical homomorphic encryption schemes - the BGV scheme by Gentry, Halevi and Smart and the YASHE scheme by Bos, Lauter, Loftus and Naehrig. Results The approach with the YASHE scheme analyzes data from 400 people within about 2 seconds and picks a variant associated with disease from 311 spots. For another task, using the BGV scheme, it took about 65 seconds to securely compute the approximate Edit distance for DNA sequences of size 5K and figure out the differences between them. Conclusions The performance numbers for BGV are better than YASHE when homomorphically evaluating deep circuits (like the Hamming distance algorithm or approximate Edit distance algorithm). On the other hand, it is more efficient to use the YASHE scheme for a low-degree computation, such as minor allele frequencies or χ² test statistic in a case-control study. PMID:26733152
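
    For orientation, the quantities named in this abstract can be sketched on unencrypted data. The Python below is only a plaintext illustration and does not use homomorphic encryption, BGV, or YASHE; the function names, the 0/1/2 genotype coding, and the 2x2 allelic chi-square formula are assumptions chosen for the example, not the paper's evaluation algorithms.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def minor_allele_frequency(genotypes: list[int]) -> float:
    """MAF from diploid genotypes coded as alternate-allele counts (0, 1 or 2).

    The coding is an assumption made for this sketch.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1.0 - alt_freq)


def allelic_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> float:
    """Pearson chi-square for a 2x2 table of allele counts (cases vs controls)."""
    total = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        return 0.0  # a zero margin carries no association signal
    return total * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
```

    In an encrypted setting each of these sums and products would have to be expressed as a circuit over ciphertexts; the sketch only shows what is being computed.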

  10. Toward unification of taxonomy databases in a distributed computer environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi

    1994-12-31

    All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existent research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existent taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existent taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases on a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.

  11. Genome wide approaches to identify protein-DNA interactions.

    PubMed

    Ma, Tao; Ye, Zhenqing; Wang, Liguo

    2018-05-29

    Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experimental procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to a single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms, so each will inherently include its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore needs careful consideration. Moreover, most programs have only a command-line interface, so their installation and usage require basic computational expertise in Unix/Linux. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  12. Utilizing Molecular Dynamics' Multipotent Methodologies to Measure Microscopic Motions of DNA Molecules: A Magniloquent Manuscript On DNA's Means and Mannerisms

    NASA Astrophysics Data System (ADS)

    Kingsland, Addie

    DNA is an amazing molecule which is the basic template for all genetics. It is the primary molecule for storing biological information, and has many applications in nanotechnology. Double-stranded DNA may contain mismatched base pairs beyond the Watson-Crick pairs guanine-cytosine and adenine-thymine. To date, no one has found a physical property of base pair mismatches which describes the behavior of naturally occurring mismatch repair enzymes. Many materials properties of DNA are also unknown, for instance, when pulling DNA in different configurations, different energy differences are observed with no obvious reason why. DNA mismatches also affect their local environment, for instance changing the quantum yield of nearby azobenzene moieties. We utilize molecular dynamics computer simulations to study the structure and dynamics for both matched and mismatched base pairs, within both biological and materials contexts, and in both equilibrium and biased dynamics. We show that mismatched pairs shift further in the plane normal to the DNA strand and are more likely to exhibit non-canonical structures, including the e-motif. Base pair mismatches alter their local environment, affecting the trans- to cis- photoisomerization quantum yield of azobenzene, as well as increasing the likelihood of observing the e-motif. We also show that by using simulated data, we can give new insights on theoretical models to calculate the energetics of pulling DNA strands apart. These results, all relatively inexpensive on modern computer hardware, can help guide the design of DNA-based nanotechnologies, as well as give new insights into the functioning of mismatch repair systems in cancer prevention.

  13. Orchestration of Molecular Information through Higher Order Chemical Recognition

    NASA Astrophysics Data System (ADS)

    Frezza, Brian M.

    Broadly defined, higher order chemical recognition is the process whereby discrete chemical building blocks capable of specifically binding to cognate moieties are covalently linked into oligomeric chains. These chains, or sequences, are then able to recognize and bind to their cognate sequences with a high degree of cooperativity. Principally speaking, DNA and RNA are the most readily obtained examples of this chemical phenomenon, and function via Watson-Crick cognate pairing: guanine pairs with cytosine and adenine with thymine (DNA) or uracil (RNA), in an anti-parallel manner. While the theoretical principles, techniques, and equations derived herein apply generally to any higher-order chemical recognition system, in practice we utilize DNA oligomers as a model-building material to experimentally investigate and validate our hypotheses. Historically, general purpose information processing has been a task limited to semiconductor electronics. Molecular computing on the other hand has been limited to ad hoc approaches designed to solve highly specific and unique computation problems, often involving components or techniques that cannot be applied generally in a manner suitable for precise and predictable engineering. Herein, we provide a fundamental framework for harnessing high-order recognition in a modular and programmable fashion to synthesize molecular information process networks of arbitrary construction and complexity. This document provides a solid foundation for routinely embedding computational capability into chemical and biological systems where semiconductor electronics are unsuitable for practical application.

  14. Charge Structure and Counterion Distribution in Hexagonal DNA Liquid Crystal

    PubMed Central

    Dai, Liang; Mu, Yuguang; Nordenskiöld, Lars; Lapp, Alain; van der Maarel, Johan R. C.

    2007-01-01

    A hexagonal liquid crystal of DNA fragments (double-stranded, 150 basepairs) with tetramethylammonium (TMA) counterions was investigated with small angle neutron scattering (SANS). We obtained the structure factors pertaining to the DNA and counterion density correlations with contrast matching in the water. Molecular dynamics (MD) computer simulation of a hexagonal assembly of nine DNA molecules showed that the inter-DNA distance fluctuates with a correlation time around 2 ns and a standard deviation of 8.5% of the interaxial spacing. The MD simulation also showed a minimal effect of the fluctuations in inter-DNA distance on the radial counterion density profile and significant penetration of the grooves by TMA. The radial density profile of the counterions was also obtained from a Monte Carlo (MC) computer simulation of a hexagonal array of charged rods with fixed interaxial spacing. Strong ordering of the counterions between the DNA molecules and the absence of charge fluctuations at longer wavelengths was shown by the SANS number and charge structure factors. The DNA-counterion and counterion structure factors are interpreted with the correlation functions derived from the Poisson-Boltzmann equation, MD, and MC simulation. Best agreement is observed between the experimental structure factors and the prediction based on the Poisson-Boltzmann equation and/or MC simulation. The SANS results show that TMA is too large to penetrate the grooves to a significant extent, in contrast to what is shown by MD simulation. PMID:17098791

  15. Towards computational improvement of DNA database indexing and short DNA query searching.

    PubMed

    Stojanov, Done; Koceski, Sašo; Mileva, Aleksandra; Koceska, Nataša; Bande, Cveta Martinovska

    2014-09-03

    In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning, employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and searching approach, identifying all query hits in the database, without having to examine all entries in the indexed data structure, limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption that there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at starting positions [Formula: see text] are not reported, if the database is searched against a query shorter than [Formula: see text] nucleotides, such that [Formula: see text] is the length of the DNA database words being mapped and [Formula: see text] is the length of the query. A solution of this drawback is also presented.
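
    The general idea of indexing a DNA database by fixed-length words and then looking up a query can be sketched as follows. This is a minimal Python illustration that uses an ordinary dictionary in place of the paper's mapping function and indexing equation, which the abstract does not give; the names `build_kmer_index` and `find_exact_hits` are hypothetical. Queries shorter than the indexed word length are rejected here rather than handled, which is the case the abstract identifies as a drawback of the earlier method.

```python
from collections import defaultdict


def build_kmer_index(database: str, k: int) -> dict[str, list[int]]:
    """Map every length-k word in the database to its start positions."""
    if k <= 0:
        raise ValueError("k must be positive")
    index: dict[str, list[int]] = defaultdict(list)
    for start in range(len(database) - k + 1):
        index[database[start:start + k]].append(start)
    return dict(index)


def find_exact_hits(database: str, index: dict[str, list[int]],
                    query: str, k: int) -> list[int]:
    """Return start positions of exact query matches.

    Only positions whose first k bases match are examined, so most of the
    database is never touched. Assumes len(query) >= k.
    """
    if len(query) < k:
        raise ValueError("query shorter than the indexed word length")
    candidates = index.get(query[:k], [])
    # Verify the remainder of the query at each candidate position.
    return [pos for pos in candidates if database.startswith(query, pos)]
```

    For example, with the database "ACGTACGTTACG" indexed at k = 3, the query "ACGT" is checked only at the three positions where "ACG" begins, and two of them are confirmed as hits.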

  16. Plasmid mapping computer program.

    PubMed Central

    Nolan, G P; Maina, C V; Szalay, A A

    1984-01-01

    Three new computer algorithms are described which rapidly order the restriction fragments of a plasmid DNA which has been cleaved with two restriction endonucleases in single and double digestions. Two of the algorithms are contained within a single computer program (called MPCIRC). The Rule-Oriented algorithm constructs all logical circular map solutions within sixty seconds (14 double-digestion fragments) when used in conjunction with the Permutation method. The program is written in Apple Pascal and runs on an Apple II Plus Microcomputer with 64K of memory. A third algorithm is described which rapidly maps double digests and uses the above two algorithms as adducts. Modifications of the algorithms for linear mapping are also presented. PMID:6320105

  17. Efficient Mining of Interesting Patterns in Large Biological Sequences

    PubMed Central

    Rashid, Md. Mamunur; Karim, Md. Rezaul; Jeong, Byeong-Soo

    2012-01-01

    Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interesting measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time. PMID:23105928
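
    The contrast between raw occurrence count and "support exceeding prior expectation" can be made concrete with a small sketch. The Python below compares a pattern's observed count with the count expected if bases were drawn independently at their observed frequencies; this background model, the ratio itself, and the function names are assumptions for illustration, not the interestingness measure or index-based method proposed in the paper.

```python
from collections import Counter
from math import prod


def count_overlapping(sequence: str, pattern: str) -> int:
    """Count occurrences of pattern in sequence, allowing overlaps."""
    if not pattern:
        return 0
    last_start = len(sequence) - len(pattern)
    return sum(1 for i in range(last_start + 1)
               if sequence.startswith(pattern, i))


def expected_count(sequence: str, pattern: str) -> float:
    """Expected occurrences under an independent-bases background model."""
    n, m = len(sequence), len(pattern)
    if m == 0 or m > n:
        return 0.0
    base_counts = Counter(sequence)
    # Probability of the pattern at any one position.
    p = prod(base_counts[base] / n for base in pattern)
    return (n - m + 1) * p


def observed_to_expected(sequence: str, pattern: str) -> float:
    """Ratio of observed to expected count; values well above 1 suggest surprise."""
    expected = expected_count(sequence, pattern)
    if expected == 0.0:
        return 0.0
    return count_overlapping(sequence, pattern) / expected
```

    Under this toy model a pattern seen only twice can still score highly if it was expected far less than once, which is the situation the abstract describes.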

  18. Efficient mining of interesting patterns in large biological sequences.

    PubMed

    Rashid, Md Mamunur; Karim, Md Rezaul; Jeong, Byeong-Soo; Choi, Ho-Jin

    2012-03-01

    Pattern discovery in biological sequences (e.g., DNA sequences) is one of the most challenging tasks in computational biology and bioinformatics. So far, in most approaches, the number of occurrences is a major measure of determining whether a pattern is interesting or not. In computational biology, however, a pattern that is not frequent may still be considered very informative if its actual support frequency exceeds the prior expectation by a large margin. In this paper, we propose a new interesting measure that can provide meaningful biological information. We also propose an efficient index-based method for mining such interesting patterns. Experimental results show that our approach can find interesting patterns within an acceptable computation time.
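
    The idea that a rare pattern can still be informative when its support exceeds prior expectation can be illustrated with a minimal sketch. The abstract does not define the authors' measure or their index-based method, so the code below assumes a simple independent-base background model and a brute-force scan; the function name `surprising_kmers` and the `min_ratio` threshold are hypothetical.

```python
from collections import Counter


def surprising_kmers(seq, k, min_ratio):
    """Return k-mers whose observed count is at least min_ratio times the
    count expected if bases were drawn independently (an assumed background
    model, not the measure proposed in the paper)."""
    n_pos = len(seq) - k + 1
    if n_pos <= 0:
        return {}
    # Per-base frequencies estimated from the sequence itself.
    base_freq = {b: c / len(seq) for b, c in Counter(seq).items()}
    observed = Counter(seq[i:i + k] for i in range(n_pos))
    result = {}
    for kmer, obs in observed.items():
        p = 1.0
        for b in kmer:
            p *= base_freq[b]
        expected = p * n_pos          # expected support under independence
        ratio = obs / expected
        if ratio >= min_ratio:
            result[kmer] = ratio
    return result


hits = surprising_kmers("ATATATATAT", 2, 2.0)
print(hits)
```

    Ranking by the observed-to-expected ratio rather than by raw count is what lets an infrequent pattern surface; a real miner would replace the full scan with an index.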

  19. GrigoraSNPs: Optimized Analysis of SNPs for DNA Forensics.

    PubMed

    Ricke, Darrell O; Shcherbina, Anna; Michaleas, Adam; Fremont-Smith, Philip

    2018-04-16

    High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional DNA forensic capabilities not attainable using traditional STR panels. However, the inclusion of sets of loci selected for mixture analysis, extended kinship, phenotype, biogeographic ancestry prediction, etc., can result in large panel sizes that are difficult to analyze in a rapid fashion. GrigoraSNPs was developed to address the allele-calling bottleneck that was encountered when analyzing SNP panels with more than 5000 loci using HTS. GrigoraSNPs uses MapReduce parallel data processing on multiple computational threads plus a novel locus-identification hashing strategy leveraging target sequence tags. This tool optimizes the SNP calling module of the DNA analysis pipeline with runtimes that scale linearly with the number of HTS reads. Results are compared with SNP analysis pipelines implemented with SAMtools and GATK. GrigoraSNPs removes a computational bottleneck for processing forensic samples with large HTS SNP panels. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
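
    A hash lookup keyed on sequence tags is one way to identify a read's locus in time proportional to the read length. The sketch below is an illustration under stated assumptions, not the GrigoraSNPs implementation: it assumes every tag has the same length, that the SNP base sits immediately after the tag, and that the first matching tag decides the locus. The function `call_alleles` and the sample tag are hypothetical.

```python
from collections import Counter, defaultdict


def call_alleles(reads, tags, tag_len):
    """Count SNP bases per locus.

    tags maps a tag sequence of length tag_len to a locus name; the base
    directly after the tag is treated as the SNP allele (an assumption)."""
    counts = defaultdict(Counter)
    for read in reads:
        # Stop one short of the end so the base after the tag always exists.
        for i in range(len(read) - tag_len):
            locus = tags.get(read[i:i + tag_len])   # constant-time hash lookup
            if locus is not None:
                counts[locus][read[i + tag_len]] += 1
                break                               # one locus per read
    return counts


tags = {"ACGTA": "locus_1"}
reads = ["TTACGTAGCC", "ACGTATTT", "GGGG", "CCACGTAG"]
print(dict(call_alleles(reads, tags, 5)))
```

    Because each read is processed independently, the outer loop could be split across worker threads and the per-locus counters merged afterwards, which is the map and reduce pattern the abstract mentions.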

  20. HolT Hunter: Software for Identifying and Characterizing Low-Strain DNA Holliday Triangles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sherman W. B.

    2012-06-05

    Synthetic DNA nanostructures are most commonly held together via Holliday junctions. These junctions allow for a wide variety of different angles between the double helices they connect. Nevertheless, only constructs with a very limited selection of angles have been built, to date, because of the computational complexity of identifying structures that fit together with low strain at odd angles. I have developed an algorithm that finds over 95% of the possible solutions by breaking the problem down into two portions. First, there is a problem of how smooth rods can form triangles by lying across one another. This problem is easily handled by numerical computation. Second, there is the question of how distorted DNA double helices would need to be to fit onto the rod structure. This strain is calculated directly. The algorithm has been implemented in a Mathematica 8 notebook called Holliday Triangle Hunter. A large database of solutions has been identified. Additional interface software is available to facilitate drawing and viewing models.

  1. New Genetics

    MedlinePlus

    ... Century-Old Evolutionary Puzzle Computing Genetics Model Organisms RNA Interference The New Genetics is a science education ... the basics of DNA and its molecular cousin RNA, and new directions in genetic research. The New ...

  2. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses.

    PubMed

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick

    2014-02-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.

  3. Evidence of Pervasive Biologically Functional Secondary Structures within the Genomes of Eukaryotic Single-Stranded DNA Viruses

    PubMed Central

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y. F.; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie

    2014-01-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here. PMID:24284329

  4. GenomicTools: a computational platform for developing high-throughput analytics in genomics.

    PubMed

    Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

    2012-01-15

    Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.

  5. Two- and three-input TALE-based AND logic computation in embryonic stem cells.

    PubMed

    Lienert, Florian; Torella, Joseph P; Chen, Jan-Hung; Norsworthy, Michael; Richardson, Ryan R; Silver, Pamela A

    2013-11-01

    Biological computing circuits can enhance our ability to control cellular functions and have potential applications in tissue engineering and medical treatments. Transcriptional activator-like effectors (TALEs) represent attractive components of synthetic gene regulatory circuits, as they can be designed de novo to target a given DNA sequence. We here demonstrate that TALEs can perform Boolean logic computation in mammalian cells. Using a split-intein protein-splicing strategy, we show that a functional TALE can be reconstituted from two inactive parts, thus generating two-input AND logic computation. We further demonstrate three-piece intein splicing in mammalian cells and use it to perform three-input AND computation. Using methods for random as well as targeted insertion of these relatively large genetic circuits, we show that TALE-based logic circuits are functional when integrated into the genome of mouse embryonic stem cells. Comparing construct variants in the same genomic context, we modulated the strength of the TALE-responsive promoter to improve the output of these circuits. Our work establishes split TALEs as a tool for building logic computation with the potential of controlling expression of endogenous genes or transgenes in response to a combination of cellular signals.

  6. Multiscale QM/MM molecular dynamics study on the first steps of guanine damage by free hydroxyl radicals in solution.

    PubMed

    Abolfath, Ramin M; Biswas, P K; Rajnarayanam, R; Brabec, Thomas; Kodym, Reinhard; Papiez, Lech

    2012-04-19

    Understanding the damage of DNA bases from hydrogen abstraction by free OH radicals is of particular importance to understanding the indirect effect of ionizing radiation. Previous studies address the problem with truncated DNA bases as ab initio quantum simulations required to study such electronic-spin-dependent processes are computationally expensive. Here, for the first time, we employ a multiscale and hybrid quantum mechanical-molecular mechanical simulation to study the interaction of OH radicals with a guanine-deoxyribose-phosphate DNA molecular unit in the presence of water, where all of the water molecules and the deoxyribose-phosphate fragment are treated with the simplistic classical molecular mechanical scheme. Our result illustrates that the presence of water strongly alters the hydrogen-abstraction reaction as the hydrogen bonding of OH radicals with water restricts the relative orientation of the OH radicals with respect to the DNA base (here, guanine). This results in an angular anisotropy in the chemical pathway and a lower efficiency in the hydrogen-abstraction mechanisms than previously anticipated for identical systems in vacuum. The method can easily be extended to single- and double-stranded DNA without any appreciable computational cost as these molecular units can be treated in the classical subsystem, as has been demonstrated here. © 2012 American Chemical Society

  7. geneGIS: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

    DTIC Science & Technology

    2011-09-30

    DNA profiles. Referred to as geneGIS, the program will provide the ability to display, browse, select, filter and summarize spatial or temporal...of the SPLASH photo-identification records and available DNA profiles is underway through integration and crosschecking by Cascadia and MMI . An...Darwin Core standards where possible and can accommodate the current databases developed for telemetry data at MMI and SPLASH collection records at

  8. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

    PubMed

    Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

    2017-06-24

    Multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit their generality. First, the information about users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and is generally unknown before submission, which previous work unfortunately ignores. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by running the workload on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn^2) to O(mn). The experimental results show that CMSA achieves up to an 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the alignment of multiple similar RNA/DNA sequences and proposes a novel bitmap-based algorithm to improve the center star strategy. We can conclude that harvesting the high performance of modern GPUs is a promising approach to accelerating multiple sequence alignment. Besides, adopting the co-run computation model can significantly increase overall system utilization. The source code is available at https://github.com/wangvsa/CMSA .
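
    For orientation, the center sequence selection stage of the classic center star strategy can be sketched as follows: the center is the sequence with the smallest total distance to all others. This is the textbook baseline using edit distance, assumed here for illustration; it is not CMSA's bitmap-based algorithm, whose details the abstract does not give. The names `edit_distance` and `pick_center` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def pick_center(seqs):
    """Return the index of the sequence with the smallest summed distance
    to every sequence in the set (all pairs are compared)."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


seqs = ["ACGT", "ACGA", "TCGA"]
print(pick_center(seqs))
```

    The all-pairs comparison is what makes this stage expensive on large inputs, which is the cost the improved strategy in the abstract targets.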

  9. DNA-Binding Kinetics Determines the Mechanism of Noise-Induced Switching in Gene Networks

    PubMed Central

    Tse, Margaret J.; Chu, Brian K.; Roy, Mahua; Read, Elizabeth L.

    2015-01-01

    Gene regulatory networks are multistable dynamical systems in which attractor states represent cell phenotypes. Spontaneous, noise-induced transitions between these states are thought to underlie critical cellular processes, including cell developmental fate decisions, phenotypic plasticity in fluctuating environments, and carcinogenesis. As such, there is increasing interest in the development of theoretical and computational approaches that can shed light on the dynamics of these stochastic state transitions in multistable gene networks. We applied a numerical rare-event sampling algorithm to study transition paths of spontaneous noise-induced switching for a ubiquitous gene regulatory network motif, the bistable toggle switch, in which two mutually repressive genes compete for dominant expression. We find that the method can efficiently uncover detailed switching mechanisms that involve fluctuations both in occupancies of DNA regulatory sites and copy numbers of protein products. In addition, we show that the rate parameters governing binding and unbinding of regulatory proteins to DNA strongly influence the switching mechanism. In a regime of slow DNA-binding/unbinding kinetics, spontaneous switching occurs relatively frequently and is driven primarily by fluctuations in DNA-site occupancies. In contrast, in a regime of fast DNA-binding/unbinding kinetics, switching occurs rarely and is driven by fluctuations in levels of expressed protein. Our results demonstrate how spontaneous cell phenotype transitions involve collective behavior of both regulatory proteins and DNA. Computational approaches capable of simulating dynamics over many system variables are thus well suited to exploring dynamic mechanisms in gene networks. PMID:26488666

  10. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease

    PubMed Central

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao

    2018-01-01

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. PMID:29069510

  11. An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

    PubMed

    Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

    2018-01-01

    Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionary robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific evolutionary robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.

  12. Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

    PubMed

    Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

    2004-01-01

    Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a completely novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.

  13. Synthetic Ion Channels and DNA Logic Gates as Components of Molecular Robots.

    PubMed

    Kawano, Ryuji

    2018-02-19

    A molecular robot is a next-generation biochemical machine that imitates the actions of microorganisms. It is made of biomaterials such as DNA, proteins, and lipids. Three prerequisites have been proposed for the construction of such a robot: sensors, intelligence, and actuators. This Minireview focuses on recent research on synthetic ion channels and DNA computing technologies, which are viewed as potential candidate components of molecular robots. Synthetic ion channels, which are embedded in artificial cell membranes (lipid bilayers), sense ambient ions or chemicals and import them. These artificial sensors are useful components for molecular robots with bodies consisting of a lipid bilayer because they enable the interface between the inside and outside of the molecular robot to function as gates. After the signal molecules arrive inside the molecular robot, they can operate DNA logic gates, which perform computations. These functions will be integrated into the intelligence and sensor sections of molecular robots. Soon, these molecular machines will be able to be assembled to operate as a mass microrobot and play an active role in environmental monitoring and in vivo diagnosis or therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Privacy-preserving microbiome analysis using secure computation.

    PubMed

    Wagner, Justin; Paulson, Joseph N; Wang, Xiao; Bhattacharjee, Bobby; Corrada Bravo, Héctor

    2016-06-15

    Developing targeted therapeutics and identifying biomarkers relies on large amounts of research participant data. Beyond human DNA, scientists now investigate the DNA of micro-organisms inhabiting the human body. Recent work shows that an individual's collection of microbial DNA consistently identifies that person and could be used to link a real-world identity to a sensitive attribute in a research dataset. Unfortunately, the current suite of DNA-specific privacy-preserving analysis tools does not meet the requirements for microbiome sequencing studies. To address privacy concerns around microbiome sequencing, we implement metagenomic analyses using secure computation. Our implementation allows comparative analysis over combined data without revealing the feature counts for any individual sample. We focus on three analyses and perform an evaluation on datasets currently used by the microbiome research community. We use our implementation to simulate sharing data between four policy-domains. Additionally, we describe an application of our implementation for patients to combine data that allows drug developers to query against and compensate patients for the analysis. The software is freely available for download at: http://cbcb.umd.edu/~hcorrada/projects/secureseq.html . Supplementary data are available at Bioinformatics online. Contact: hcorrada@umiacs.umd.edu. © The Author 2016. Published by Oxford University Press.

  15. DMINDA: an integrated web server for DNA motif identification and analyses.

    PubMed

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

    PubMed

    Bansal, Vikas

    2017-03-14

    PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from "natural" read duplicates that represent independent DNA fragments and therefore, over-estimate the PCR duplication rate for DNA-seq and RNA-seq experiments. In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45-50% of read duplicates correspond to natural read duplicates likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70-95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression and identified outlier samples with a 2-fold greater PCR duplication rate than other samples. The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates .

  17. Quantum-assisted biomolecular modelling.

    PubMed

    Harris, Sarah A; Kendon, Vivien M

    2010-08-13

    Our understanding of the physics of biological molecules, such as proteins and DNA, is limited because the approximations we usually apply to model inert materials are not, in general, applicable to soft, chemically inhomogeneous systems. The configurational complexity of biomolecules means the entropic contribution to the free energy is a significant factor in their behaviour, requiring detailed dynamical calculations to fully evaluate. Computer simulations capable of taking all interatomic interactions into account are therefore vital. However, even with the best current supercomputing facilities, we are unable to capture enough of the most interesting aspects of their behaviour to properly understand how they work. This limits our ability to design new molecules, to treat diseases, for example. Progress in biomolecular simulation depends crucially on increasing the computing power available. Faster classical computers are in the pipeline, but these provide only incremental improvements. Quantum computing offers the possibility of performing huge numbers of calculations in parallel, when it becomes available. We discuss the current open questions in biomolecular simulation, how these might be addressed using quantum computation and speculate on the future importance of quantum-assisted biomolecular modelling.

  18. Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models

    PubMed Central

    Burr, Tom

    2013-01-01

    Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the “go-to” option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example. PMID:24288668

  19. Selecting summary statistics in approximate Bayesian computation for calibrating stochastic models.

    PubMed

    Burr, Tom; Skurikhin, Alexei

    2013-01-01

    Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the "go-to" option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example.

  20. Extreme-Scale De Novo Genome Assembly

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  1. Computational solutions to large-scale data management and analysis

    PubMed Central

    Schadt, Eric E.; Linderman, Michael D.; Sorenson, Jon; Lee, Lawrence; Nolan, Garry P.

    2011-01-01

    Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems. PMID:20717155

  2. Engineering bacteria to solve the Burnt Pancake Problem

    PubMed Central

    Haynes, Karmella A; Broderick, Marian L; Brown, Adam D; Butner, Trevor L; Dickson, James O; Harden, W Lance; Heard, Lane H; Jessen, Eric L; Malloy, Kelly J; Ogden, Brad J; Rosemond, Sabriya; Simpson, Samantha; Zwack, Erin; Campbell, A Malcolm; Eckdahl, Todd T; Heyer, Laurie J; Poet, Jeffrey L

    2008-01-01

    Background: We investigated the possibility of executing DNA-based computation in living cells by engineering Escherichia coli to address a classic mathematical puzzle called the Burnt Pancake Problem (BPP). The BPP is solved by sorting a stack of distinct objects (pancakes) into proper order and orientation using the minimum number of manipulations. Each manipulation reverses the order and orientation of one or more adjacent objects in the stack. We have designed a system that uses site-specific DNA recombination to mediate inversions of genetic elements that represent pancakes within plasmid DNA. Results: Inversions (or "flips") of the DNA fragment pancakes are driven by the Salmonella typhimurium Hin/hix DNA recombinase system that we reconstituted as a collection of modular genetic elements for use in E. coli. Our system sorts DNA segments by inversions to produce different permutations of a promoter and a tetracycline resistance coding region; E. coli cells become antibiotic resistant when the segments are properly sorted. Hin recombinase can mediate all possible inversion operations on adjacent flippable DNA fragments. Mathematical modeling predicts that the system reaches equilibrium after very few flips, where equal numbers of permutations are randomly sorted and unsorted. Semiquantitative PCR analysis of in vivo flipping suggests that inversion products accumulate on a time scale of hours or days rather than minutes. Conclusion: The Hin/hix system is a proof-of-concept demonstration of in vivo computation with the potential to be scaled up to accommodate larger and more challenging problems. Hin/hix may provide a flexible new tool for manipulating transgenic DNA in vivo. PMID:18492232
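
    The puzzle described in this record can be sketched in software as a shortest-path search. The code below is an illustrative assumption, not the authors' modeling: each pancake is a signed integer (positive meaning correct orientation), a flip inverts any run of adjacent pancakes as the abstract describes, and the names `flip` and `min_flips` are hypothetical.

```python
from collections import deque

def flip(stack, i, j):
    """Invert the segment stack[i:j]: reverse its order and negate each sign."""
    segment = tuple(-x for x in reversed(stack[i:j]))
    return stack[:i] + segment + stack[j:]

def min_flips(start):
    """Breadth-first search for the fewest segment inversions that sort the stack."""
    start = tuple(start)
    goal = tuple(range(1, len(start) + 1))   # sorted order, all orientations positive
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        stack, depth = queue.popleft()
        if stack == goal:
            return depth
        # Try every inversion of one or more adjacent pancakes.
        for i in range(len(stack)):
            for j in range(i + 1, len(stack) + 1):
                nxt = flip(stack, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return None
```

    For a two-pancake stack, `min_flips((-2, -1))` needs a single inversion of the whole stack, while `min_flips((2, 1))` needs three. The search space grows rapidly with stack size, so this exhaustive approach is only practical for small stacks.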

  3. Genegis: Computational Tools for Spatial Analyses of DNA Profiles with Associated Photo-Identification and Telemetry Records of Marine Mammals

    DTIC Science & Technology

    2013-09-30

    profiles of right whales Eubalaena glacialis from the North Atlantic Right Whale Consortium; 2) DNA profiles of sperm whales Physeter macrocephalus...of other cetacean databases in Wildbook format (e.g., North Atlantic right whales, sperm whales and Hector’s dolphins); 8) Supported continuing...of sperm whales, using samples collected during the 5-year Voyage of the Odyssey; and 3) DNA profiles of Hector’s dolphins from Cloudy Bay, New

  4. Cloud-based MOTIFSIM: Detecting Similarity in Large DNA Motif Data Sets.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2017-05-01

    We developed the cloud-based MOTIFSIM on Amazon Web Services (AWS) cloud. The tool is an extended version from our web-based tool version 2.0, which was developed based on a novel algorithm for detecting similarity in multiple DNA motif data sets. This cloud-based version further allows researchers to exploit the computing resources available from AWS to detect similarity in multiple large-scale DNA motif data sets resulting from the next-generation sequencing technology. The tool is highly scalable with expandable AWS.

  5. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses

    DOE PAGES

    Paez-Espino, David; Chen, I. -Min A.; Palaniappan, Krishna; ...

    2016-10-30

    Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from > 6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.

  6. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paez-Espino, David; Chen, I. -Min A.; Palaniappan, Krishna

    Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from > 6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.

  7. Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction

    PubMed Central

    Cruz-Cano, Raul; Chew, David S.H.; Kwok-Pui, Choi; Ming-Ying, Leung

    2010-01-01

    Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications. PMID:20729987

  8. Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction.

    PubMed

    Cruz-Cano, Raul; Chew, David S H; Kwok-Pui, Choi; Ming-Ying, Leung

    2010-06-01

    Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications.

  9. Computer-based image analysis of one-dimensional electrophoretic gels used for the separation of DNA restriction fragments.

    PubMed Central

    Gray, A J; Beecher, D E; Olson, M V

    1984-01-01

    A stand-alone, interactive computer system has been developed that automates the analysis of ethidium bromide-stained agarose and acrylamide gels on which DNA restriction fragments have been separated by size. High-resolution digital images of the gels are obtained using a camera that contains a one-dimensional, 2048-pixel photodiode array that is mechanically translated through 2048 discrete steps in a direction perpendicular to the gel lanes. An automatic band-detection algorithm is used to establish the positions of the gel bands. A color-video graphics system, on which both the gel image and a variety of operator-controlled overlays are displayed, allows the operator to visualize and interact with critical stages of the analysis. The principal interactive steps involve defining the regions of the image that are to be analyzed and editing the results of the band-detection process. The system produces a machine-readable output file that contains the positions, intensities, and descriptive classifications of all the bands, as well as documentary information about the experiment. This file is normally further processed on a larger computer to obtain fragment-size assignments. PMID:6320097

  10. Label-free logic modules and two-layer cascade based on stem-loop probes containing a G-quadruplex domain.

    PubMed

    Guo, Yahui; Cheng, Junjie; Wang, Jine; Zhou, Xiaodong; Hu, Jiming; Pei, Renjun

    2014-09-01

    A simple, versatile, and label-free DNA computing strategy was designed by using toehold-mediated strand displacement and stem-loop probes. A full set of logic gates (YES, NOT, OR, NAND, AND, INHIBIT, NOR, XOR, XNOR) and a two-layer logic cascade were constructed. The probes contain a G-quadruplex domain, which was blocked or unfolded through inputs initiating strand displacement, and the clearly distinguishable light-up fluorescent signal of the G-quadruplex/NMM complex was used as the output readout. The inputs are disease-specific nucleotide sequences with potential for clinical diagnosis. The developed versatile computing system based on our label-free and modular strategy might be adapted for multi-target diagnosis through DNA hybridization and aptamer-target interaction. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Apta-nanosensor preparation and in vitro assay for rapid Diazinon detection using a computational molecular approach.

    PubMed

    Jokar, Mahmoud; Safaralizadeh, Mohammad Hassan; Hadizadeh, Farzin; Rahmani, Fatemeh; Kalani, Mohamad Reza

    2017-02-01

    Aptamers (ss-DNA or ss-RNA), also known as artificial antibodies, are selected in vitro to bind target molecules with high affinity and selectivity. Diazinon is one of the most widely used organophosphorus insecticides in developing and underdeveloped countries, where it serves as an insecticide and acaricide. Diazinon is readily absorbed from the gastrointestinal system and rapidly distributed throughout the body. Thus, the design of clinical and laboratory diagnostics using nanobiosensors is necessary. A computational approach allows us to screen or rank receptor structures and predict interaction outcomes with a deeper understanding, and it is much more cost effective than laboratory attempts. In this research, the best sequence (the ssDNA that binds Diazinon with high affinity) was ranked among 12 aptamers isolated from SELEX experimentation. Docking results, as the first virtual screening stage and a static technique, selected a frequent conformation of each aptamer. Then, the quantity and quality of the aptamer-Diazinon interaction were simulated using molecular dynamics as a mobility technique. RMSD, RMSF, radius of gyration, and the number of hydrogen bonds formed between Diazinon and the aptamer were monitored to assess the quantity and quality of interactions. The G-quadruplex DNA aptamer (DF20) was shown to be a reliable candidate for Diazinon biosensing. The apta-nanosensor designed using simulation results allowed detection with linearity in the range of 0.141-0.65 nM and a LOD of 17.903 nM, and it was validated using a computational molecular approach.

  12. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
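
    The third step described in this record, finding sequences that contain microsatellites matching user-specified parameters, can be sketched with a regular expression. This is a hypothetical illustration and not the SSR_pipeline code: the function name `find_ssrs` and its parameters (`min_motif`, `max_motif`, `min_repeats`) are assumptions chosen for the example.

```python
import re

def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    A motif of min_motif..max_motif bases is captured lazily (shortest first)
    and must then repeat back-to-back at least min_repeats times in total.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits
```

    Calling `find_ssrs("TTGACACACACACAGG")` reports a single dinucleotide repeat, `(3, "AC", 5)`. A production tool would also need to handle ambiguous bases, read quality, and homopolymer runs, none of which this sketch attempts.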

  13. Exploiting Parallel R in the Cloud with SPRINT

    PubMed Central

    Piotrowski, M.; McGilvary, G.A.; Sloan, T. M.; Mewissen, M.; Lloyd, A.D.; Forster, T.; Mitchell, L.; Ghazal, P.; Hill, J.

    2012-01-01

    Background: Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Objectives: Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon’s Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. Methods: The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. Results: It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user’s location impacts on costs due to factors such as local taxation. Conclusions: Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds. PMID:23223611

  14. Derivation of Reliable Geometries in QM Calculations of DNA Structures: Explicit Solvent QM/MM and Restrained Implicit Solvent QM Optimizations of G-Quadruplexes.

    PubMed

    Gkionis, Konstantinos; Kruse, Holger; Šponer, Jiří

    2016-04-12

    Modern dispersion-corrected DFT methods have made it possible to perform reliable QM studies on complete nucleic acid (NA) building blocks having hundreds of atoms. Such calculations, although still limited to investigations of potential energy surfaces, enhance the portfolio of computational methods applicable to NAs and offer considerably more accurate intrinsic descriptions of NAs than standard MM. However, in practice such calculations are hampered by the use of implicit solvent environments and truncation of the systems. Conventional QM optimizations are spoiled by spurious intramolecular interactions and severe structural deformations. Here we compare two approaches designed to suppress such artifacts: partially restrained continuum solvent QM and explicit solvent QM/MM optimizations. We report geometry relaxations of a set of diverse double-quartet guanine quadruplex (GQ) DNA stems. Both methods provide neat structures without major artifacts. However, each one also has distinct weaknesses. In restrained optimizations, all errors in the target geometries (i.e., low-resolution X-ray and NMR structures) are transferred to the optimized geometries. In QM/MM, the initial solvent configuration causes some heterogeneity in the geometries. Nevertheless, both approaches represent a decisive step forward compared to conventional optimizations. We refine earlier computations that revealed sizable differences in the relative energies of GQ stems computed with AMBER MM and QM. We also explore the dependence of the QM/MM results on the applied computational protocol.

  15. DEEP: a general computational framework for predicting enhancers

    PubMed Central

    Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B.

    2015-01-01

    Transcription regulation in multicellular eukaryotes is orchestrated by a number of DNA functional elements located at gene regulatory regions. Some regulatory regions (e.g. enhancers) are located far away from the gene they affect. Identification of distal regulatory elements is a challenge for the bioinformatics research. Although existing methodologies increased the number of computationally predicted enhancers, performance inconsistency of computational models across different cell-lines, class imbalance within the learning sets and ad hoc rules for selecting enhancer candidates for supervised learning, are some key questions that require further examination. In this study we developed DEEP, a novel ensemble prediction framework. DEEP integrates three components with diverse characteristics that streamline the analysis of enhancer's properties in a great variety of cellular conditions. In our method we train many individual classification models that we combine to classify DNA regions as enhancers or non-enhancers. DEEP uses features derived from histone modification marks or attributes coming from sequence characteristics. Experimental results indicate that DEEP performs better than four state-of-the-art methods on the ENCODE data. We report the first computational enhancer prediction results on FANTOM5 data where DEEP achieves 90.2% accuracy and 90% geometric mean (GM) of specificity and sensitivity across 36 different tissues. We further present results derived using in vivo-derived enhancer data from VISTA database. DEEP-VISTA, when tested on an independent test set, achieved GM of 80.1% and accuracy of 89.64%. DEEP framework is publicly available at http://cbrc.kaust.edu.sa/deep/. PMID:25378307
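
    The geometric mean (GM) of specificity and sensitivity reported in this record is a standard metric that stays informative under class imbalance, where plain accuracy can be misleading. The sketch below shows the arithmetic only; the function names and the confusion-matrix counts are made up for illustration and are not results from DEEP.

```python
import math

def sensitivity_specificity_gm(tp, fn, tn, fp):
    """Return (sensitivity, specificity, geometric mean of the two)."""
    sensitivity = tp / (tp + fn)      # fraction of true enhancers recovered
    specificity = tn / (tn + fp)      # fraction of non-enhancers rejected
    return sensitivity, specificity, math.sqrt(sensitivity * specificity)

def accuracy(tp, fn, tn, fp):
    """Fraction of all regions classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

# Hypothetical imbalanced test set: 10 enhancers among 1000 regions.
counts = dict(tp=5, fn=5, tn=985, fp=5)
sens, spec, gm = sensitivity_specificity_gm(**counts)
acc = accuracy(**counts)
```

    With these made-up counts the accuracy is 0.99 even though only half of the positives are found, while the GM is markedly lower, which is why both figures are worth reporting together.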

  16. Exploiting parallel R in the cloud with SPRINT.

    PubMed

    Piotrowski, M; McGilvary, G A; Sloan, T M; Mewissen, M; Lloyd, A D; Forster, T; Mitchell, L; Ghazal, P; Hill, J

    2013-01-01

    Advances in DNA Microarray devices and next-generation massively parallel DNA sequencing platforms have led to an exponential growth in data availability but the arising opportunities require adequate computing resources. High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. Bioconductor, a popular tool for high-throughput genomic data analysis, is distributed as add-on modules for the R statistical programming language but R has no native capabilities for exploiting multi-processor architectures. SPRINT is an R package that enables easy access to HPC for genomics researchers. This paper investigates: setting up and running SPRINT-enabled genomic analyses on Amazon's Elastic Compute Cloud (EC2), the advantages of submitting applications to EC2 from different parts of the world and, if resource underutilization can improve application performance. The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2. Jobs have been submitted from both the UK and Thailand to investigate monetary differences. It is possible to obtain good, scalable performance but the level of improvement is dependent upon the nature of the algorithm. Resource underutilization can further improve the time to result. End-user's location impacts on costs due to factors such as local taxation. Although not designed to satisfy HPC requirements, Amazon EC2 and cloud computing in general provides an interesting alternative and provides new possibilities for smaller organisations with limited funds.

  17. Paternity testing that involves a DNA mixture.

    PubMed

    Mortera, Julia; Vecchiotti, Carla; Zoppis, Silvia; Merigioli, Sara

    2016-07-01

    Here we analyse a complex disputed paternity case, where the DNA of the putative father was extracted from his corpse that had been inhumed for over 20 years. This DNA was contaminated and appears to be a mixture of at least two individuals. Furthermore, the mother's DNA was not available. The DNA mixture was analysed so as to predict the most probable genotypes of each contributor. The major contributor's profile was then used to compute the likelihood ratio for paternity. We also show how to take into account a dropout allele and the possibility of mutation in paternity testing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  18. Simulation of the charge migration in DNA under irradiation with heavy ions.

    PubMed

    Belov, Oleg V; Boyda, Denis L; Plante, Ianik; Shirmovsky, Sergey Eh

    2015-01-01

    A computer model to simulate the processes of charge injection and migration through DNA after irradiation by a heavy charged particle was developed. The most probable sites of charge injection were obtained by merging spatial models of short DNA sequence and a single 1 GeV/u iron particle track simulated by the code RITRACKS (Relativistic Ion Tracks). Charge migration was simulated by using a quantum-classical nonlinear model of the DNA-charge system. It was found that charge migration depends on the environmental conditions. The oxidative damage in DNA occurring during hole migration was simulated concurrently, which allowed the determination of probable locations of radiation-induced DNA lesions.

  19. Spectroscopic and molecular docking studies on the interaction of antiviral drug nevirapine with calf thymus DNA.

    PubMed

    Moghadam, Neda Hosseinpour; Salehzadeh, Sadegh; Shahabadi, Nahid

    2017-09-02

    The interaction of calf thymus DNA with nevirapine at physiological pH was studied by using absorption, circular dichroism, viscosity, differential pulse voltammetry, fluorescence techniques, salt effect studies and computational methods. The drug binds to ct-DNA in a groove binding mode, as shown by slight variation in the viscosity of ct-DNA. Furthermore, competitive fluorimetric studies with Hoechst 33258 indicate that nevirapine binds to DNA via groove binding. Moreover, the structure of nevirapine was optimized by DFT calculations and was used for the molecular docking calculations. The molecular docking results suggested that nevirapine prefers to bind on the minor groove of ct-DNA.

  20. Characterization of Ofloxacin Interaction with Mutated (A91V) Quinolone Resistance Determining Region of DNA Gyrase in Mycobacterium Leprae through Computational Simulation.

    PubMed

    Nisha, J; Shanthi, V

    2018-06-01

    Mycobacterium leprae, the causal agent of leprosy, is non-cultivable in vitro. Thus, the assessment of antibiotic activity against Mycobacterium leprae depends primarily upon the time-consuming mouse footpad system. The GyrA protein of Mycobacterium leprae is the target of the antimycobacterial drug, Ofloxacin. In recent times, the GyrA mutation (A91V) has been found to be resistant to Ofloxacin. This phenomenon has necessitated the development of new, long-acting antimycobacterial compounds. The underlying mechanism of drug resistance is not completely known. Currently, experimentally crystallized GyrA-DNA-OFLX models are not available for highlighting the binding and mechanism of Ofloxacin resistance. Hence, we employed computational approaches to characterize the Ofloxacin interaction with both the native and mutant forms of GyrA complexed with DNA. Binding energy measurements obtained from molecular docking studies highlight hydrogen bond-mediated efficient binding of Ofloxacin to Asp47 in the native GyrA-DNA complex in comparison with that of the mutant GyrA-DNA complex. Further, molecular dynamics studies highlighted more stable binding of Ofloxacin with the native GyrA-DNA complex than with the mutant GyrA-DNA complex. This mechanism provided a plausible reason for the reported, reduced effect of Ofloxacin to control leprosy in individuals with the A91V mutation. Our report is the first of its kind wherein the basis for the Ofloxacin drug resistance mechanism has been explored with the help of the ternary Mycobacterium leprae complex, GyrA-DNA-OFLX. These structural insights will provide useful information for designing new drugs to target the Ofloxacin-resistant DNA gyrase.

  1. Synthesis, DNA binding and cytotoxic activity of pyrimido[4',5':4,5]thieno(2,3-b)quinoline with 9-hydroxy-4-(3-diethylaminopropylamino) and 8-methoxy-4-(3-diethylaminopropylamino) substitutions.

    PubMed

    KiranKumar, Hulihalli N; RohitKumar, Heggodu G; Advirao, Gopal M

    2018-01-01

    Two new derivatives of pyrimido[4',5';4,5]thieno(2,3-b)quinoline (PTQ), 9-hydroxy-4-(3-diethylaminopropylamino)pyrimido[4',5';4,5]thieno(2,3-b)quinoline (Hydroxy-DPTQ) and 8-methoxy-4-(3-diethylaminopropylamino)pyrimido[4',5';4,5]thieno(2,3-b)quinoline (Methoxy-DPTQ) were synthesized and their DNA binding ability was analyzed using spectroscopy (UV-visible, fluorescence and circular dichroism), ethidium bromide dye displacement assay, melting temperature (Tm) analysis and computational docking studies. The hypochromism in UV-visible spectrum and increased fluorescence emission of Hydroxy-DPTQ and Methoxy-DPTQ in the presence of DNA suggested the molecule-DNA interaction. The association constants calculated from UV-visible and spectral titrations were of the order 10^4 to 10^6 M^-1. Circular dichroism studies corroborated the induced conformational changes in DNA upon addition of molecules. The change in the ellipticity was observed in both the negative and positive peaks of DNA, thus suggesting the intercalation of molecules. The observed displacement of ethidium bromide from the DNA and increased Tm upon addition of DNA confirmed the intercalative mode of binding. This was further validated by computational docking, which showed clear intercalation of molecules into the d(GpC)-d(CpG) site of the receptor DNA. Anticancer activities of these molecules were evaluated by using the MTT assay. Both molecules showed antiproliferative activity against all three cancer cells studied, with Hydroxy-DPTQ being the more potent molecule of the two. IC50 values of Hydroxy-DPTQ and Methoxy-DPTQ were in the range of 3-5 μM and 130-250 μM, respectively. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Computational identification of novel biochemical systems involved in oxidation, glycosylation and other complex modifications of bases in DNA.

    PubMed

    Iyer, Lakshminarayan M; Zhang, Dapeng; Burroughs, A Maxwell; Aravind, L

    2013-09-01

    Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes, chlorophyte and stramenopile alga, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel 'readers' of modified bases in DNA. These results open opportunities for the investigation of the biology of these systems and their use in biotechnology.

  3. Computational identification of novel biochemical systems involved in oxidation, glycosylation and other complex modifications of bases in DNA

    PubMed Central

    Iyer, Lakshminarayan M.; Zhang, Dapeng; Maxwell Burroughs, A.; Aravind, L.

    2013-01-01

    Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes, chlorophyte and stramenopile alga, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel ‘readers’ of modified bases in DNA. These results open opportunities for the investigation of the biology of these systems and their use in biotechnology. PMID:23814188

  4. Characterization of HIFU ablation using DNA fragmentation labeling as apoptosis stain

    NASA Astrophysics Data System (ADS)

    Anquez, Jeremie; Corréas, Jean-Michel; Pau, Bernard; Lacoste, François; Yon, Sylvain

    2012-11-01

    The goal of this work was to compare modalities to precisely quantify the extent of thermally induced lesions: gross pathology vs. histopathology vs. devascularization. Liver areas of 14 rabbits were targeted with HIFU and RF ablations in an acute study. Contrast-enhanced computerized tomography (CE-CT) scan images were acquired two hours after HIFU and RF treatment to obtain the devascularized volumes of the livers. The animals were then euthanized and deep frozen. The livers were sliced and each slice was photographed and stacked, yielding a volume of gross pathology. The volumes VGP of the HIFU lesions were derived. The areas AGP of the lesions were computed on a particular slice. The lesions were segmented as hypointense (devascularized) regions on CE-CT images and their volumes VC were computed. The ratios VC/VGP were computed for all the HIFU lesions on all the 14 subjects, with a mean value of 1.2. Histology was performed on the livers using Hematoxylin Eosin Staining (HES) and DNA fragmentation labeling (TUNEL® technology), which characterizes apoptosis. Apoptotic regions of area AT were segmented on the images stained by TUNEL®. No necrosis was identified on the HES data. While TUNEL® did not mark the cores of the RF lesions as apoptotic, the periphery of HIFU and RF lesions was always recognized with TUNEL® as apoptotic. The ratio AGP/AT was computed. The mean value was 0.95 and 0.25 for HIFU and RF lesions respectively. These findings show that the devascularized territory seen on CE-CT scan coincides with the coagulated territories seen with gross pathology. Those actually correspond to cells in apoptosis. It is confirmed that HES stain does not show necrosis 2 hours after thermal ablation. TUNEL® technology for DNA fragmentation labeling appears as a useful marker for thermally induced acute lesions in the liver.

  5. G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

    PubMed

    Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

    2018-05-01

    Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.

  6. Building block synthesis using the polymerase chain assembly method.

    PubMed

    Marchand, Julie A; Peccoud, Jean

    2012-01-01

    De novo gene synthesis allows the creation of custom DNA molecules without the typical constraints of traditional cloning assembly: scars, restriction site incompatibility, and the quest to find all the desired parts to name a few. Moreover, with the help of computer-assisted design, the perfect DNA molecule can be created along with its matching sequence ready to download. The challenge is to build the physical DNA molecules that have been designed with the software. Although there are several DNA assembly methods, this section presents and describes a method using the polymerase chain assembly (PCA).

  7. DNA-Based Dynamic Reaction Networks.

    PubMed

    Fu, Ting; Lyu, Yifan; Liu, Hui; Peng, Ruizi; Zhang, Xiaobing; Ye, Mao; Tan, Weihong

    2018-05-21

    Deriving from logical and mechanical interactions between DNA strands and complexes, DNA-based artificial reaction networks (RNs) are attractive for their high programmability, as well as cascading and fan-out ability, which are similar to the basic principles of electronic logic gates. Arising from the dream of creating novel computing mechanisms, researchers have placed high hopes on the development of DNA-based dynamic RNs and have strived to establish the basic theories and operative strategies of these networks. This review starts by looking back on the evolution of DNA dynamic RNs, in particular the most significant applications in biochemistry occurring in recent years. Finally, we discuss the perspectives of DNA dynamic RNs and give a possible direction for the development of DNA circuits. Copyright © 2018. Published by Elsevier Ltd.

  8. Structural DNA Nanotechnology: State of the Art and Future Perspective

    PubMed Central

    2015-01-01

    Over the past three decades DNA has emerged as an exceptional molecular building block for nanoconstruction due to its predictable conformation and programmable intra- and intermolecular Watson–Crick base-pairing interactions. A variety of convenient design rules and reliable assembly methods have been developed to engineer DNA nanostructures of increasing complexity. The ability to create designer DNA architectures with accurate spatial control has allowed researchers to explore novel applications in many directions, such as directed material assembly, structural biology, biocatalysis, DNA computing, nanorobotics, disease diagnosis, and drug delivery. This Perspective discusses the state of the art in the field of structural DNA nanotechnology and presents some of the challenges and opportunities that exist in DNA-based molecular design and programming. PMID:25029570

  9. Software Reviews.

    ERIC Educational Resources Information Center

    Science Software Quarterly, 1984

    1984-01-01

    Provides extensive reviews of computer software, examining documentation, ease of use, performance, error handling, special features, and system requirements. Includes statistics, problem-solving (TK Solver), label printing, database management, experimental psychology, Encyclopedia Britannica biology, and DNA-sequencing programs. A program for…

  10. Hot Chips and Hot Interconnects for High End Computing Systems

    NASA Technical Reports Server (NTRS)

    Saini, Subhash

    2005-01-01

    I will discuss several processors: 1. The Cray proprietary processor used in the Cray X1; 2. The IBM Power 3 and Power 4 used in IBM SP 3 and IBM SP 4 systems; 3. The Intel Itanium and Xeon, used in the SGI Altix systems and clusters respectively; 4. IBM System-on-a-Chip used in IBM BlueGene/L; 5. HP Alpha EV68 processor used in DOE ASCI Q cluster; 6. SPARC64 V processor, which is used in the Fujitsu PRIMEPOWER HPC2500; 7. An NEC proprietary processor, which is used in NEC SX-6/7; 8. Power 4+ processor, which is used in Hitachi SR11000; 9. NEC proprietary processor, which is used in Earth Simulator. The IBM POWER5 and Red Storm Computing Systems will also be discussed. The architectures of these processors will first be presented, followed by interconnection networks and a description of high-end computer systems based on these processors and networks. The performance of various hardware/programming model combinations will then be compared, based on latest NAS Parallel Benchmark results (MPI, OpenMP/HPF and hybrid (MPI + OpenMP)). The tutorial will conclude with a discussion of general trends in the field of high performance computing (quantum computing, DNA computing, cellular engineering, and neural networks).

  11. Cloud-based adaptive exon prediction for DNA analysis.

    PubMed

    Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

    2018-02-01

    Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
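
    The three-base periodicity mentioned in this abstract can be illustrated with a plain sliding-window Fourier measure. The sketch below is not the adaptive (NLMS-based) predictor the authors describe; it is a minimal, assumed baseline in which the function name `period3_power` and the `window` and `step` defaults are illustrative choices only.

```python
import cmath


def period3_power(seq, window=351, step=3):
    """Sliding-window spectral power at frequency 1/3 (one cycle per codon).

    For each window, the four base-indicator sequences (A, C, G, T) are
    projected onto a complex exponential with period 3, and the squared
    magnitudes of the four projections are summed.
    """
    seq = seq.upper()
    n = len(seq)
    # Complex exponential with period 3, evaluated once per position.
    phase = [cmath.exp(-2j * cmath.pi * k / 3.0) for k in range(n)]
    powers = []
    for start in range(0, n - window + 1, step):
        total = 0.0
        for base in "ACGT":
            # Projection of this base's indicator sequence within the window.
            coeff = sum(phase[k] for k in range(start, start + window)
                        if seq[k] == base)
            total += abs(coeff) ** 2
        powers.append(total)
    return powers
```

    Windows with a strong codon-like repeat give large values, while windows without period-3 structure give values near zero; any threshold for calling a region an exon would have to be chosen separately.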

  12. Reconstructing evolutionary trees in parallel for massive sequences.

    PubMed

    Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

    2017-12-14

    Building evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing an evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time- and space-consuming. Hadoop and Spark were developed recently and shed new light on classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB and performs better than other evolutionary reconstruction tools. Users can use HPTree for reconstructing evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining model was employed for evolutionary tree building. We have released our software together with its source code via http://lab.malab.cn/soft/HPtree/ .

  13. A QM/MM study of the absorption spectrum of harmane in water solution and interacting with DNA: the crucial role of dynamic effects.

    PubMed

    Etienne, Thibaud; Very, Thibaut; Perpète, Eric A; Monari, Antonio; Assfeld, Xavier

    2013-05-02

    We present a time-dependent density functional theory computation of the absorption spectra of one β-carboline system: the harmane molecule in its neutral and cationic forms. The spectra are computed in aqueous solution. The interaction of cationic harmane with DNA is also studied. In particular, the use of hybrid quantum mechanics/molecular mechanics methods is discussed, together with its coupling to a molecular dynamics strategy to take into account dynamic effects of the environment and the vibrational degrees of freedom of the chromophore. Different levels of treatment of the environment are addressed starting from purely mechanical embedding to electrostatic and polarizable embedding. We show that a static description of the spectrum based on equilibrium geometry only is unable to give a correct agreement with experimental results, and dynamic effects need to be taken into account. The presence of two stable noncovalent interaction modes between harmane and DNA is also presented, as well as the associated absorption spectrum of harmane cation.

  14. Computational fishing of new DNA methyltransferase inhibitors from natural products.

    PubMed

    Maldonado-Rojas, Wilson; Olivero-Verbel, Jesus; Marrero-Ponce, Yovani

    2015-07-01

    DNA methyltransferase inhibitors (DNMTis) have become an alternative for cancer therapies. However, only two DNMTis have been approved as anticancer drugs, although with some restrictions. Natural products (NPs) are a promising source of drugs. In order to find NPs with novel chemotypes as DNMTis, 47 compounds with known activity against these enzymes were used to build a LDA-based QSAR model for active/inactive molecules (93% accuracy) based on molecular descriptors. This classifier was employed to identify potential DNMTis on 800 NPs from NatProd Collection. 447 selected compounds were docked on two human DNA methyltransferase (DNMT) structures (PDB codes: 3SWR and 2QRV) using AutoDock Vina and Surflex-Dock, prioritizing according to their score values, contact patterns at 4 Å and molecular diversity. Six consensus NPs were identified as virtual hits against DNMTs, including 9,10-dihydro-12-hydroxygambogic, phloridzin, 2',4'-dihydroxychalcone 4'-glucoside, daunorubicin, pyrromycin and centaurein. This method is an innovative computational strategy for identifying DNMTis, useful in the identification of potent and selective anticancer drugs. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Vander Lugt correlation of DNA sequence data

    NASA Astrophysics Data System (ADS)

    Christens-Barry, William A.; Hawk, James F.; Martin, James C.

    1990-12-01

    DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid are performed.
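
    The subsequence search described in this abstract is a correlation task, and a serial digital analogue can be sketched as follows. This is an assumed illustration, not the authors' optical simulation: each base is treated as its own one-hot channel, so the summed cross-correlation at an offset equals the number of matching positions. The names `match_scores`, `find_hits` and `min_fraction` are hypothetical.

```python
def match_scores(sequence, target):
    """Cross-correlate one-hot encodings of two DNA strings.

    score[i] is the number of positions at which `target` agrees with
    sequence[i:i + len(target)]; an exact occurrence scores len(target).
    """
    sequence, target = sequence.upper(), target.upper()
    m = len(target)
    scores = []
    for offset in range(len(sequence) - m + 1):
        score = sum(1 for j in range(m) if sequence[offset + j] == target[j])
        scores.append(score)
    return scores


def find_hits(sequence, target, min_fraction=1.0):
    """Return offsets whose correlation peak reaches the given fraction."""
    threshold = min_fraction * len(target)
    return [i for i, s in enumerate(match_scores(sequence, target))
            if s >= threshold]
```

    A matched-filter correlator evaluates all offsets at once, whereas this loop visits them one by one; lowering `min_fraction` below 1.0 would admit approximate matches.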

  16. Optically Controlled Signal Amplification for DNA Computation.

    PubMed

    Prokup, Alexander; Hemphill, James; Liu, Qingyang; Deiters, Alexander

    2015-10-16

    The hybridization chain reaction (HCR) and fuel-catalyst cycles have been applied to address the problem of signal amplification in DNA-based computation circuits. While they function efficiently, these signal amplifiers cannot be switched ON or OFF quickly and noninvasively. To overcome these limitations, a light-activated initiator strand for the HCR, which enabled fast optical OFF → ON switching, was developed. Similarly, when a light-activated version of the catalyst strand or the inhibitor strand of a fuel-catalyst cycle was applied, the cycle could be optically switched from OFF → ON or ON → OFF, respectively. To move the capabilities of these devices beyond solution-based operations, the components were embedded in agarose gels. Irradiation with customizable light patterns and at different time points demonstrated both spatial and temporal control. The addition of a translator gate enabled a spatially activated signal to travel along a predefined path, akin to a chemical wire. Overall, the addition of small light-cleavable photocaging groups to DNA signal amplification circuits enabled conditional control as well as fast photocontrol of signal amplification.

  17. Powering the programmed nanostructure and function of gold nanoparticles with catenated DNA machines

    NASA Astrophysics Data System (ADS)

    Elbaz, Johann; Cecconello, Alessandro; Fan, Zhiyuan; Govorov, Alexander O.; Willner, Itamar

    2013-06-01

    DNA nanotechnology is a rapidly developing research area in nanoscience. It includes the development of DNA machines, tailoring of DNA nanostructures, application of DNA nanostructures for computing, and more. Different DNA machines were reported in the past and DNA-guided assembly of nanoparticles represents an active research effort in DNA nanotechnology. Several DNA-dictated nanoparticle structures were reported, including a tetrahedron, a triangle or linear nanoengineered nanoparticle structures; however, the programmed, dynamic reversible switching of nanoparticle structures and, particularly, the dictated switchable functions emerging from the nanostructures, are missing elements in DNA nanotechnology. Here we introduce DNA catenane systems (interlocked DNA rings) as molecular DNA machines for the programmed, reversible and switchable arrangement of different-sized gold nanoparticles. We further demonstrate that the machine-powered gold nanoparticle structures reveal unique emerging switchable spectroscopic features, such as plasmonic coupling or surface-enhanced fluorescence.

  18. Modulating the DNA affinity of Elk-1 with computationally selected mutations.

    PubMed

    Park, Sheldon; Boder, Eric T; Saven, Jeffery G

    2005-04-22

    In order to regulate gene expression, transcription factors must first bind their target DNA sequences. The affinity of this binding is determined by both the network of interactions at the interface and the entropy change associated with the complex formation. To study the role of structural fluctuation in fine-tuning DNA affinity, we performed molecular dynamics simulations of two highly homologous proteins, Elk-1 and SAP-1, that exhibit different sequence specificity. Simulation studies show that several residues in Elk have significantly higher main-chain root-mean-square deviations than their counterparts in SAP. In particular, a single residue, D69, may contribute to Elk's lower DNA affinity for P(c-fos) by structurally destabilizing the carboxy terminus of the recognition helix. While D69 does not contact DNA directly, the increased mobility in the region may contribute to its weaker binding. We measured the ability of single point mutants of Elk to bind P(c-fos) in a reporter assay, in which D69 of wild-type Elk has been mutated to other residues with higher helix propensity in order to stabilize the local conformation. The gains in transcriptional activity and the free energy of binding suggested from these measurements correlate well with stability gains computed from helix propensity and charge-macrodipole interactions. The study suggests that residues that are distal to the binding interface may indirectly modulate the binding affinity by stabilizing the protein scaffold required for efficient DNA interaction.

  19. Quantitative fluorescence correlation spectroscopy on DNA in living cells

    NASA Astrophysics Data System (ADS)

    Hodges, Cameron; Kafle, Rudra P.; Meiners, Jens-Christian

    2017-02-01

    FCS is a fluorescence technique conventionally used to study the kinetics of fluorescent molecules in a dilute solution. Being a non-invasive technique, it is now drawing increasing interest for the study of more complex systems like the dynamics of DNA or proteins in living cells. Unlike an ordinary dye solution, the dynamics of macromolecules like proteins or entangled DNA in crowded environments is often slow and subdiffusive in nature. This in turn leads to longer residence times of the attached fluorophores in the excitation volume of the microscope, and artifacts from photobleaching abound that can easily obscure the signature of the molecular dynamics of interest and make quantitative analysis challenging. We discuss methods and procedures to make FCS applicable to quantitative studies of the dynamics of DNA in live prokaryotic and eukaryotic cells. The intensity autocorrelation function is computed from weighted arrival times of the photons on the detector in a way that maximizes the information content while simultaneously correcting for the effect of photobleaching, yielding an autocorrelation function that reflects only the underlying dynamics of the sample. This autocorrelation function in turn is used to calculate the mean square displacement of the fluorophores attached to DNA. The displacement data is more amenable to further quantitative analysis than the raw correlation functions. By using a suitable integral transform of the mean square displacement, we can then determine the viscoelastic moduli of the DNA in its cellular environment. The entire analysis procedure is extensively calibrated and validated using model systems and computational simulations.

  20. A novel model for DNA sequence similarity analysis based on graph theory.

    PubMed

    Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan

    2011-01-01

    Determination of sequence similarity is one of the major steps in computational phylogenetic studies. During evolutionary history, not only mutations of individual nucleotides but also subsequent rearrangements occurred. It has been one of the major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis such that information on various mutation phenomena is captured simultaneously. In this paper, in contrast to traditional methods that use, e.g., nucleotide frequency or geometric representations as bases for the construction of mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we set up a weighted directed graph. The adjacency matrix of the directed graph is used to induce a representative vector for the DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology and are generally consistent with the reported results from earlier studies, which supports the new method's efficiency. We also test the new method on a simulated data set, which shows that it performs better than the traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
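
    The idea of turning a sequence into a weighted directed graph and reading a vector off its adjacency matrix can be sketched as below. The abstract does not state how the edge weights are defined, so this example assumes the simplest choice (normalised counts of adjacent nucleotide pairs) and a Euclidean distance; both are assumptions, and `adjacency_vector` and `descriptor_distance` are hypothetical names.

```python
import math

BASES = "ACGT"


def adjacency_vector(seq):
    """Flattened 4x4 adjacency matrix of a nucleotide transition graph.

    Nodes are A, C, G, T; the weight of edge a -> b is the fraction of
    adjacent pairs in `seq` where a is immediately followed by b.
    Pairs containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = {(a, b): 0 for a in BASES for b in BASES}
    for a, b in zip(seq, seq[1:]):
        if (a, b) in counts:
            counts[(a, b)] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[(a, b)] / total for a in BASES for b in BASES]


def descriptor_distance(seq1, seq2):
    """Euclidean distance between the two sequences' graph descriptors."""
    v1, v2 = adjacency_vector(seq1), adjacency_vector(seq2)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(v1, v2)))
```

    Pairwise distances computed this way could be fed to any standard tree-building method; because the descriptor depends on nucleotide order as well as frequency, two sequences with identical base composition can still be far apart.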

  1. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    PubMed

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.
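
    A simplified picture of recurrence times for a DNA sequence is the distance between consecutive occurrences of the same short word. The sketch below is an assumed, reduced version of that idea and does not reproduce the authors' definitions or their codon index; the function name `recurrence_times` and the word length `k` are illustrative.

```python
from collections import Counter


def recurrence_times(seq, k=3):
    """Histogram of gaps between consecutive occurrences of each k-mer.

    Returns a Counter mapping gap length -> number of times that gap was
    observed, pooled over all k-mers in the sequence.
    """
    seq = seq.upper()
    last_seen = {}      # k-mer -> start position of its latest occurrence
    gaps = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            gaps[i - last_seen[word]] += 1
        last_seen[word] = i
    return gaps
```

    A tandem repeat of unit length p produces a sharp peak at gap p, and an excess of gaps at multiples of three would be one way to look for codon-related periodicity.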

  2. Brownian dynamics simulations of sequence-dependent duplex denaturation in dynamically superhelical DNA

    NASA Astrophysics Data System (ADS)

    Mielke, Steven P.; Grønbech-Jensen, Niels; Krishnan, V. V.; Fink, William H.; Benham, Craig J.

    2005-09-01

    The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.

  3. Emergence of Resistance to Atovaquone-Proguanil in Malaria Parasites: Insights from Computational Modeling and Clinical Case Reports

    PubMed Central

    Musset, Lise; Hubert, Véronique; Le Bras, Jacques

    2014-01-01

    The usefulness of atovaquone-proguanil (AP) as an antimalarial treatment is compromised by the emergence of atovaquone resistance during therapy. However, the origin of the parasite mitochondrial DNA (mtDNA) mutation conferring atovaquone resistance remains elusive. Here, we report a patient-based stochastic model that tracks the intrahost emergence of mutations in the multicopy mtDNA during the first erythrocytic parasite cycles leading to the malaria febrile episode. The effect of mtDNA copy number, mutation rate, mutation cost, and total parasite load on the mutant parasite load per patient was evaluated. Computer simulations showed that almost any infected patient carried, after four to seven erythrocytic cycles, de novo mutant parasites at low frequency, with varied frequencies of parasites carrying varied numbers of mutant mtDNA copies. A large interpatient variability in the size of this mutant reservoir was found; this variability was due to the different parameters tested but also to the relaxed replication and partitioning of mtDNA copies during mitosis. We also report seven clinical cases in which AP-resistant infections were treated by AP. These provided evidence that parasiticidal drug concentrations against AP-resistant parasites were transiently obtained within days after treatment initiation. Altogether, these results suggest that each patient carries new mtDNA mutant parasites that emerge before treatment but are killed by high starting drug concentrations. However, because the size of this mutant reservoir is highly variable from patient to patient, we propose that some patients fail to eliminate all of the mutant parasites, repeatedly producing de novo AP treatment failures. PMID:24867967

  4. Brownian dynamics simulations of sequence-dependent duplex denaturation in dynamically superhelical DNA.

    PubMed

    Mielke, Steven P; Grønbech-Jensen, Niels; Krishnan, V V; Fink, William H; Benham, Craig J

    2005-09-22

    The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.

  5. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease.

    PubMed

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia

    2018-01-04

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Connecting localized DNA strand displacement reactions

    NASA Astrophysics Data System (ADS)

    Mullor Ruiz, Ismael; Arbona, Jean-Michel; Lad, Amitkumar; Mendoza, Oscar; Aimé, Jean-Pierre; Elezgaray, Juan

    2015-07-01

    Logic circuits based on DNA strand displacement reactions have been shown to be versatile enough to compute the square root of four-bit numbers. The implementation of these circuits as a set of bulk reactions faces difficulties which include leaky reactions and intrinsically slow, diffusion-limited reaction rates. In this paper, we consider simple examples of these circuits when they are attached to platforms (DNA origamis). As expected, constraining distances between DNA strands leads to faster reaction rates. However, it also induces side-effects that are not detectable in the solution-phase version of this circuitry. Appropriate design of the system, including protection and asymmetry between input and fuel strands, leads to a reproducible behaviour, at least one order of magnitude faster than the one observed under bulk conditions. Electronic supplementary information (ESI) available. See DOI: 10.1039/C5NR02434J

  7. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

    2013-06-25

    A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.

  8. Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.

    PubMed

    Huang, Ying; Chen, Shi-Yi; Deng, Feilong

    2016-01-01

    In silico analysis of DNA sequences is an important area of computational biology in the post-genomic era. Over the past two decades, computational approaches for ab initio prediction of gene structure from genome sequence alone have greatly facilitated our understanding of a variety of biological questions. Although the computational prediction of protein-coding genes is already well established, robustly finding non-coding RNA genes, such as miRNA and lncRNA, remains a challenge. The two main aspects of ab initio gene prediction are the computed values that describe sequence features and the algorithm used to train the discriminant function; different combinations of the two are employed in various bioinformatic tools. Herein, we briefly review these well-characterized sequence features in eukaryote genomes and their applications to ab initio gene prediction. The main purpose of this article is to provide an overview for beginners who aim to develop related bioinformatic tools.

  9. Diversity of Streptomyces spp. in Eastern Himalayan region – computational RNomics approach to phylogeny

    PubMed Central

    Bhattacharjee, Kaushik; Banerjee, Subhro; Joshi, Santa Ram

    2012-01-01

    Isolation and characterization of actinomycetes from soil samples from altitudinal gradient of North-East India were investigated for computational RNomics based phylogeny. A total of 52 diverse isolates of Streptomyces from the soil samples were isolated on four different media and from these 6 isolates were selected on the basis of cultural characteristics, microscopic and biochemical studies. Sequencing of 16S rDNA of the selected isolates identified them to belong to six different species of Streptomyces. The molecular morphometric and physico-kinetic analysis of 16S rRNA sequences were performed to predict the diversity of the genus. The computational RNomics study revealed the significance of the structural RNA based phylogenetic analysis in a relatively diverse group of Streptomyces. PMID:22829729

  10. High-resolution mapping of bifurcations in nonlinear biochemical circuits

    NASA Astrophysics Data System (ADS)

    Genot, A. J.; Baccouche, A.; Sieskind, R.; Aubert-Kato, N.; Bredeche, N.; Bartolo, J. F.; Taly, V.; Fujii, T.; Rondelez, Y.

    2016-08-01

    Analog molecular circuits can exploit the nonlinear nature of biochemical reaction networks to compute low-precision outputs with fewer resources than digital circuits. This analog computation is similar to that employed by gene-regulation networks. Although digital systems have a tractable link between structure and function, the nonlinear and continuous nature of analog circuits yields an intricate functional landscape, which makes their design counter-intuitive, their characterization laborious and their analysis delicate. Here, using droplet-based microfluidics, we map with high resolution and dimensionality the bifurcation diagrams of two synthetic, out-of-equilibrium and nonlinear programs: a bistable DNA switch and a predator-prey DNA oscillator. The diagrams delineate where function is optimal, dynamics bifurcates and models fail. Inverse problem solving on these large-scale data sets indicates interference from enzymatic coupling. Additionally, data mining exposes the presence of rare, stochastically bursting oscillators near deterministic bifurcations.

  11. Quantum annealing versus classical machine learning applied to a simplified computational biology problem

    PubMed Central

    Li, Richard Y.; Di Felice, Rosa; Rohs, Remo; Lidar, Daniel A.

    2018-01-01

    Transcription factors regulate gene expression, but how these proteins recognize and specifically bind to their DNA targets is still debated. Machine learning models are effective means to reveal interaction mechanisms. Here we studied the ability of a quantum machine learning approach to predict binding specificity. Using simplified datasets of a small number of DNA sequences derived from actual binding affinity experiments, we trained a commercially available quantum annealer to classify and rank transcription factor binding. The results were compared to state-of-the-art classical approaches for the same simplified datasets, including simulated annealing, simulated quantum annealing, multiple linear regression, LASSO, and extreme gradient boosting. Despite technological limitations, we find a slight advantage in classification performance and nearly equal ranking performance using the quantum annealer for these fairly small training data sets. Thus, we propose that quantum annealing might be an effective method to implement machine learning for certain computational biology problems. PMID:29652405

  12. Beyond textbook illustrations: Hand-held models of ordered DNA and protein structures as 3D supplements to enhance student learning of helical biopolymers.

    PubMed

    Jittivadhna, Karnyupha; Ruenwongsa, Pintip; Panijpan, Bhinyo

    2010-11-01

    Textbook illustrations of 3D biopolymers on printed paper, regardless of how detailed and colorful, suffer from their two-dimensionality. For beginners, computer screen display of skeletal models of biopolymers and their animation usually does not provide the at-a-glance 3D perception and details that good hand-held models can. Here, we report a study on how our students learned more from using our ordered DNA and protein models assembled from colored computer-printouts on transparency film sheets that have useful structural details. Our models (reported in BAMBED 2009), having certain distinguished features, helped our students to grasp various aspects of these biopolymers that they usually find difficult. Quantitative and qualitative learning data from this study are reported. Copyright © 2010 International Union of Biochemistry and Molecular Biology, Inc.

  13. Theoretical studies of protein-protein and protein-DNA binding rates

    NASA Astrophysics Data System (ADS)

    Alsallaq, Ramzi A.

    Proteins are folded chains of amino acids. Some of the amino acids (e.g. Lys, Arg, His, Asp, and Glu) carry charges under physiological conditions. Proteins almost always function through binding to other proteins or ligands. For example, barnase is a ribonuclease protein found in the bacterium Bacillus amyloliquefaciens. Barnase degrades RNA by hydrolysis. To inhibit the potentially lethal action of barnase within its own cell, the bacterium co-produces another protein called barstar, which binds quickly and tightly to barnase. The biological function of this binding is to block the active site of barnase. The speeds (rates) at which proteins associate are vital to many biological processes. They span a wide range (from less than 10^3 to 10^8 M^-1 s^-1). Rates greater than ~10^6 M^-1 s^-1 are typically found to be manifestations of enhancements by long-range electrostatic interactions between the associating proteins. A different paradigm appears in the case of protein binding to DNA. The rate in this case is enhanced through an attractive surface potential that effectively reduces the dimensionality of the available search space for the diffusing protein. This thesis presents computational and theoretical models on the rate of association of ligands/proteins to other proteins or DNA. For protein-protein association we present a general strategy for computing protein-protein rates of association. The main achievements of this strategy are the ability to obtain stringent reaction criteria based on the landscape of short-range interactions between the associating proteins, and the ability to compute the effect of the electrostatic interactions on the rates of association accurately using the best known solvers for the Poisson-Boltzmann equation presently available. For protein-DNA association we present a mathematical model for proteins targeting specific sites on a circular DNA topology. The main achievements are the realization that a linear DNA with reflecting ends and a specific site in the middle of the chain is kinetically indistinguishable from its circularized topology, and the ability to predict the effect of the dissociation via the ends of linear DNA on the rate of association, which is to reduce the rate.* *This dissertation is a compound document (contains both a paper copy and a CD as part of the dissertation). The CD requires the following system requirements: QuickTime.

  14. Quantum Mechanical Modeling of Ballistic MOSFETs

    NASA Technical Reports Server (NTRS)

    Svizhenko, Alexei; Anantram, M. P.; Govindan, T. R.; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The objective of this project was to develop theory, approximations, and computer code to model quasi 1D structures such as nanotubes, DNA, and MOSFETs: (1) Nanotubes: Influence of defects on ballistic transport, electro-mechanical properties, and metal-nanotube coupling; (2) DNA: Model electron transfer (biochemistry) and transport experiments, and sequence dependence of conductance; and (3) MOSFETs: 2D doping profiles, polysilicon depletion, source to drain and gate tunneling, understand ballistic limit.

  15. Hydration of nucleic acid fragments: comparison of theory and experiment for high-resolution crystal structures of RNA, DNA, and DNA-drug complexes.

    PubMed Central

    Hummer, G; García, A E; Soumpasis, D M

    1995-01-01

    A computationally efficient method to describe the organization of water around solvated biomolecules is presented. It is based on a statistical mechanical expression for the water-density distribution in terms of particle correlation functions. The method is applied to analyze the hydration of small nucleic acid molecules in the crystal environment, for which high-resolution x-ray crystal structures have been reported. Results for RNA [r(ApU).r(ApU)] and DNA [d(CpG).d(CpG) in Z form and with parallel strand orientation] and for DNA-drug complexes [d(CpG).d(CpG) with the drug proflavine intercalated] are described. A detailed comparison of theoretical and experimental data shows positional agreement for the experimentally observed water sites. The presented method can be used for refinement of the water structure in x-ray crystallography, hydration analysis of nuclear magnetic resonance structures, and theoretical modeling of biological macromolecules such as molecular docking studies. The speed of the computations allows hydration analyses of molecules of almost arbitrary size (tRNA, protein-nucleic acid complexes, etc.) in the crystal environment and in aqueous solution. PMID:7542034

  16. Quantification of DNA cleavage specificity in Hi-C experiments.

    PubMed

    Meluzzi, Dario; Arya, Gaurav

    2016-01-08

    Hi-C experiments produce large numbers of DNA sequence read pairs that are typically analyzed to deduce genomewide interactions between arbitrary loci. A key step in these experiments is the cleavage of cross-linked chromatin with a restriction endonuclease. Although this cleavage should happen specifically at the enzyme's recognition sequence, an unknown proportion of cleavage events may involve other sequences, owing to the enzyme's star activity or to random DNA breakage. A quantitative estimation of these non-specific cleavages may enable simulating realistic Hi-C read pairs for validation of downstream analyses, monitoring the reproducibility of experimental conditions and investigating biophysical properties that correlate with DNA cleavage patterns. Here we describe a computational method for analyzing Hi-C read pairs to estimate the fractions of cleavages at different possible targets. The method relies on expressing an observed local target distribution downstream of aligned reads as a linear combination of known conditional local target distributions. We validated this method using Hi-C read pairs obtained by computer simulation. Application of the method to experimental Hi-C datasets from murine cells revealed interesting similarities and differences in patterns of cleavage across the various experiments considered. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
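
    The linear-combination idea in the abstract above can be sketched in a few lines. This is a deliberately reduced illustration and not the authors' method: it assumes only two reference distributions (hypothetically named `specific` and `nonspecific`), assumes the mixing weights sum to one, and uses made-up numbers over three bins.

```python
def estimate_specific_fraction(observed, specific, nonspecific):
    """Least-squares estimate of f in: observed ~ f*specific + (1-f)*nonspecific.

    All three arguments are equal-length lists of bin probabilities.
    The two-component, sum-to-one setup is an assumption made for this sketch.
    """
    num = sum((o - n) * (s - n) for o, s, n in zip(observed, specific, nonspecific))
    den = sum((s - n) ** 2 for s, n in zip(specific, nonspecific))
    if den == 0:
        raise ValueError("reference distributions are identical")
    # Clamp to [0, 1] so the result can be read as a fraction.
    return min(1.0, max(0.0, num / den))


# Hypothetical reference distributions over three bins (illustrative values only).
specific = [0.7, 0.2, 0.1]
nonspecific = [0.2, 0.3, 0.5]

# Synthetic "observed" distribution built as an 80/20 mixture of the two.
observed = [0.8 * s + 0.2 * n for s, n in zip(specific, nonspecific)]

fraction = estimate_specific_fraction(observed, specific, nonspecific)
print(f"estimated specific-cleavage fraction: {fraction:.3f}")
```

    With more than two candidate targets, the closed-form expression no longer applies and a non-negative least-squares solver (for example `scipy.optimize.nnls`) would be a natural substitute.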

  17. Validation of DNA-based identification software by computation of pedigree likelihood ratios.

    PubMed

    Slooten, K

    2011-08-01

    Disaster victim identification (DVI) can be aided by DNA-evidence, by comparing the DNA-profiles of unidentified individuals with those of surviving relatives. The DNA-evidence is used optimally when such a comparison is done by calculating the appropriate likelihood ratios. Though conceptually simple, the calculations can be quite involved, especially with large pedigrees, precise mutation models, etc. In this article we describe a series of test cases designed to check if software designed to calculate such likelihood ratios computes them correctly. The cases include both simple and more complicated pedigrees, among them inbred ones. We show how to calculate the likelihood ratio numerically and algebraically, including a general mutation model and possibility of allelic dropout. In Appendix A we show how to derive such algebraic expressions mathematically. We have set up these cases to validate new software, called Bonaparte, which performs pedigree likelihood ratio calculations in a DVI context. Bonaparte has been developed by SNN Nijmegen (The Netherlands) for the Netherlands Forensic Institute (NFI). It is available free of charge for non-commercial purposes (see www.dnadvi.nl for details). Commercial licenses can also be obtained. The software uses Bayesian networks and the junction tree algorithm to perform its calculations. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
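
    To make the notion of a pedigree likelihood ratio concrete, the simplest possible case can be worked by hand: one locus, a parent-child hypothesis against an unrelated hypothesis. The sketch below is not Bonaparte's algorithm; it assumes Hardy-Weinberg proportions, no mutation and no allelic dropout, and the allele frequencies and function names are hypothetical.

```python
def genotype_probability(genotype, freqs):
    """Population probability of an unordered genotype under Hardy-Weinberg."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def child_given_parent(child, parent, freqs):
    """P(child genotype | one known parent), no mutation, other parent unknown."""
    c1, c2 = child
    total = 0.0
    for transmitted in parent:  # each parental allele is passed on with prob 1/2
        if c1 == c2:
            if transmitted == c1:
                total += 0.5 * freqs[c1]
        else:
            if transmitted == c1:
                total += 0.5 * freqs[c2]
            if transmitted == c2:
                total += 0.5 * freqs[c1]
    return total


def parent_child_lr(child, parent, freqs):
    """LR = P(child | parent-child relation) / P(child | unrelated)."""
    return child_given_parent(child, parent, freqs) / genotype_probability(child, freqs)


# Hypothetical allele frequencies at a single locus.
freqs = {"A": 0.1, "B": 0.3, "C": 0.6}

lr_match = parent_child_lr(child=("A", "C"), parent=("A", "B"), freqs=freqs)
lr_exclusion = parent_child_lr(child=("A", "C"), parent=("B", "B"), freqs=freqs)
print(lr_match, lr_exclusion)
```

    For a heterozygous child and a heterozygous parent sharing one allele of frequency p, the ratio reduces algebraically to 1/(4p), which gives 2.5 for p = 0.1 here. Without a mutation model, a parent sharing no allele with the child yields a ratio of zero, which is one reason the abstract stresses mutation and dropout models.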

  18. The missing graphical user interface for genomics.

    PubMed

    Schatz, Michael C

    2010-01-01

    The Galaxy package empowers regular users to perform rich DNA sequence analysis through a much-needed and user-friendly graphical web interface. See research article http://genomebiology.com/2010/11/8/R86 RESEARCH HIGHLIGHT: With the advent of affordable and high-throughput DNA sequencing, sequencing is becoming an essential component in nearly every genetics lab. These data are being generated to probe sequence variations, to understand transcribed, regulated or methylated DNA elements, and to explore a host of other biological features across the tree of life and across a range of environments and conditions. Given this deluge of data, novices and experts alike are facing the daunting challenge of trying to analyze the raw sequence data computationally. With so many tools available and so many assays to analyze, how can one be expected to stay current with the state of the art? How can one be expected to learn to use each tool and construct robust end-to-end analysis pipelines, all while ensuring that input formats, command-line options, sequence databases and program libraries are set correctly? Finally, once the analysis is complete, how does one ensure the results are reproducible and transparent for others to scrutinize and study? In an article published in Genome Biology, Jeremy Goecks, Anton Nekrutenko, James Taylor and the rest of the Galaxy Team (Goecks et al. [1]) make a great advance towards resolving these critical questions with the latest update to their Galaxy Project. The ambitious goal of Galaxy is to empower regular users to carry out their own computational analysis without having to be an expert in computational biology or computer science. Galaxy adds a desperately needed graphical user interface to genomics research, making data analysis universally accessible in a web browser, and freeing users from the minutiae of archaic command-line parameters, data formats and scripting languages. Data inputs and computational steps are selected from dynamic graphical menus, and the results are displayed in intuitive plots and summaries that encourage interactive workflows and the exploration of hypotheses. The underlying data analysis tools can be almost any piece of software, written in any language, but all their complexity is neatly hidden inside of Galaxy, allowing users to focus on scientific rather than technical questions.

  19. "First generation" automated DNA sequencing technology.

    PubMed

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  20. Long-range interactions and parallel scalability in molecular simulations

    NASA Astrophysics Data System (ADS)

    Patra, Michael; Hyvönen, Marja T.; Falck, Emma; Sabouri-Ghomi, Mohsen; Vattulainen, Ilpo; Karttunen, Mikko

    2007-01-01

    Typical biomolecular systems such as cellular membranes, DNA, and protein complexes are highly charged. Thus, efficient and accurate treatment of electrostatic interactions is of great importance in computational modeling of such systems. We have employed the GROMACS simulation package to perform extensive benchmarking of different commonly used electrostatic schemes on a range of computer architectures (Pentium-4, IBM Power 4, and Apple/IBM G5) for single processor and parallel performance up to 8 nodes—we have also tested the scalability on four different networks, namely Infiniband, GigaBit Ethernet, Fast Ethernet, and nearly uniform memory architecture, i.e. communication between CPUs is possible by directly reading from or writing to other CPUs' local memory. It turns out that the particle-mesh Ewald method (PME) performs surprisingly well and offers competitive performance unless parallel runs on PC hardware with older network infrastructure are needed. Lipid bilayers of sizes 128, 512 and 2048 lipid molecules were used as the test systems representing typical cases encountered in biomolecular simulations. Our results enable an accurate prediction of computational speed on most current computing systems, both for serial and parallel runs. These results should be helpful in, for example, choosing the most suitable configuration for a small departmental computer cluster.

  1. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    PubMed

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets (files larger than 1 GB) showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  2. Efficient computation of the joint sample frequency spectra for multiple populations.

    PubMed

    Kamm, John A; Terhorst, Jonathan; Song, Yun S

    2017-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
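
    For readers unfamiliar with the statistic, the following sketch shows how an observed joint SFS could be tabulated from aligned 0/1 haplotypes. It only counts what is in the sample; it does not compute the expected SFS under a demographic model, which is what momi addresses, and it does not use momi's API. The function name and the toy data are hypothetical.

```python
from collections import Counter


def joint_sfs(populations):
    """Tabulate the observed joint sample frequency spectrum.

    `populations` is a list of populations, each a list of equal-length
    strings in which '1' marks the mutant (derived) allele at a site.
    Returns a Counter mapping a tuple of per-population mutant counts
    to the number of polymorphic sites showing that configuration.
    """
    length = len(populations[0][0])
    if any(len(h) != length for pop in populations for h in pop):
        raise ValueError("all haplotypes must have the same length")
    total_samples = sum(len(pop) for pop in populations)

    spectrum = Counter()
    for site in range(length):
        counts = tuple(sum(h[site] == "1" for h in pop) for pop in populations)
        # Skip monomorphic sites: mutant allele absent or fixed in the whole sample.
        if 0 < sum(counts) < total_samples:
            spectrum[counts] += 1
    return spectrum


# Hypothetical toy data: two populations, two haplotypes each, four sites.
pop_a = ["0101", "0100"]
pop_b = ["1101", "0001"]

spectrum = joint_sfs([pop_a, pop_b])
for config, n_sites in sorted(spectrum.items()):
    print(config, n_sites)
```

    Storing the spectrum as a sparse `Counter` keyed by count tuples, rather than as a dense array, keeps the sketch usable when the number of populations grows, since most configurations are never observed.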

  3. Efficient computation of the joint sample frequency spectra for multiple populations

    PubMed Central

    Kamm, John A.; Terhorst, Jonathan; Song, Yun S.

    2016-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248

  4. Efficient algorithms for the simulation of non-adiabatic electron transfer in complex molecular systems: application to DNA.

    PubMed

    Kubař, Tomáš; Elstner, Marcus

    2013-04-28

    In this work, a fragment-orbital density functional theory-based method is combined with two different non-adiabatic schemes for the propagation of the electronic degrees of freedom. This allows us to perform unbiased simulations of electron transfer processes in complex media, and the computational scheme is applied to the transfer of a hole in solvated DNA. It turns out that the mean-field approach, where the wave function of the hole is driven into a superposition of adiabatic states, leads to over-delocalization of the hole charge. This problem is avoided using a surface hopping scheme, resulting in a smaller rate of hole transfer. The method is highly efficient due to the on-the-fly computation of the coarse-grained DFT Hamiltonian for the nucleobases, which is coupled to the environment using a QM/MM approach. The computational efficiency and partial parallel character of the methodology make it possible to simulate electron transfer in systems of relevant biochemical size on a nanosecond time scale. Since standard non-polarizable force fields are applied in the molecular-mechanics part of the calculation, a simple scaling scheme was introduced into the electrostatic potential in order to simulate the effect of electronic polarization. It is shown that electronic polarization has an important effect on the features of charge transfer. The methodology is applied to two kinds of DNA sequences, illustrating the features of transfer along a flat energy landscape as well as over an energy barrier. The performance and relative merit of the mean-field scheme and the surface hopping for this application are discussed.

  5. Computational modeling of RNA 3D structures, with the aid of experimental restraints

    PubMed Central

    Magnus, Marcin; Matelska, Dorota; Łach, Grzegorz; Chojnowski, Grzegorz; Boniecki, Michal J; Purta, Elzbieta; Dawson, Wayne; Dunin-Horkawicz, Stanislaw; Bujnicki, Janusz M

    2014-01-01

    In addition to mRNAs whose primary function is transmission of genetic information from DNA to proteins, numerous other classes of RNA molecules exist, which are involved in a variety of functions, such as catalyzing biochemical reactions or performing regulatory roles. In analogy to proteins, the function of RNAs depends on their structure and dynamics, which are largely determined by the ribonucleotide sequence. Experimental determination of high-resolution RNA structures is both laborious and difficult, and therefore, the majority of known RNAs remain structurally uncharacterized. To address this problem, computational structure prediction methods were developed that simulate either the physical process of RNA structure formation (“Greek science” approach) or utilize information derived from known structures of other RNA molecules (“Babylonian science” approach). All computational methods suffer from various limitations that make them generally unreliable for structure prediction of long RNA sequences. However, in many cases, the limitations of computational and experimental methods can be overcome by combining these two complementary approaches with each other. In this work, we review computational approaches for RNA structure prediction, with emphasis on implementations (particular programs) that can utilize restraints derived from experimental analyses. We also list experimental approaches, whose results can be relatively easily used by computational methods. Finally, we describe case studies where computational and experimental analyses were successfully combined to determine RNA structures that would remain out of reach for each of these approaches applied separately. PMID:24785264

  6. In Silico Design and Characterization of DNA Nanomaterials

    NASA Astrophysics Data System (ADS)

    Nash, Jessica A.

    Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) function biologically as carriers of genetic information. However, due to their ability to self-assemble via base pairing, nucleic acid molecules have become widely used in nanotechnology. In this dissertation, in silico techniques are used to probe the structure-property relationships of nucleic acid based nanomaterials. In Part I, computational methods are employed to formulate nanoparticle design rules for applications in nucleic acid packaging and delivery. Nanoparticles (NPs) play increasingly important roles in nanomedicine, where the surface chemistry allows for control over interactions with biomolecules. Understanding how DNA and RNA compaction occurs is relevant to biological systems and systems in nanotechnology, and critical for the development of more efficient and effective nanoparticle carriers. Computational modeling allows for the description of bio-nano systems and processes with unprecedented detail, and can provide insights and guidelines for the creation of new nanomaterials. Using all-atom molecular dynamics simulations, the effects of nanoparticle surface chemistry, size, and solvent ionic strength on interactions with DNA and RNA are reported. In Chapter 2, a systematic study of the effect of nanoparticle charge on ability to bend and wrap short sequences of DNA and RNA is presented. To cause bending of DNA, a nanoparticle charge of at least +30 is required. Higher nanoparticle charges cause a greater degree of compaction. For RNA, however, charged ligand end-groups bind internally and prevent RNA bending. Nanoparticles were designed to test the influence of NP ligand shell shape and length on RNA binding using these results. In Chapter 3, all-atom simulations of NPs with long double stranded RNA are reported. Simulations show that by shortening NP ligand length, double stranded RNA can be wrapped. In Chapter 4, we consider compaction of long DNA by nanoparticles. NPs with +120 charge can fully compact DNA, but the wrapping is unordered on the surface. Chapter 5 reports the influence of NPs on the structure of single stranded DNA and RNA, showing that NPs have a greater influence on poly-pyrimidine strands than poly-purine strands, and can interrupt hydrogen bonds and pi-pi stacking. In Part II of this dissertation, computational techniques are applied to study DNA tiles and origami. Due to base pairing, DNA can be used to place objects with nanoscale precision, with applications in nanoscience and nanomedicine. Chapter 6 presents the development of anticoagulants using DNA weave tiles and aptamers. More effective anticoagulants can be created by varying the DNA aptamer used, and increasing local concentration by attaching aptamers to a DNA tile. Molecular dynamics simulations show that increasing the number of helices on a DNA weave tile increases tile flexibility. Chapter 7 introduces a tool developed for visualization of DNA origami design. We develop circle map visualizations for DNA origami and maps of the base composition, allowing for visualizations of DNA origami that were not previously available. This tool is currently available online via nanohub (open source) for users around the world. The results reported here provide a fundamental understanding of the behavior of DNA systems in nanotechnology. Results are expected to aid in the development of more effective NP compaction agents, DNA delivery vehicles, and DNA origami design.

  7. Systolic array IC for genetic computation

    NASA Technical Reports Server (NTRS)

    Anderson, D.

    1991-01-01

    Measuring similarities between large sequences of genetic information is a formidable task requiring enormous amounts of computer time. Geneticists claim that nearly two months of CRAY-2 time are required to run a single comparison of the known database against the new bases that will be found this year, and more than a CRAY-2 year for next year's genetic discoveries, and so on. The DNA IC, designed at HP-ICBD in cooperation with the California Institute of Technology and the Jet Propulsion Laboratory, is being implemented in order to move the task of genetic comparison onto workstations and personal computers, while vastly improving performance. The chip is a systolic (pumped) array composed of 16 processors, control logic, and global RAM, totaling 400,000 FETs. At 12 MHz, each chip performs 2.7 billion 16-bit operations per second. Using 35 of these chips in series on one PC board (performing nearly 100 billion operations per second), a sequence of 560 bases can be compared against the eventual total genome of 3 billion bases, in minutes--on a personal computer. While the designed purpose of the DNA chip is for genetic research, other disciplines requiring similarity measurements between strings of 7-bit encoded data could make use of this chip as well. Cryptography and speech recognition are two examples. A mix of full custom design and standard cells, in CMOS34, was used to achieve these goals. Innovative test methods were developed to enhance controllability and observability in the array. This paper describes these techniques as well as the chip's functionality. This chip was designed in the 1989-90 timeframe.
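
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above, plus one labeled assumption that is not in the source: that the array consumes one database base per clock cycle. The variable names are illustrative.

```python
# Figures taken from the abstract.
chips = 35
processors_per_chip = 16
ops_per_chip_per_s = 2_700_000_000   # 2.7 billion 16-bit operations per second
clock_hz = 12_000_000                # 12 MHz
genome_bases = 3_000_000_000         # 3 billion bases

# 35 chips x 16 processors matches the 560-base query length.
total_processors = chips * processors_per_chip

# Aggregate board throughput, the "nearly 100 billion" figure.
board_ops_per_s = chips * ops_per_chip_per_s

# ASSUMPTION (not stated in the abstract): one database base is streamed
# through the array per clock cycle, so run time is genome size / clock rate.
stream_seconds = genome_bases / clock_hz
stream_minutes = stream_seconds / 60
```

    Under that streaming assumption the pass takes a little over four minutes, which is at least consistent with the abstract's "in minutes"; the real chip's data rate may differ.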

  8. Cores Of Recurrent Events (CORE) | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    CORE is a statistically supported computational method for finding recurrently targeted regions in massive collections of genomic intervals, such as those arising from DNA copy number analysis of single tumor cells or bulk tumor tissues.

  9. A parallelized binary search tree

    USDA-ARS's Scientific Manuscript database

    PTTRNFNDR is an unsupervised statistical learning algorithm that detects patterns in DNA sequences, protein sequences, or any natural language texts that can be decomposed into letters of a finite alphabet. PTTRNFNDR performs complex mathematical computations and its processing time increases when i...

  10. DNA Microarray-based Ecotoxicological Biomarker Discovery in a Small Fish Model Species

    EPA Science Inventory

    This paper addresses several issues critical to use of zebrafish oligonucleotide microarrays for computational toxicology research on endocrine disrupting chemicals using small fish models, and more generally, the use of microarrays in aquatic toxicology.

  11. Computer Simulation of Mutagenesis.

    ERIC Educational Resources Information Center

    North, J. C.; Dent, M. T.

    1978-01-01

    A FORTRAN program is described which simulates point-substitution mutations in the DNA strands of typical organisms. Its objective is to help students to understand the significance and structure of the genetic code, and the mechanisms and effect of mutagenesis. (Author/BB)

  12. Distribution of recombination hotspots in the human genome--a comparison of computer simulations with real data.

    PubMed

    Mackiewicz, Dorota; de Oliveira, Paulo Murilo Castro; Moss de Oliveira, Suzana; Cebrat, Stanisław

    2013-01-01

    Recombination is the main cause of genetic diversity. Thus, errors in this process can lead to chromosomal abnormalities. Recombination events are confined to narrow chromosome regions called hotspots in which characteristic DNA motifs are found. Genomic analyses have shown that both recombination hotspots and DNA motifs are distributed unevenly along human chromosomes and are much more frequent in the subtelomeric regions of chromosomes than in their central parts. Clusters of motifs roughly follow the distribution of recombination hotspots whereas single motifs show a negative correlation with the hotspot distribution. To model the phenomena related to recombination, we carried out computer Monte Carlo simulations of genome evolution. Computer simulations generated uneven distribution of hotspots with their domination in the subtelomeric regions of chromosomes. They also revealed that purifying selection eliminating defective alleles is strong enough to cause such hotspot distribution. After sufficiently long time of simulations, the structure of chromosomes reached a dynamic equilibrium, in which number and global distribution of both hotspots and defective alleles remained statistically unchanged, while their precise positions were shifted. This resembles the dynamic structure of human and chimpanzee genomes, where hotspots change their exact locations but the global distributions of recombination events are very similar.

  13. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

    PubMed

    Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Programming chemistry in DNA-addressable bioreactors

    PubMed Central

    Fellermann, Harold; Cardelli, Luca

    2014-01-01

    We present a formal calculus, termed the chemtainer calculus, able to capture the complexity of compartmentalized reaction systems such as populations of possibly nested vesicular compartments. Compartments contain molecular cargo as well as surface markers in the form of DNA single strands. These markers serve as compartment addresses and allow for their targeted transport and fusion, thereby enabling reactions of previously separated chemicals. The overall system organization allows for the set-up of programmable chemistry in microfluidic or other automated environments. We introduce a simple sequential programming language whose instructions are motivated by state-of-the-art microfluidic technology. Our approach integrates electronic control, chemical computing and material production in a unified formal framework that is able to mimic the integrated computational and constructive capabilities of the subcellular matrix. We provide a non-deterministic semantics of our programming language that enables us to analytically derive the computational and constructive power of our machinery. This semantics is used to derive the sets of all constructable chemicals and supermolecular structures that emerge from different underlying instruction sets. Because our proofs are constructive, they can be used to automatically infer control programs for the construction of target structures from a limited set of resource molecules. Finally, we present an example of our framework from the area of oligosaccharide synthesis. PMID:25121647

  15. Distribution of Recombination Hotspots in the Human Genome – A Comparison of Computer Simulations with Real Data

    PubMed Central

    Mackiewicz, Dorota; de Oliveira, Paulo Murilo Castro; Moss de Oliveira, Suzana; Cebrat, Stanisław

    2013-01-01

    Recombination is the main cause of genetic diversity. Thus, errors in this process can lead to chromosomal abnormalities. Recombination events are confined to narrow chromosome regions called hotspots in which characteristic DNA motifs are found. Genomic analyses have shown that both recombination hotspots and DNA motifs are distributed unevenly along human chromosomes and are much more frequent in the subtelomeric regions of chromosomes than in their central parts. Clusters of motifs roughly follow the distribution of recombination hotspots whereas single motifs show a negative correlation with the hotspot distribution. To model the phenomena related to recombination, we carried out computer Monte Carlo simulations of genome evolution. Computer simulations generated uneven distribution of hotspots with their domination in the subtelomeric regions of chromosomes. They also revealed that purifying selection eliminating defective alleles is strong enough to cause such hotspot distribution. After sufficiently long time of simulations, the structure of chromosomes reached a dynamic equilibrium, in which number and global distribution of both hotspots and defective alleles remained statistically unchanged, while their precise positions were shifted. This resembles the dynamic structure of human and chimpanzee genomes, where hotspots change their exact locations but the global distributions of recombination events are very similar. PMID:23776462

  16. Toehold-Mediated Displacement of an Adenosine-Binding Aptamer from a DNA Duplex by its Ligand.

    PubMed

    Monserud, Jon H; Macri, Katherine M; Schwartz, Daniel K

    2016-10-24

    DNA is increasingly used to engineer dynamic nanoscale circuits, structures, and motors, many of which rely on DNA strand-displacement reactions. The use of functional DNA sequences (e.g., aptamers, which bind to a wide range of ligands) in these reactions would potentially confer responsiveness on such devices, and integrate DNA computation with highly varied molecular stimuli. By using high-throughput single-molecule FRET methods, we compared the kinetics of a putative aptamer-ligand and aptamer-complement strand-displacement reaction. We found that the ligands actively disrupted the DNA duplex in the presence of a DNA toehold in a similar manner to complementary DNA, with kinetic details specific to the aptamer structure, thus suggesting that the DNA strand-displacement concept can be extended to functional DNA-ligand systems. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Parallel In Vivo DNA Assembly by Recombination: Experimental Demonstration and Theoretical Approaches

    PubMed Central

    Shi, Zhenyu; Wedd, Anthony G.; Gras, Sally L.

    2013-01-01

    The development of synthetic biology requires rapid batch construction of large gene networks from combinations of smaller units. Despite the availability of computational predictions for well-characterized enzymes, the optimization of most synthetic biology projects requires combinational constructions and tests. A new building-brick-style parallel DNA assembly framework for simple and flexible batch construction is presented here. It is based on robust recombination steps and allows a variety of DNA assembly techniques to be organized for complex constructions (with or without scars). The assembly of five DNA fragments into a host genome was performed as an experimental demonstration. PMID:23468883

  18. A multi-spectroscopic and molecular docking approach to investigate the interaction of antiviral drug oseltamivir with ct-DNA.

    PubMed

    Moghadam, Neda Hosseinpour; Salehzadeh, Sadegh; Shahabadi, Nahid; Golbedaghi, Reza

    2017-07-03

    The possible interaction between the antiviral drug oseltamivir and calf thymus DNA at physiological pH was studied by spectrophotometry, competitive spectrofluorimetry, differential pulse voltammogram (DPV), circular dichroism spectroscopy (CD), viscosity measurements, salt effect, and computational studies. Intercalation of oseltamivir between the base pairs of DNA was shown by a sharp increase in specific viscosity of DNA and a decrease of the peak current and a positive shift in differential pulse voltammogram. Competitive fluorescence experiments were performed using neutral red (NR) as a probe for the intercalation binding mode. The studies showed that oseltamivir is able to release the NR.

  19. Ni-DNA-based nanowires and nanodevices

    NASA Astrophysics Data System (ADS)

    Chang, Chia-Ching; Yuan, Chiun-Jye; Jian, Wen-Bin; Chen, Yu-Chang; di Ventra, Massimiliano

    DNA is a highly versatile biopolymer that has been a recent focus in the field of nanomachines and nanoelectronics. DNA exhibits high stability, adjustable conductance, self-organizing capability, programmability and vast information storage. It is an ideal material in the applications of nanodevices, nanoelectronics, and molecular computing. Low conductance of native DNA renders applications difficult. However, doping with nickel ions tunes the DNA into a conducting polymer. Further studies showed that nickel ions containing DNA (Ni-DNA) nanowires exhibit characteristics of memristor and memcapacitor making them a potential mass information storage system. In summary, Ni-DNA has promising applications in a variety of fields, including nanoelectronics, biosensors and memcomputing. This study was supported in part by the Ministry of Science and Technology (MOST), Taiwan (ROC) MOST 103-2112-M-009-011 -MY3, and MOST 105-2627-M-009-006.

  20. In silico studies to explore the mutagenic ability of 5-halo/oxy/li-oxy-uracil bases with guanine of DNA base pairs.

    PubMed

    Jana, Kalyanashis; Ganguly, Bishwajit

    2014-10-16

    DNA nucleobases are reactive in nature and undergo modifications by deamination, oxidation, alkylation, or hydrolysis processes. Many such modified bases are susceptible to mutagenesis when formed in cellular DNA. The mutagenesis can occur by mispairing with DNA nucleobases by a DNA polymerase during replication. We have performed a computational study of the mispairing of DNA bases with unnatural bases. 5-Halo uracils have been studied as mispairs in mutagenesis; however, the reports on their different forms are scarce in the literature. The stability of mispairs with the keto form, enol form, and ionized form of 5-halo-uracil has been computed at the M06-2X/6-31+G** level of theory. The enol form of 5-halo-uracil showed remarkable stability toward DNA mispair compared to the corresponding keto and ionized forms. The (F)U-G mispair showed the highest stability in the series, and the interaction energies of the (Halo)U(enol/ionized)-G mispairs are more favorable than that of the natural G-C base pair of DNA. To enhance the stability of DNA mispairs, we have introduced the hydroxyl group in the place of halogen atoms, which provides additional hydrogen-bonding interactions in the system while forming the 5-membered ring. The study has been further extended with lithiated 5-hydroxymethyl-uracil to stabilize the DNA mispair. The (CH2OLi)U(ionized)-G mispair has shown the highest stability (ΔG = -32.4 kcal/mol) with multiple O-Li interactions. AIM (atoms in molecules) and EDA (energy decomposition analysis) analyses have been performed to examine the nature of noncovalent interactions in such mispairs. EDA has shown that electrostatic energy mainly contributes toward the interaction energy of mispairs. The higher stability achieved in these studied mispairs can play a pivotal role in the mutagenesis and can help to attain the mutation for many desired biological processes.

  1. A dictionary based informational genome analysis

    PubMed Central

    2012-01-01

    Background: In the post-genomic era, several methods of computational genomics are emerging to understand how the whole information is structured within genomes. The literature of the last five years accounts for several alignment-free methods, which have arisen as alternative metrics for dissimilarity of biological sequences. Among others, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results: Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying the methodology outlined here carried out some computations on genomic data. We computed informational indexes and built the genomic dictionaries with different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, and index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of keen interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions: We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many investigation lines, namely exported in other contexts of computational genomics, as a basis for discrimination of genomic pathologies. PMID:22985068
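
    As an illustration of the dictionary idea in this abstract, the following minimal sketch builds a k-mer dictionary for a sequence and computes one possible informational index from it. The abstract does not define its indexes, so the use of Shannon entropy of the empirical k-mer distribution is an assumption, as are the function names.

```python
import math
from collections import Counter

def kmer_dictionary(sequence, k):
    """Count every length-k word (k-mer) occurring in `sequence`."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def empirical_entropy(counts):
    """Shannon entropy (bits) of the empirical k-mer frequency distribution.

    ASSUMPTION: entropy stands in for the paper's unspecified indexes.
    """
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

genome = "ACGTACGTAC"            # toy sequence, not real genomic data
d2 = kmer_dictionary(genome, 2)  # dictionary of 2-mers with multiplicities
h2 = empirical_entropy(d2)
```

    Repeating the computation for several values of k gives dictionaries of different sizes together with their frequency distributions, which is the kind of data the abstract describes storing and analyzing.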

  2. Interpretation of sucrose gradient sedimentation pattern of deoxyribonucleic acid fragments resulting from random breaks.

    PubMed

    Litwin, S; Shahn, E; Kozinski, A W

    1969-07-01

    Mass distribution in a sucrose gradient of deoxyribonucleic acid (DNA) fragments arising as a result of random breaks is predicted by analytical means from which computer evaluations are plotted. The analytical results are compared with the results of verifying experiments: (i) a Monte Carlo computer experiment in which simulated molecules of DNA were individuals of unit length subjected to random "breaks" applied by a random number generator, and (ii) an in vitro experiment in which molecules of T4 DNA, highly labeled with (32)P, were stored in liquid nitrogen for variable periods of time during which a precisely known number of (32)P atoms decayed, causing single-stranded breaks. The distribution of sizes of the resulting fragments was measured in an alkaline sucrose gradient. The profiles obtained in this fashion were compared with the mathematical predictions. Both experiments agree with the analytical approach and thus permit the use of the graphs obtained from the latter as a means of determining the average number of random breaks in DNA from distributions obtained experimentally in a sucrose gradient. An example of the application of this procedure to a previously unresolved problem is provided in the case of DNA from ultraviolet-irradiated phage which undergoes a dose-dependent intracellular breakdown. The relationship between the number of lethal hits and the number of single-stranded breaks was not previously established. A comparison of the calculated number of nicks per strand of DNA with the known dose in phage-lethal hits reveals a relationship closely approximating one lethal hit to one single-stranded break.
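
    The Monte Carlo experiment described in this abstract can be sketched in a few lines. The version below is a hedged reconstruction, not the authors' program: it assumes that breaks fall as a Poisson process along each unit-length molecule, and the function names and parameters are illustrative.

```python
import random

def break_molecule(mean_breaks, rng):
    """Return the fragment lengths of one unit-length molecule.

    ASSUMPTION: breaks form a Poisson process along the molecule, so the
    gaps between successive breaks are exponential with rate `mean_breaks`
    (which must be positive).
    """
    fragments, start = [], 0.0
    while True:
        gap = rng.expovariate(mean_breaks)
        if start + gap >= 1.0:
            fragments.append(1.0 - start)  # last piece runs to the end
            return fragments
        fragments.append(gap)
        start += gap

def simulate(num_molecules, mean_breaks, seed=0):
    """Break many molecules independently with a seeded generator."""
    rng = random.Random(seed)
    return [break_molecule(mean_breaks, rng) for _ in range(num_molecules)]

molecules = simulate(num_molecules=2000, mean_breaks=3.0)
avg_fragments = sum(len(m) for m in molecules) / len(molecules)
```

    Each molecule with b breaks yields b + 1 fragments, so the average fragment count estimates the mean number of breaks plus one. Weighting each fragment by its length gives a mass distribution that could be compared with a sedimentation profile.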

  3. Biomolecular Assembly of Gold Nanocrystals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Micheel, Christine Marya

    2005-05-20

    Over the past ten years, methods have been developed to construct discrete nanostructures using nanocrystals and biomolecules. While these frequently consist of gold nanocrystals and DNA, semiconductor nanocrystals as well as antibodies and enzymes have also been used. One example of discrete nanostructures is dimers of gold nanocrystals linked together with complementary DNA. This type of nanostructure is also known as a nanocrystal molecule. Discrete nanostructures of this kind have a number of potential applications, from highly parallel self-assembly of electronics components and rapid read-out of DNA computations to biological imaging and a variety of bioassays. My research focused on three main areas. The first area, the refinement of electrophoresis as a purification and characterization method, included application of agarose gel electrophoresis to the purification of discrete gold nanocrystal/DNA conjugates and nanocrystal molecules, as well as development of a more detailed understanding of the hydrodynamic behavior of these materials in gels. The second area, the development of methods for quantitative analysis of transmission electron microscope data, used computer programs written to find pair correlations as well as higher order correlations. With these programs, it is possible to reliably locate and measure nanocrystal molecules in TEM images. The final area of research explored the use of DNA ligase in the formation of nanocrystal molecules. Synthesis of dimers of gold particles linked with a single strand of DNA, made possible through the use of DNA ligase, opens the possibility for amplification of nanostructures in a manner similar to polymerase chain reaction. These three areas are discussed in the context of the work in the Alivisatos group, as well as the field as a whole.

  4. Electronic couplings and on-site energies for hole transfer in DNA: Systematic quantum mechanical/molecular dynamic study

    NASA Astrophysics Data System (ADS)

    Voityuk, Alexander A.

    2008-03-01

    The electron hole transfer (HT) properties of DNA are substantially affected by thermal fluctuations of the π stack structure. Depending on the mutual position of neighboring nucleobases, electronic coupling V may change by several orders of magnitude. In the present paper, we report the results of systematic QM/molecular dynamic (MD) calculations of the electronic couplings and on-site energies for the hole transfer. Based on 15 ns MD trajectories for several DNA oligomers, we calculate the average coupling squares ⟨V²⟩ and the energies of basepair triplets XG⁺Y and XA⁺Y, where X, Y = G, A, T, and C. For each of the 32 systems, 15 000 conformations separated by 1 ps are considered. The three-state generalized Mulliken-Hush method is used to derive electronic couplings for HT between neighboring basepairs. The adiabatic energies and dipole moment matrix elements are computed within the INDO/S method. We compare the rms values of V with the couplings estimated for the idealized B-DNA structure and show that in several important cases the couplings calculated for the idealized B-DNA structure are considerably underestimated. The rms values for intrastrand couplings G-G, A-A, G-A, and A-G are found to be similar, ~0.07 eV, while the interstrand couplings are quite different. The energies of hole states G⁺ and A⁺ in the stack depend on the nature of the neighboring pairs. The XG⁺Y states are more stable than XA⁺Y by 0.5 eV. The thermal fluctuations of the DNA structure facilitate the HT process from guanine to adenine. The tabulated couplings and on-site energies can be used as reference parameters in theoretical and computational studies of HT processes in DNA.

  5. A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.

    PubMed

    Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei

    2018-02-08

    DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria are used to evaluate the prediction results of our method: area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.
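
    Four of the evaluation criteria named in this abstract follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; it is not the authors' code, and the counts in the example are made up for illustration. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and MCC from confusion counts.

    Assumes at least one positive and one negative example, so that the
    sensitivity and specificity denominators are nonzero.
    """
    acc = (tp + tn) / (tp + fp + tn + fn)
    sn = tp / (tp + fn)   # sensitivity: fraction of true sites recovered
    sp = tn / (tn + fp)   # specificity: fraction of non-sites rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}

# Hypothetical counts for 50 methylated and 50 unmethylated sites.
m = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when methylated and unmethylated sites are unevenly represented.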

  6. An accurate algorithm for the detection of DNA fragments from dilution pool sequencing experiments.

    PubMed

    Bansal, Vikas

    2018-01-01

    The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention. We formulate the problem of detecting DNA fragments from dilution pool sequencing experiments as a genome segmentation problem and develop an algorithm that uses dynamic programming to optimize a likelihood function derived from a generative model for the sequence reads. This algorithm uses an iterative approach to automatically infer the mean background read depth and the number of fragments in each pool. Using simulated data, we demonstrate that our method, FragmentCut, has 25-30% greater sensitivity compared with an HMM-based method for fragment detection and can also detect overlapping fragments. On a whole-genome human fosmid pool dataset, the haplotypes assembled using the fragments identified by FragmentCut had greater N50 length, 16.2% lower switch error rate and 35.8% lower mismatch error rate compared with two existing methods. We further demonstrate the greater accuracy of our method using two additional dilution pool datasets. FragmentCut is available from https://bansal-lab.github.io/software/FragmentCut. Contact: vibansal@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
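
    The abstract above compares assemblies by N50 length. As a reminder of what that statistic measures, here is a minimal sketch of the generic N50 definition applied to a list of block lengths; it is not taken from FragmentCut, and haplotype-assembly papers sometimes use variants (for example, spans in base pairs versus variant counts), so the exact definition used by the author is an assumption here.

```python
def n50(lengths):
    """Largest length L such that blocks of length >= L
    together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input

# Illustrative block lengths (arbitrary units).
example_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

    A higher N50 means that more of the assembled sequence sits in long blocks, which is the sense in which the abstract reports an improvement.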

  7. A Theoretical and Experimental Study of DNA Self-assembly

    NASA Astrophysics Data System (ADS)

    Chandran, Harish

    The control of matter and phenomena at the nanoscale is fast becoming one of the most important challenges of the 21st century with wide-ranging applications from energy and health care to computing and material science. Conventional top-down approaches to nanotechnology, having served us well for long, are reaching their inherent limitations. Meanwhile, bottom-up methods such as self-assembly are emerging as viable alternatives for nanoscale fabrication and manipulation. A particularly successful bottom up technique is DNA self-assembly where a set of carefully designed DNA strands form a nanoscale object as a consequence of specific, local interactions among the different components, without external direction. The final product of the self-assembly process might be a static nanostructure or a dynamic nanodevice that performs a specific function. Over the past two decades, DNA self-assembly has produced stunning nanoscale objects such as 2D and 3D lattices, polyhedra and addressable arbitrary shaped substrates, and a myriad of nanoscale devices such as molecular tweezers, computational circuits, biosensors and molecular assembly lines. In this dissertation we study multiple problems in the theory, simulations and experiments of DNA self-assembly. We extend the Turing-universal mathematical framework of self-assembly known as the Tile Assembly Model by incorporating randomization during the assembly process. This allows us to reduce the tile complexity of linear assemblies. We develop multiple techniques to build linear assemblies of expected length N using far fewer tile types than previously possible. We abstract the fundamental properties of DNA and develop a biochemical system, which we call meta-DNA, based entirely on strands of DNA as the only component molecule. We further develop various enzyme-free protocols to manipulate meta-DNA systems and provide strand level details along with abstract notations for these mechanisms. 
We simulate DNA circuits by providing detailed designs for local molecular computations that involve spatially contiguous molecules arranged on addressable substrates via enzyme-free DNA hybridization reaction cascades. We use the Visual DSD simulation software in conjunction with localized reaction rates obtained from biophysical modeling to create chemical reaction networks of localized hybridization circuits that are then model checked using the PRISM model checking software. We develop a DNA detection system employing the triggered self-assembly of a novel DNA dendritic nanostructure. Detection begins when a specific, single-stranded target DNA strand triggers a hybridization chain reaction between two distinct DNA hairpins. Each hairpin opens and hybridizes up to two copies of the other, and hence each layer of the growing dendritic nanostructure can in principle accommodate an exponentially increasing number of cognate molecules, generating a nanostructure with high molecular weight. We build linear activatable assemblies employing a novel protection/deprotection strategy to strictly enforce the direction of tiling assembly growth to ensure the robustness of the assembly process. Our system consists of two tiles that can form a linear co-polymer. These tiles, which are initially protected such that they do not react with each other, can be activated to form linear co-polymers via the use of a strand displacing enzyme.
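
    The dendritic detection scheme described above says each opened hairpin can hybridize up to two copies of the other hairpin, so the capacity of each layer can grow exponentially. A minimal sketch of that idealized upper bound follows; it assumes a single hairpin is opened by the target in the first layer and ignores kinetics, steric limits and incomplete binding, none of which are quantified in the abstract.

```python
def max_hairpins_per_layer(layers, branching=2):
    """Idealized upper bound on hairpins in each layer of the dendrimer.

    Layer 0 is the single hairpin opened by the target strand (an assumption);
    every hairpin in a layer can recruit up to `branching` hairpins in the next.
    """
    return [branching ** k for k in range(layers)]

capacity = max_hairpins_per_layer(5)
total_capacity = sum(capacity)
```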

  8. Logic Gate Operation by DNA Translocation through Biological Nanopores.

    PubMed

    Yasuga, Hiroki; Kawano, Ryuji; Takinoue, Masahiro; Tsuji, Yutaro; Osaki, Toshihisa; Kamiya, Koki; Miki, Norihisa; Takeuchi, Shoji

    2016-01-01

    Logical operations using biological molecules, such as DNA computing or programmable diagnosis using DNA, have recently received attention. Challenges remain with respect to the development of such systems, including label-free output detection and the rapidity of operation. Here, we propose integration of biological nanopores with DNA molecules for development of a logical operating system. We configured outputs "1" and "0" as single-stranded DNA (ssDNA) that is or is not translocated through a nanopore; unlabeled DNA was detected electrically. A negative-AND (NAND) operation was successfully conducted within approximately 10 min, which is rapid compared with previous studies using unlabeled DNA. In addition, this operation was executed in a four-droplet network. DNA molecules and associated information were transferred among droplets via biological nanopores. This system would facilitate linking of molecules and electronic interfaces. Thus, it could be applied to molecular robotics, genetic engineering, and even medical diagnosis and treatment.
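
    For reference, the NAND operation reported above has the truth table sketched below. The mapping of DNA inputs to bits is not given in the abstract, so the code only models the Boolean function; the derived gates illustrate the well-known fact that NAND alone is sufficient to build other logic gates.

```python
def nand(a, b):
    """Boolean NAND on 0/1 inputs: output is 0 only when both inputs are 1."""
    return int(not (a and b))

# Full truth table as (input_a, input_b, output) rows.
truth_table = [(a, b, nand(a, b)) for a in (0, 1) for b in (0, 1)]

# Other gates composed purely from NAND.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))
```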

  9. Logic Gate Operation by DNA Translocation through Biological Nanopores

    PubMed Central

    Yasuga, Hiroki; Kawano, Ryuji; Takinoue, Masahiro; Tsuji, Yutaro; Osaki, Toshihisa; Kamiya, Koki; Miki, Norihisa; Takeuchi, Shoji

    2016-01-01

    Logical operations using biological molecules, such as DNA computing or programmable diagnosis using DNA, have recently received attention. Challenges remain with respect to the development of such systems, including label-free output detection and the rapidity of operation. Here, we propose integration of biological nanopores with DNA molecules for development of a logical operating system. We configured outputs “1” and “0” as single-stranded DNA (ssDNA) that is or is not translocated through a nanopore; unlabeled DNA was detected electrically. A negative-AND (NAND) operation was successfully conducted within approximately 10 min, which is rapid compared with previous studies using unlabeled DNA. In addition, this operation was executed in a four-droplet network. DNA molecules and associated information were transferred among droplets via biological nanopores. This system would facilitate linking of molecules and electronic interfaces. Thus, it could be applied to molecular robotics, genetic engineering, and even medical diagnosis and treatment. PMID:26890568

  10. Barcode extension for analysis and reconstruction of structures

    NASA Astrophysics Data System (ADS)

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng

    2017-03-01

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.

  11. Barcode extension for analysis and reconstruction of structures.

    PubMed

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

    2017-03-13

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.

  12. Barcode extension for analysis and reconstruction of structures

    PubMed Central

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

    2017-01-01

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. PMID:28287117

  13. Computationally expanding infinium HumanMethylation450 BeadChip array data to reveal distinct DNA methylation patterns of rheumatoid arthritis

    PubMed Central

    Li, Chengzhe; Ai, Rizi; Wang, Mengchi; Firestein, Gary S.; Wang, Wei

    2016-01-01

    Motivation: DNA methylation signatures in rheumatoid arthritis (RA) have been identified in fibroblast-like synoviocytes (FLS) with the Illumina HumanMethylation450 array. Since <2% of CpG sites are covered by the Illumina 450K array and whole genome bisulfite sequencing is still too expensive for many samples, computationally predicting DNA methylation levels based on 450K data would be valuable to discover more RA-related genes. Results: We developed a computational model that is trained on 14 tissues with both whole genome bisulfite sequencing and 450K array data. This model integrates information derived from the similarity of local methylation pattern between tissues, the methylation information of flanking CpG sites and the methylation tendency of flanking DNA sequences. The predicted and measured methylation values were highly correlated with a Pearson correlation coefficient of 0.9 in leave-one-tissue-out cross-validations. Importantly, the majority (76%) of the top 10% differentially methylated loci among the 14 tissues was correctly detected using the predicted methylation values. Applying this model to 450K data of RA, osteoarthritis and normal FLS, we successfully expanded the coverage of CpG sites 18.5-fold, so that it accounts for about 30% of all the CpGs in the human genome. By integrative omics study, we identified genes and pathways tightly related to RA pathogenesis, among which 12 genes were supported by triple evidence, including 6 genes already known to perform specific roles in RA and 6 genes as new potential therapeutic targets. Availability and implementation: The source code, required data for prediction, and demo data for test are freely available at: http://wanglab.ucsd.edu/star/LR450K/. Contact: wei-wang@ucsd.edu or gfirestein@ucsd.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26883487
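
    The agreement between predicted and measured methylation values above is summarized by a Pearson correlation coefficient. A minimal sketch of that statistic is given below; the function name and the sample values are illustrative assumptions and are unrelated to the authors' data or code.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ValueError for mismatched or too-short inputs; a constant
    input (zero variance) raises ZeroDivisionError.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical predicted vs. measured methylation fractions.
r = pearson_r([0.10, 0.50, 0.90], [0.20, 0.60, 1.00])
```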

  14. Automated property optimization via ab initio O(N) elongation method: Application to (hyper-)polarizability in DNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Orimoto, Yuuichi, E-mail: orimoto.yuuichi.888@m.kyushu-u.ac.jp; Aoki, Yuriko; Japan Science and Technology Agency, CREST, 4-1-8 Hon-chou, Kawaguchi, Saitama 332-0012

    An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between “choose-maximum” (choose a base pair giving the maximum β for each step) and “choose-minimum” (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account.

  15. Automated property optimization via ab initio O(N) elongation method: Application to (hyper-)polarizability in DNA.

    PubMed

    Orimoto, Yuuichi; Aoki, Yuriko

    2016-07-14

    An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between "choose-maximum" (choose a base pair giving the maximum β for each step) and "choose-minimum" (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account.
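
    The "choose-maximum" and "choose-minimum" conditions described above amount to a greedy, step-by-step selection over the four base pairs. The sketch below shows only that control flow. The scoring function `toy_score` is a hypothetical stand-in with no physical meaning; in the paper the evaluation function comes from ELG-FF quantum-chemical calculations, which are not reproduced here.

```python
BASE_PAIRS = ("A", "T", "G", "C")

def greedy_sequence(score, steps, mode="max"):
    """Grow a sequence one base pair per step, keeping the candidate
    whose score is largest (mode="max") or smallest (mode="min")."""
    pick = max if mode == "max" else min
    sequence = ""
    for _ in range(steps):
        best = pick(BASE_PAIRS, key=lambda bp: score(sequence + bp))
        sequence += best
    return sequence

def toy_score(seq):
    """Hypothetical evaluation function, NOT a (hyper-)polarizability:
    G/C count double, and each change of neighbouring letter adds 0.5."""
    composition = sum(2 if b in "GC" else 1 for b in seq)
    alternation = 0.5 * sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return composition + alternation

chosen_max = greedy_sequence(toy_score, 4, mode="max")
chosen_min = greedy_sequence(toy_score, 3, mode="min")
```

    Because each step keeps only the locally best base pair, a procedure of this shape evaluates four candidates per step rather than every possible sequence, at the cost of not guaranteeing a global optimum.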

  16. Models of the solvent-accessible surface of biopolymers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, R.E.

    1996-09-01

    Many biopolymers such as proteins, DNA, and RNA have been studied because they have important biomedical roles and may be good targets for therapeutic action in treating diseases. This report describes how plastic models of the solvent-accessible surface of biopolymers were made. Computer files containing sets of triangles were calculated, then used on a stereolithography machine to make the models. Small (2 in.) models were made to test whether the computer calculations were done correctly. Also, files of the type (.stl) required by any ISO 9001 rapid prototyping machine were written onto a CD-ROM for distribution to American companies.
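
    The report above describes surface models stored as sets of triangles in .stl files. As an illustration of what such a file contains, the sketch below serializes triangles in the ASCII variant of the STL format. Whether the report used ASCII or binary STL is not stated, and the function names and the single example triangle are assumptions made for this example.

```python
import math

def unit_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product (b-a) x (c-a)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    return [x / length for x in n] if length else [0.0, 0.0, 0.0]

def to_ascii_stl(triangles, name="surface"):
    """Serialize an iterable of (v0, v1, v2) vertex triples as ASCII STL text."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        nx, ny, nz = unit_normal(a, b, c)
        lines.append(f"  facet normal {nx:e} {ny:e} {nz:e}")
        lines.append("    outer loop")
        for x, y, z in (a, b, c):
            lines.append(f"      vertex {x:e} {y:e} {z:e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"

# One triangle in the xy-plane; its normal points along +z.
stl_text = to_ascii_stl([((0, 0, 0), (1, 0, 0), (0, 1, 0))])
```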

  17. The Human Genome Project: Information access, management, and regulation. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McInerney, J.D.; Micikas, L.B.

    The Human Genome Project is a large, internationally coordinated effort in biological research directed at creating a detailed map of human DNA. This report describes information access, management, and regulation for the project. The project led to the development of an instructional module titled The Human Genome Project: Biology, Computers, and Privacy, designed for use in high school biology classes. The module consists of print materials and both Macintosh and Windows versions of related computer software. Appendix A contains a copy of the print materials and discs containing the two versions of the software.

  18. Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants.

    PubMed

    Tanabe, Akifumi S; Toju, Hirokazu

    2013-01-01

    Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used "1-nearest-neighbor" (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
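
    The 1-nearest-neighbor (1-NN) assignment discussed above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it scores pre-aligned, equal-length sequences by the fraction of identical positions instead of running BLAST, and the taxon labels and sequences are made up. The QCauto method is not shown because the abstract does not describe its details.

```python
def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def one_nn(query, references):
    """Assign the taxon of the most similar reference sequence.

    `references` maps a taxon label to its reference sequence.
    Returns (taxon, identity of the best hit).
    """
    best = max(references, key=lambda taxon: identity(query, references[taxon]))
    return best, identity(query, references[best])

# Hypothetical reference database.
refs = {"taxon_A": "ACGTACGT", "taxon_B": "ACGTTTTT"}
close_hit = one_nn("ACGTACGA", refs)   # query resembles taxon_A
forced_hit = one_nn("TTTTTTTT", refs)  # query resembles neither reference well
```

    The second call shows the weakness the abstract reports: 1-NN always returns some label, even when the best hit is poor, which is how misidentifications arise when the query's species is missing from the database.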

  19. Two New Computational Methods for Universal DNA Barcoding: A Benchmark Using Barcode Sequences of Bacteria, Archaea, Animals, Fungi, and Land Plants

    PubMed Central

    Tanabe, Akifumi S.; Toju, Hirokazu

    2013-01-01

    Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. 
Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research. PMID:24204702

  20. DNA Origami-Graphene Hybrid Nanopore for DNA Detection.

    PubMed

    Barati Farimani, Amir; Dibaeinia, Payam; Aluru, Narayana R

    2017-01-11

    DNA origami nanostructures can be used to functionalize solid-state nanopores for single molecule studies. In this study, we characterized a nanopore in a DNA origami-graphene heterostructure for DNA detection. The DNA origami nanopore is functionalized with a specific nucleotide type at the edge of the pore. Using extensive molecular dynamics (MD) simulations, we computed and analyzed the ionic conductivity of nanopores in heterostructures carpeted with one or two layers of DNA origami on graphene. We demonstrate that a nanopore in DNA origami-graphene gives rise to distinguishable dwell times for the four DNA base types, whereas for a nanopore in bare graphene, the dwell time is almost the same for all types of bases. The specific interactions (hydrogen bonds) between DNA origami and the translocating DNA strand yield different residence times and ionic currents. We also conclude that the speed of DNA translocation decreases due to the friction between the dangling bases at the pore mouth and the sequencing DNA strands.

  1. Understanding the structural and dynamic consequences of DNA epigenetic modifications: Computational insights into cytosine methylation and hydroxymethylation

    PubMed Central

    Carvalho, Alexandra T P; Gouveia, Leonor; Kanna, Charan Raju; Wärmländer, Sebastian K T S; Platts, Jamie A; Kamerlin, Shina Caroline Lynn

    2014-01-01

    We report a series of molecular dynamics (MD) simulations of up to a microsecond combined simulation time designed to probe epigenetically modified DNA sequences. More specifically, by monitoring the effects of methylation and hydroxymethylation of cytosine in different DNA sequences, we show, for the first time, that DNA epigenetic modifications change the molecule's dynamical landscape, increasing the propensity of DNA toward different values of twist and/or roll/tilt angles (in relation to the unmodified DNA) at the modification sites. Moreover, both the extent and position of different modifications have significant effects on the amount of structural variation observed. We propose that these conformational differences, which are dependent on the sequence environment, can provide specificity for protein binding. PMID:25625845

  2. The effect of volume exclusion on the formation of DNA minicircle networks: implications to kinetoplast DNA

    NASA Astrophysics Data System (ADS)

    Diao, Y.; Hinson, K.; Sun, Y.; Arsuaga, J.

    2015-10-01

    Kinetoplast DNA (kDNA) is the mitochondrial DNA of disease-causing organisms such as Trypanosoma brucei (T. brucei) and Trypanosoma cruzi (T. cruzi). In most organisms, kDNA is made of thousands of small circular DNA molecules that are highly condensed and topologically linked, forming a gigantic planar network. In our previous work we developed mathematical and computational models to test the confinement hypothesis, that is, that the formation of kDNA minicircle networks is a product of the high DNA condensation achieved in the mitochondrion of these organisms. In these studies we examined three parameters that characterize the growth of the network topology upon confinement: the critical percolation density, the mean saturation density and the mean valence (i.e. the number of minicircles topologically linked to any chosen minicircle). Experimental results on insect-infecting organisms showed that the mean valence is equal to three, forming a structure similar to that found in medieval chain mail. These same studies hypothesized that this value of the mean valence was driven by the DNA excluded volume. Here we extend our previous work on kDNA by characterizing the effects of DNA excluded volume on the three descriptive parameters. Using computer simulations of polymer swelling we found that (1) in agreement with previous studies the linking probability of two minicircles does not decrease linearly with the distance between the two minicircles, (2) the mean valence grows linearly with the density of minicircles and decreases with the thickness of the excluded volume, (3) the critical percolation and mean saturation densities grow linearly with the thickness of the excluded volume. Our results therefore suggest that the swelling of the DNA molecule, due to electrostatic interactions, has relatively mild implications on the overall topology of the network.
Our results also validate our topological descriptors since they appear to reflect the changes in the physical properties of the polymeric chains and at the same time remain faithful to their description of kDNA.
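
    The mean valence used above is the average number of minicircles topologically linked to a given minicircle, which is the mean degree of the linking graph. A minimal sketch of that calculation follows, assuming the links have already been determined and are supplied as index pairs; deciding whether two minicircles are actually linked (the hard part of the simulations) is outside this example.

```python
def mean_valence(n_minicircles, links):
    """Mean number of link partners per minicircle.

    `links` is an iterable of (i, j) index pairs, one per linked pair.
    """
    degree = [0] * n_minicircles
    for i, j in links:
        degree[i] += 1
        degree[j] += 1
    return sum(degree) / n_minicircles

# Toy network: four minicircles, every pair linked, so each has valence 3.
toy_links = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
toy_valence = mean_valence(4, toy_links)
```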

  3. Cloud-based adaptive exon prediction for DNA analysis

    PubMed Central

    Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

    2018-01-01

    Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs can be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
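
    The three-base periodicity mentioned above is commonly illustrated with a Fourier measure: each nucleotide is turned into a 0/1 indicator sequence and the spectral power at period 3 is summed over the four bases. The sketch below shows that textbook measure only. It is not the adaptive (normalised LMS) predictor developed in the paper, and the normalisation by sequence length is an arbitrary choice made for this example.

```python
import cmath

def period3_power(seq):
    """Sum over A, C, G, T of |DFT of the base's indicator sequence at period 3|^2,
    divided by the sequence length."""
    n = len(seq)
    power = 0.0
    for base in "ACGT":
        # Sum the period-3 phase factor over the positions where this base occurs.
        coeff = sum(cmath.exp(-2j * cmath.pi * i / 3)
                    for i, b in enumerate(seq) if b == base)
        power += abs(coeff) ** 2
    return power / n

periodic = period3_power("ATG" * 10)   # strong codon-like periodicity
uniform = period3_power("A" * 30)      # no period-3 structure
```

    In practice such a measure is evaluated in a sliding window along the sequence, and windows with high period-3 power are flagged as candidate exon regions.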

  4. Electron holes appear to trigger cancer-implicated mutations

    NASA Astrophysics Data System (ADS)

    Miller, John; Villagran, Martha

    Malignant tumors are caused by mutations, which also affect their subsequent growth and evolution. We use a novel approach, computational DNA hole spectroscopy [M.Y. Suarez-Villagran & J.H. Miller, Sci. Rep. 5, 13571 (2015)], to compute spectra of enhanced hole probability based on actual sequence data. A hole is a mobile site of positive charge created when an electron is removed, for example by radiation or contact with a mutagenic agent. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies reveal a correlation between hole spectrum peaks and spikes in human mutation frequencies. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with cancer-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential cancer 'driver' mutations. Such integration of DNA hole and variance spectra could also prove invaluable for pinpointing critical regions, and sites of driver mutations, in the vast non-protein-coding genome. Supported by the State of Texas through the Texas Ctr. for Superconductivity.

  5. Low-Lying π* Resonances of Standard and Rare DNA and RNA Bases Studied by the Projected CAP/SAC-CI Method.

    PubMed

    Kanazawa, Yuki; Ehara, Masahiro; Sommerfeld, Thomas

    2016-03-10

    Low-lying π* resonance states of DNA and RNA bases have been investigated by the recently developed projected complex absorbing potential (CAP)/symmetry-adapted cluster-configuration interaction (SAC-CI) method using a smooth Voronoi potential as CAP. In spite of the challenging CAP applications to higher resonance states of molecules of this size, the present calculations reproduce resonance positions observed by electron transmission spectra (ETS) provided the anticipated deviations due to vibronic effects and limited basis sets are taken into account. Moreover, for the standard nucleobases, the calculated positions and widths qualitatively agree with those obtained in previous electron scattering calculations. For guanine, both keto and enol forms were examined, and the calculated values of the keto form agree clearly better with the experimental findings. In addition to these standard bases, three modified forms of cytosine, which serve as epigenetic or biomarkers, were investigated: formylcytosine, methylcytosine, and chlorocytosine. Last, a strong correlation between the computed positions and the observed ETS values is demonstrated, clearly suggesting that the present computational protocol should be useful for predicting the π* resonances of congeners of DNA and RNA bases.

  6. Computational model of chromosome aberration yield induced by high- and low-LET radiation exposures.

    PubMed

    Ponomarev, Artem L; George, Kerry; Cucinotta, Francis A

    2012-06-01

    We present a computational model for calculating the yield of radiation-induced chromosomal aberrations in human cells based on a stochastic Monte Carlo approach and calibrated using the relative frequencies and distributions of chromosomal aberrations reported in the literature. A previously developed DNA-fragmentation model for high- and low-LET radiation called the NASARadiationTrackImage model was enhanced to simulate a stochastic process of the formation of chromosomal aberrations from DNA fragments. The current version of the model gives predictions of the yields and sizes of translocations, dicentrics, rings, and more complex-type aberrations formed in the G(0)/G(1) cell cycle phase during the first cell division after irradiation. As the model can predict smaller-sized deletions and rings (<3 Mbp) that are below the resolution limits of current cytogenetic analysis techniques, we present predictions of hypothesized small deletions that may be produced as a byproduct of properly repaired DNA double-strand breaks (DSB) by nonhomologous end-joining. Additionally, the model was used to scale chromosomal exchanges in two or three chromosomes that were obtained from whole-chromosome FISH painting analysis techniques to whole-genome equivalent values.

  7. Methodological approach to crime scene investigation: the dangers of technology

    NASA Astrophysics Data System (ADS)

    Barnett, Peter D.

    1997-02-01

    The visitor to any modern forensic science laboratory is confronted with equipment and processes that did not exist even 10 years ago: thermocyclers to allow genetic typing of nanogram amounts of DNA isolated from a few spermatozoa; scanning electron microscopes that can nearly automatically detect submicrometer sized particles of molten lead, barium and antimony produced by the discharge of a firearm and deposited on the hands of the shooter; and computers that can compare an image of a latent fingerprint with millions of fingerprints stored in the computer memory. Analysis of populations of physical evidence has permitted statistically minded forensic scientists to use Bayesian inference to draw conclusions based on a priori assumptions which are often poorly understood, irrelevant, or misleading. National commissions who are studying quality control in DNA analysis propose that people with barely relevant graduate degrees and little forensic science experience be placed in charge of forensic DNA laboratories. It is undeniable that high-tech has reversed some miscarriages of justice by establishing the innocence of a number of people who were imprisoned for years for crimes that they did not commit. However, this paper deals with the dangers of technology in criminal investigations.

  8. [Definition of the specificity of DNA-methyltransferase M.Bsc4I in cell lysate by blocking of restriction endonucleases and computer modeling].

    PubMed

    Dedkov, V S

    2009-01-01

    The specificity of DNA-methyltransferase M.Bsc4I was defined in a cell lysate of Bacillus schlegelii 4. For this purpose, we used the methylation sensitivity of restriction endonucleases, and also modeling of methylation. The modeling consisted of editing DNA sequences by replacing methylated bases and their complementary bases. The substrate DNA processed by M.Bsc4I was also used for studying the sensitivity of some restriction endonucleases to methylation. Thus, it was shown that M.Bsc4I methylated 5'-Cm4CNNNNNNNGG-3' and that overlapping dcm-methylation blocked its activity. The proposed approach may prove sufficiently universal and simple for determining the specificity of DNA-methyltransferases.

  9. Fundamental device design considerations in the development of disruptive nanoelectronics.

    PubMed

    Singh, R; Poole, J O; Poole, K F; Vaidya, S D

    2002-01-01

    In the last quarter of a century silicon-based integrated circuits (ICs) have played a major role in the growth of the economy throughout the world. A number of new technologies, such as quantum computing, molecular computing, DNA molecules for computing, etc., are currently being explored to create a product to replace semiconductor transistor technology. We have examined all of the currently explored options and found that none of them is suitable as a replacement for silicon ICs. In this paper we provide fundamental device criteria that must be satisfied for the successful operation of a manufacturable, not yet invented, device. The two fundamental limits are the removal of heat and reliability. The switching speed of any practical man-made computing device will be in the range of 10^-15 to 10^-3 s. Heisenberg's uncertainty principle and the computer architecture set the heat generation limit. The thermal conductivity of the materials used in the fabrication of a nanodimensional device sets the heat removal limit. In current electronic products, redundancy plays a significant part in improving the reliability of parts with macroscopic defects. In the future, microscopic and even nanoscopic defects will play a critical role in the reliability of disruptive nanoelectronics. The lattice vibrations will set the intrinsic reliability of future computing systems. The two critical limits discussed in this paper provide criteria for the selection of materials used in the fabrication of future devices. Our work shows that diamond contains the clue to providing computing devices that will surpass the performance of silicon-based nanoelectronics.

  10. Oncologists partner with Watson on genomics.

    PubMed

    2015-08-01

    A new collaboration between IBM Watson Health and more than a dozen cancer centers uses the power of cognitive computing to dramatically reduce the time it takes to analyze data from patients' DNA and identify targeted treatment options. ©2015 American Association for Cancer Research.

  11. 32 CFR 291.3 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ..., functions, decisions, or procedures of a DNA organization. Normally, computer software, including source.... (This does not include the underlying data which is processed and produced by such software and which may in some instances be stored with the software.) Exceptions to this position are outlined in...

  12. Home - Virginia Department of Forensic Science

    Science.gov Websites

    Procedure Manuals Training Manuals Digital & Multimedia Evidence Computer Analysis Video Analysis Procedure Manual Training Manual FAQ Updates Firearms & Toolmarks Procedure Manuals Training Manuals Forensic Biology Procedure Manuals Training Manuals Familial Searches Post-Conviction DNA Issues FAQ

  13. Noise reduction in single time frame optical DNA maps

    PubMed Central

    Müller, Vilhelm; Westerlund, Fredrik

    2017-01-01

    In optical DNA mapping technologies sequence-specific intensity variations (DNA barcodes) along stretched and stained DNA molecules are produced. These “fingerprints” of the underlying DNA sequence have a resolution on the order of one kilobase pair, and the stretching of the DNA molecules is performed by surface adsorption or nano-channel setups. A post-processing challenge for nano-channel based methods, due to local and global random movement of the DNA molecule during imaging, is how to align different time frames in order to produce reproducible time-averaged DNA barcodes. The current solutions to this challenge are computationally rather slow. With high-throughput applications in mind, we here introduce a parameter-free method for filtering a single time frame noisy barcode (snap-shot optical map), measured in a fraction of a second. By using only a single time frame barcode we circumvent the need for post-processing alignment. We demonstrate that our method is successful at providing filtered barcodes which are less noisy and more similar to time averaged barcodes. The method is based on the application of a low-pass filter on a single noisy barcode using the width of the Point Spread Function of the system as a unique, and known, filtering parameter. We find that after applying our method, the Pearson correlation coefficient (a real number in the range from -1 to 1) between the single time-frame barcode and the time average of the aligned kymograph increases significantly, roughly by 0.2 on average. By comparing to a database of more than 3000 theoretical plasmid barcodes we show that the capability to identify plasmids is improved by filtering single time-frame barcodes compared to the unfiltered analogues. Since snap-shot experiments and computational time using our method both are less than a second, this study opens the way to high-throughput optical DNA mapping with improved reproducibility. PMID:28640821
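
    The two operations this abstract relies on, low-pass filtering a barcode with a width set by the Point Spread Function and scoring similarity with the Pearson correlation coefficient, can be sketched as follows. The abstract does not state the filter shape, so the Gaussian kernel, the edge clamping, and the names `gaussian_kernel`, `low_pass`, `pearson` and `sigma_px` are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma_px):
    """Normalised Gaussian weights; sigma_px stands in for the PSF width in pixels."""
    radius = max(1, math.ceil(3 * sigma_px))
    weights = [math.exp(-0.5 * (i / sigma_px) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def low_pass(barcode, sigma_px):
    """Convolve an intensity profile with the kernel, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma_px)
    radius = len(kernel) // 2
    last = len(barcode) - 1
    filtered = []
    for i in range(len(barcode)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = min(max(i + j - radius, 0), last)  # repeat the edge value
            acc += weight * barcode[idx]
        filtered.append(acc)
    return filtered

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)
```

    In use, one would compare `pearson(raw, reference)` with `pearson(low_pass(raw, sigma_px), reference)`, where `reference` is a time-averaged barcode; no particular improvement is implied by this sketch.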

  14. Calculation of DNA strand breaks due to direct and indirect effects of Auger electrons from incorporated 123I and 125I radionuclides using the Geant4 computer code.

    PubMed

    Raisali, Gholamreza; Mirzakhanian, Lalageh; Masoudi, Seyed Farhad; Semsarha, Farid

    2013-01-01

    In this work the number of DNA single-strand breaks (SSB) and double-strand breaks (DSB) due to direct and indirect effects of Auger electrons from incorporated (123)I and (125)I have been calculated by using the Geant4-DNA toolkit. We have performed and compared the calculations for several cases: (125)I versus (123)I, source positions and direct versus indirect breaks to study the capability of the Geant4-DNA in calculations of DNA damage yields. Two different simple geometries of a 41-base-pair segment of B-DNA have been simulated. (123)I has been considered to be located in (123)IdUrd, and three different locations have been considered for (125)I. The results showed that the simpler geometry is sufficient for direct break calculations while indirect damage yield is more sensitive to the helical shape of DNA. For (123)I Auger electrons, the average number of DSB due to the direct hits is almost twice the DSB due to the indirect hits. Furthermore, a comparison between the average number of SSB or DSB caused by Auger electrons of (125)I and (123)I in (125)IdUrd and (123)IdUrd shows that (125)I is 1.5 times more effective than (123)I per decay. The results are in reasonable agreement with previous experimental and theoretical results, which shows the applicability of the Geant4-DNA toolkit in nanodosimetry calculations. The toolkit benefits from open-source accessibility, and the DNA models used in this work save computational time. Also, the results showed that the simpler geometry is suitable for direct break calculations, while for the indirect damage yield, the more precise model is preferred.

  15. The Crystal Structure of TAL Effector PthXo1 Bound to Its DNA Target

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mak, Amanda Nga-Sze; Bradley, Philip; Cernadas, Raul A.

    2012-02-10

    DNA recognition by TAL effectors is mediated by tandem repeats, each 33 to 35 residues in length, that specify nucleotides via unique repeat-variable diresidues (RVDs). The crystal structure of PthXo1 bound to its DNA target was determined by high-throughput computational structure prediction and validated by heavy-atom derivatization. Each repeat forms a left-handed, two-helix bundle that presents an RVD-containing loop to the DNA. The repeats self-associate to form a right-handed superhelix wrapped around the DNA major groove. The first RVD residue forms a stabilizing contact with the protein backbone, while the second makes a base-specific contact to the DNA sense strand. Two degenerate amino-terminal repeats also interact with the DNA. Containing several RVDs and noncanonical associations, the structure illustrates the basis of TAL effector-DNA recognition.

  16. Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

    DTIC Science & Technology

    2010-03-01

    because only certain collections (partitioned by font type) of sequences are allowed to be in each position (e.g., Arial = position 0, Comic ...rigidity of short oligos and the shape of the polar charge. Oligo movement was modeled by a Brownian motion 3 dimensional random walk. The one...temperature, kB is Boltz he viscosity of the medium. The random walk motion is modeled by assuming the oligo is on a three dimensional lattice and may

  17. Digital transcriptome profiling using selective hexamer priming for cDNA synthesis.

    PubMed

    Armour, Christopher D; Castle, John C; Chen, Ronghua; Babak, Tomas; Loerch, Patrick; Jackson, Stuart; Shah, Jyoti K; Dey, John; Rohl, Carol A; Johnson, Jason M; Raymond, Christopher K

    2009-09-01

    We developed a procedure for the preparation of whole transcriptome cDNA libraries depleted of ribosomal RNA from only 1 μg of total RNA. The method relies on a collection of short, computationally selected oligonucleotides, called 'not-so-random' (NSR) primers, to obtain full-length, strand-specific representation of nonribosomal RNA transcripts. In this study we validated the technique by profiling human whole brain and universal human reference RNA using ultra-high-throughput sequencing.

  18. Taenia solium cysticercosis in Bali, Indonesia: serology and mtDNA analysis.

    PubMed

    Sudewi, A A R; Wandra, T; Artha, A; Nkouawa, A; Ito, A

    2008-01-01

    An active Taenia solium cysticercosis case in Bali, Indonesia, was followed-up by serology and computed tomography. Serology using semi-purified glycoprotein and recombinant antigens showed a drastic drop in titers after calcification of the cysts. Three paraffin-embedded cysts, prepared for histopathological examination, from three other patients were used for mtDNA analysis. The sequences of cox1 gene from T. solium cysticerci from Bali differed from those in Papua and other Asian countries.

  19. Efficient alignment-free DNA barcode analytics.

    PubMed

    Kuksa, Pavel; Pavlovic, Vladimir

    2009-11-10

    In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
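
    The spectrum representation described in this abstract, a sequence modelled as a collection of short fixed-length fragments, can be sketched as a k-mer count vector. The fragment length `k = 3`, the cosine similarity, and the names `spectrum` and `cosine_similarity` are illustrative assumptions; the abstract does not specify the authors' kernel or parameters.

```python
import math
from collections import Counter

def spectrum(seq, k=3):
    """Count every overlapping length-k fragment (k-mer) in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(spec_a, spec_b):
    """Cosine of the angle between two k-mer count vectors (0.0 if either is empty)."""
    shared = spec_a.keys() & spec_b.keys()
    dot = sum(spec_a[kmer] * spec_b[kmer] for kmer in shared)
    norm_a = math.sqrt(sum(count * count for count in spec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

if __name__ == "__main__":
    a = spectrum("ACGTACGT")
    b = spectrum("ACGTTCGT")
    print(a)
    print(cosine_similarity(a, b))
```

    Because no alignment is computed, the cost grows linearly with sequence length, which is the property the abstract credits for the speed of alignment-free methods.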

  20. Nonlinear optical and G-Quadruplex DNA stabilization properties of novel mixed ligand copper(II) complexes and coordination polymers: Synthesis, structural characterization and computational studies

    NASA Astrophysics Data System (ADS)

    Rajasekhar, Bathula; Bodavarapu, Navya; Sridevi, M.; Thamizhselvi, G.; RizhaNazar, K.; Padmanaban, R.; Swu, Toka

    2018-03-01

    The present study reports the synthesis and evaluation of the nonlinear optical properties and G-Quadruplex DNA stabilization of five novel copper(II) mixed ligand complexes. They were synthesized from copper(II) salt, 2,5- and 2,3-pyridinedicarboxylic acid, diethylenetriamine and amide based ligand (AL). The crystal structures of these complexes were determined through X-ray diffraction and supported by ESI-MS, NMR, UV-Vis and FT-IR spectroscopic methods. Their nonlinear optical property was studied using the Gaussian09 computer program. For structural optimization and nonlinear optical property, the density functional theory (DFT) based B3LYP method was used with the LANL2DZ basis set for the metal ion and 6-31G* for C, H, N, O and Cl atoms. The present work reveals that pre-polarized complex-2 showed a higher β value (29.59 × 10^-30 e.s.u.) as compared to that of neutral complex-1 (β = 0.276 × 10^-30 e.s.u.), which may be due to greater advantage of polarizability. Complex-2 is expected to be a potential material for optoelectronic and photonic technologies. Docking studies using AutodockVina revealed that complex-2 has higher binding energy for both G-Quadruplex DNA (-8.7 kcal/mol) and duplex DNA (-10.1 kcal/mol). It was also observed that structure plays an important role in binding efficiency.

  1. DNA MemoChip: Long-Term and High Capacity Information Storage and Select Retrieval.

    PubMed

    Stefano, George B; Wang, Fuzhou; Kream, Richard M

    2018-02-26

    Over the course of history, human beings have never stopped seeking effective methods for information storage. From rocks to paper, and through the past several decades of using computer disks, USB sticks, and on to the thin silicon "chips" and "cloud" storage of today, it would seem that we have reached an era of efficiency for managing innumerable and ever-expanding data. Astonishingly, when tracing this technological path, one realizes that our ancient methods of informational storage far outlast paper (10,000 vs. 1,000 years, respectively), let alone the computer-based memory devices that only last, on average, 5 to 25 years. During this time of fast-paced information generation, it becomes increasingly difficult for current storage methods to retain such massive amounts of data, and to maintain appropriate speeds with which to retrieve it, especially when in demand by a large number of users. Others have proposed that DNA-based information storage provides a way forward for information retention as a result of its temporal stability. It is now evident that DNA represents a potentially economical and sustainable mechanism for storing information, as demonstrated by its decoding from a 700,000 year-old horse genome. The fact that the human genome is present in a cell, containing also the varied mitochondrial genome, indicates DNA's great potential for large data storage in a 'smaller' space.

  2. Genome-wide Expression Profiling, In Vivo DNA Binding Analysis, and Probabilistic Motif Prediction Reveal Novel Abf1 Target Genes during Fermentation, Respiration, and Sporulation in Yeast

    PubMed Central

    Schlecht, Ulrich; Erb, Ionas; Demougin, Philippe; Robine, Nicolas; Borde, Valérie; van Nimwegen, Erik; Nicolas, Alain

    2008-01-01

    The autonomously replicating sequence binding factor 1 (Abf1) was initially identified as an essential DNA replication factor and later shown to be a component of the regulatory network controlling mitotic and meiotic cell cycle progression in budding yeast. The protein is thought to exert its functions via specific interaction with its target site as part of distinct protein complexes, but its roles during mitotic growth and meiotic development are only partially understood. Here, we report a comprehensive approach aiming at the identification of direct Abf1-target genes expressed during fermentation, respiration, and sporulation. Computational prediction of the protein's target sites was integrated with a genome-wide DNA binding assay in growing and sporulating cells. The resulting data were combined with the output of expression profiling studies using wild-type versus temperature-sensitive alleles. This work identified 434 protein-coding loci as being transcriptionally dependent on Abf1. More than 60% of their putative promoter regions contained a computationally predicted Abf1 binding site and/or were bound by Abf1 in vivo, identifying them as direct targets. The present study revealed numerous loci previously unknown to be under Abf1 control, and it yielded evidence for the protein's variable DNA binding pattern during mitotic growth and meiotic development. PMID:18305101

  3. Biopolymer Chain Elasticity: a novel concept and a least deformation energy principle predicts backbone and overall folding of DNA TTT hairpins in agreement with NMR distances

    PubMed Central

    Pakleza, Christophe; Cognet, Jean A. H.

    2003-01-01

    A new molecular modelling methodology is presented and shown to apply to all published solution structures of DNA hairpins with TTT in the loop. It is based on the theory of elasticity of thin rods and on the assumption that single-stranded B-DNA behaves as a continuous, unshearable, unstretchable and flexible thin rod. It requires four construction steps: (i) computation of the tri-dimensional trajectory of the elastic line, (ii) global deformation of single-stranded helical DNA onto the elastic line, (iii) optimisation of the nucleoside rotations about the elastic line, (iv) energy minimisation to restore backbone bond lengths and bond angles. This theoretical approach called ‘Biopolymer Chain Elasticity’ (BCE) is capable of reproducing the tri-dimensional course of the sugar–phosphate chain and, using NMR-derived distances, of reproducing models close to published solution structures. This is shown by computing three different types of distance criteria. The natural description provided by the elastic line and by the new parameter, Ω, which corresponds to the rotation angles of nucleosides about the elastic line, offers a considerable simplification of molecular modelling of hairpin loops. They can be varied independently from each other, since the global shape of the hairpin loop is preserved in all cases. PMID:12560506

  4. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome.

    PubMed

    Singh, Vinod Kumar; Krishnamachari, Annangarachari

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS like patterns in the genome. However, only a few hundreds of these sites act as replicating sites and the rest are considered as dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that binds to origin-recognition complex (ORC) denoted as ORC-ACS and non-replicating ACS sequences (nrACS), that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  5. Computational strategies to address chromatin structure problems

    NASA Astrophysics Data System (ADS)

    Perišić, Ognjen; Schlick, Tamar

    2016-06-01

    While the genetic information is contained in double helical DNA, gene expression is a complex multilevel process that involves various functional units, from nucleosomes to fully formed chromatin fibers accompanied by a host of various chromatin binding enzymes. The chromatin fiber is a polymer composed of histone protein complexes upon which DNA wraps, like yarn upon many spools. The nature of chromatin structure has been an open question since the beginning of modern molecular biology. Many experiments have shown that the chromatin fiber is a highly dynamic entity with pronounced structural diversity that includes properties of idealized zig-zag and solenoid models, as well as other motifs. This diversity can produce a high packing ratio and thus inhibit access to a majority of the wound DNA. Despite much research, chromatin’s dynamic structure has not yet been fully described. Long stretches of chromatin fibers exhibit puzzling dynamic behavior that requires interpretation in the light of gene expression patterns in various tissues and organisms. The properties of chromatin fiber can be investigated with experimental techniques, like in vitro biochemistry, in vivo imaging, and high-throughput chromosome capture technology. Those techniques provide useful insights into the fiber’s structure and dynamics, but they are limited in resolution and scope, especially regarding compact fibers and chromosomes in the cellular milieu. Complementary but specialized modeling techniques are needed to handle large floppy polymers such as the chromatin fiber. In this review, we discuss current approaches in the chromatin structure field with an emphasis on modeling, such as molecular dynamics and coarse-grained computational approaches. Combinations of these computational techniques complement experiments and address many relevant biological problems, as we will illustrate with special focus on epigenetic modulation of chromatin structure.

  6. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  7. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    PubMed

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One reason is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data only. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
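
    The three scores quoted in this abstract (accuracy, F-measure, MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # MCC is defined as 0 when any marginal sum is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=60, fp=30, tn=70, fn=40))
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays near zero for a classifier that performs at chance level.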

  8. Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

    PubMed

    Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

    2011-05-20

    Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the near future. To overcome this challenge, several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show the great potential of a GPU platform but in most cases address the problem of sequence database scanning and compute only the alignment score, whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch and Smith-Waterman algorithms with a backtracking procedure, which is needed to construct the alignment. In this paper we present a solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multi-GPU support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Performed tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.
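
    To illustrate the difference between computing only a score and also recovering the alignment, here is a minimal CPU sketch of global Needleman-Wunsch alignment with a backtracking pass. It assumes a simple linear gap penalty and arbitrary scoring values (the paper reports affine gap penalties on GPUs), so it is a teaching sketch and not the authors' implementation.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (an assumed scoring scheme).

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Forward pass: fill the dynamic-programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Backtracking pass: walk from the bottom-right corner to the origin,
    # at each cell following a move that reproduces the stored score.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

    The backtracking pass needs the full score matrix (or stored move directions), which is the memory cost that score-only implementations avoid.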

  9. Parameter estimation for an immortal model of colonic stem cell division using approximate Bayesian computation.

    PubMed

    Walters, Kevin

    2012-08-07

    In this paper we use approximate Bayesian computation to estimate the parameters in an immortal model of colonic stem cell division. We base the inferences on the observed DNA methylation patterns of cells sampled from the human colon. Utilising DNA methylation patterns as a form of molecular clock is an emerging area of research and has been used in several studies investigating colonic stem cell turnover. There is much debate concerning the two competing models of stem cell turnover: the symmetric (immortal) and asymmetric models. Early simulation studies concluded that the observed methylation data were not consistent with the immortal model. A later modified version of the immortal model that included preferential strand segregation was subsequently shown to be consistent with the same methylation data. Most of this earlier work assumes site independent methylation models that do not take account of the known processivity of methyltransferases whilst other work does not take into account the methylation errors that occur in differentiated cells. This paper addresses both of these issues for the immortal model and demonstrates that approximate Bayesian computation provides accurate estimates of the parameters in this neighbour-dependent model of methylation error rates. The results indicate that if colonic stem cells divide asymmetrically then colon stem cell niches are maintained by more than 8 stem cells. Results also indicate the possibility of preferential strand segregation and provide clear evidence against a site-independent model for methylation errors. In addition, algebraic expressions for some of the summary statistics used in the approximate Bayesian computation (that allow for the additional variation arising from cell division in differentiated cells) are derived and their utility discussed. Copyright © 2012 Elsevier Ltd. All rights reserved.
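
    The general idea of approximate Bayesian computation can be sketched with the simplest variant, rejection sampling: draw a parameter from the prior, simulate data, and keep the draw if a summary statistic falls close to the observed one. The toy model below (independent sites with a single error rate, a uniform prior, and the count of methylated sites as the summary statistic) is an assumption chosen for brevity; it is not the neighbour-dependent model or the summary statistics used in the paper.

```python
import random


def simulate_count(rate, n_sites, rng):
    """Toy simulator: number of methylated sites out of n_sites."""
    return sum(rng.random() < rate for _ in range(n_sites))


def abc_rejection(observed, n_sites, n_draws=5000, tolerance=2, seed=0):
    """Return the prior draws whose simulated summary is within tolerance."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.random()  # draw from a uniform(0, 1) prior
        if abs(simulate_count(rate, n_sites, rng) - observed) <= tolerance:
            accepted.append(rate)
    return accepted


if __name__ == "__main__":
    samples = abc_rejection(observed=30, n_sites=100)
    print(len(samples), sum(samples) / len(samples))
```

    The accepted draws approximate the posterior; a smaller tolerance gives a closer approximation at the cost of fewer accepted samples.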

  10. Nanopore Logic Operation with DNA to RNA Transcription in a Droplet System.

    PubMed

    Ohara, Masayuki; Takinoue, Masahiro; Kawano, Ryuji

    2017-07-21

    This paper describes an AND logic operation with amplification and transcription from DNA to RNA, using T7 RNA polymerase. All four operations, (0 0) to (1 1), with an enzyme reaction can be performed simultaneously, using four-droplet devices that are directly connected to a patch-clamp amplifier. The output RNA molecule is detected using a biological nanopore with single-molecule translocation. Channel current recordings can be obtained using the enzyme solution. The integration of DNA logic gates into electrochemical devices is necessary to obtain output information in a human-recognizable form. Our method will be useful for rapid and confined DNA computing applications, including the development of programmable diagnostic devices.

  11. Would Dissociative Recombination of DNA+ be a Possible Pathway of DNA Damage?

    NASA Astrophysics Data System (ADS)

    Kwon, H. C.; Chen, Z. P.; Strom, R. A.; Andrianarijaona, V. M.

    2015-05-01

    It is known that dissociative recombination (DR) is a very efficient process for the destruction of molecular cations into neutral particles. During the past few years, the focus of DR has expanded from small inorganic molecules to macromolecular cations. We are probing the possibility of the DR of DNA+ after ionization of DNA, for example due to ionizing radiation. Therefore, we are investigating the existence of autoionization states within nucleotide bases (Guanine, Adenine, Cytosine, and Thymine). Our results from computational analysis using the modern electronic structure program ORCA will be presented. Authors wish to give special thanks to Pacific Union College Student Senate for their financial support.

  12. Self-Assembling Molecular Logic Gates Based on DNA Crossover Tiles.

    PubMed

    Campbell, Eleanor A; Peterson, Evan; Kolpashchikov, Dmitry M

    2017-07-05

    DNA-based computational hardware has attracted ever-growing attention due to its potential to be useful in the analysis of complex mixtures of biological markers. Here we report the design of self-assembling logic gates that recognize DNA inputs and assemble into crossover tiles when the output signal is high; the crossover structures disassemble to form separate DNA strands when the output is low. The output signal can be conveniently detected by fluorescence using a molecular beacon probe as a reporter. AND, NOT, and OR logic gates were designed. We demonstrate that the gates can connect to each other to produce other logic functions. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. A novel chaotic based image encryption using a hybrid model of deoxyribonucleic acid and cellular automata

    NASA Astrophysics Data System (ADS)

    Enayatifar, Rasul; Sadaei, Hossein Javedani; Abdullah, Abdul Hanan; Lee, Malrey; Isnin, Ismail Fauzi

    2015-08-01

    Many recent studies have sought to improve the security of digital images in order to protect such data while they are sent over the internet. This work proposes a new approach based on a hybrid model of the Tinkerbell chaotic map, deoxyribonucleic acid (DNA) and cellular automata (CA). DNA rules, the DNA sequence XOR operator and CA rules are used simultaneously to encrypt the plain-image pixels. To determine the rule number in the DNA sequence and in the CA, a two-dimensional Tinkerbell chaotic map is employed. Experimental results and computer simulations both confirm that the proposed scheme not only demonstrates outstanding encryption but also resists various typical attacks.
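
    A DNA sequence XOR operator of the kind mentioned here can be sketched by mapping each 2-bit value to a nucleotide and XOR-ing base by base. The mapping below (00→A, 01→C, 10→G, 11→T) and all function names are assumptions for illustration; the abstract does not state which DNA rule the scheme selects, and the chaotic-map and cellular-automata stages are not modelled.

```python
# Assumed encoding rule: 2-bit value -> nucleotide.
BITS_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def byte_to_dna(value):
    """Encode one 8-bit pixel value as four nucleotides, high bits first."""
    return "".join(BITS_TO_BASE[(value >> shift) & 3] for shift in (6, 4, 2, 0))


def dna_to_byte(seq):
    """Decode four nucleotides back into an 8-bit value."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def dna_xor(x, y):
    """XOR two equal-length DNA strings base by base."""
    return "".join(BITS_TO_BASE[BASE_TO_BITS[a] ^ BASE_TO_BITS[b]]
                   for a, b in zip(x, y))


if __name__ == "__main__":
    pixel, key = byte_to_dna(0b00011011), "TTTT"
    cipher = dna_xor(pixel, key)
    print(pixel, cipher, dna_to_byte(dna_xor(cipher, key)))
```

    Because XOR is its own inverse, applying the same key sequence a second time recovers the original pixel value.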

  14. DNA packaging and ejection forces in bacteriophage

    PubMed Central

    Kindt, James; Tzlil, Shelly; Ben-Shaul, Avinoam; Gelbart, William M.

    2001-01-01

    We calculate the forces required to package (or, equivalently, acting to eject) DNA into (from) a bacteriophage capsid, as a function of the loaded (ejected) length, under conditions for which the DNA is either self-repelling or self-attracting. Through computer simulation and analytical theory, we find the loading force to increase more than 10-fold (to tens of piconewtons) during the final third of the loading process; correspondingly, the internal pressure drops 10-fold to a few atmospheres (matching the osmotic pressure in the cell) upon ejection of just a small fraction of the phage genome. We also determine an evolution of the arrangement of packaged DNA from toroidal to spool-like structures. PMID:11707588

  15. Mechanism for priming DNA synthesis by yeast DNA Polymerase α

    PubMed Central

    Perera, Rajika L; Torella, Rubben; Klinge, Sebastian; Kilkenny, Mairi L; Maman, Joseph D; Pellegrini, Luca

    2013-01-01

    The DNA Polymerase α (Pol α)/primase complex initiates DNA synthesis in eukaryotic replication. In the complex, Pol α and primase cooperate in the production of RNA-DNA oligonucleotides that prime synthesis of new DNA. Here we report crystal structures of the catalytic core of yeast Pol α in unliganded form, bound to an RNA primer/DNA template and extending an RNA primer with deoxynucleotides. We combine the structural analysis with biochemical and computational data to demonstrate that Pol α specifically recognizes the A-form RNA/DNA helix and that the ensuing synthesis of B-form DNA terminates primer synthesis. The spontaneous release of the completed RNA-DNA primer by the Pol α/primase complex simplifies current models of primer transfer to leading- and lagging strand polymerases. The proposed mechanism of nucleotide polymerization by Pol α might contribute to genomic stability by limiting the amount of inaccurate DNA to be corrected at the start of each Okazaki fragment. DOI: http://dx.doi.org/10.7554/eLife.00482.001 PMID:23599895

  16. An evolution based biosensor receptor DNA sequence generation algorithm.

    PubMed

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  17. Binary counting with chemical reactions.

    PubMed

    Kharam, Aleksandra; Jiang, Hua; Riedel, Marc; Parhi, Keshab

    2011-01-01

    This paper describes a scheme for implementing a binary counter with chemical reactions. The value of the counter is encoded by logical values of "0" and "1" that correspond to the absence and presence of specific molecular types, respectively. It is incremented when molecules of a trigger type are injected. Synchronization is achieved with reactions that produce a sustained three-phase oscillation. This oscillation plays a role analogous to a clock signal in digital electronics. Quantities are transferred between molecular types in different phases of the oscillation. Unlike all previous schemes for chemical computation, this scheme is dependent only on coarse rate categories for the reactions ("fast" and "slow"). Given such categories, the computation is exact and independent of the specific reaction rates. Although conceptual for the time being, the methodology has potential applications in domains of synthetic biology such as biochemical sensing and drug delivery. We are exploring DNA-based computation via strand displacement as a possible experimental chassis.

  18. Regulatory RNA design through evolutionary computation and strand displacement.

    PubMed

    Rostain, William; Landrain, Thomas E; Rodrigo, Guillermo; Jaramillo, Alfonso

    2015-01-01

    The discovery and study of a vast number of regulatory RNAs in all kingdoms of life over the past decades has allowed the design of new synthetic RNAs that can regulate gene expression in vivo. Riboregulators, in particular, have been used to activate or repress gene expression. However, to accelerate and scale up the design process, synthetic biologists require computer-assisted design tools, without which riboregulator engineering will remain a case-by-case design process requiring expert attention. Recently, the design of RNA circuits by evolutionary computation and adapting strand displacement techniques from nanotechnology has proven to be suited to the automated generation of DNA sequences implementing regulatory RNA systems in bacteria. Herein, we present our method to carry out such evolutionary design and how to use it to create various types of riboregulators, allowing the systematic de novo design of genetic control systems in synthetic biology.

  19. Pushing the frontiers of first-principles based computer simulations of chemical and biological systems.

    PubMed

    Brunk, Elizabeth; Ashari, Negar; Athri, Prashanth; Campomanes, Pablo; de Carvalho, F Franco; Curchod, Basile F E; Diamantis, Polydefkis; Doemer, Manuel; Garrec, Julian; Laktionov, Andrey; Micciarelli, Marco; Neri, Marilisa; Palermo, Giulia; Penfold, Thomas J; Vanni, Stefano; Tavernelli, Ivano; Rothlisberger, Ursula

    2011-01-01

    The Laboratory of Computational Chemistry and Biochemistry is active in the development and application of first-principles based simulations of complex chemical and biochemical phenomena. Here, we review some of our recent efforts in extending these methods to larger systems, longer time scales and increased accuracies. Their versatility is illustrated with a diverse range of applications, ranging from the determination of the gas phase structure of the cyclic decapeptide gramicidin S, to the study of G protein coupled receptors, the interaction of transition metal based anti-cancer agents with protein targets, the mechanism of action of DNA repair enzymes, the role of metal ions in neurodegenerative diseases and the computational design of dye-sensitized solar cells. Many of these projects are done in collaboration with experimental groups from the Institute of Chemical Sciences and Engineering (ISIC) at the EPFL.

  20. ChemPreview: an augmented reality-based molecular interface.

    PubMed

    Zheng, Min; Waller, Mark P

    2017-05-01

    Human computer interfaces make computational science more comprehensible and impactful. Complex 3D structures such as proteins or DNA are magnified by digital representations and displayed on two-dimensional monitors. Augmented reality has recently opened another door to access the virtual three-dimensional world. Herein, we present an augmented reality application called ChemPreview with the potential to manipulate bio-molecular structures at an atomistic level. ChemPreview is available at https://github.com/wallerlab/chem-preview/releases, and is built on top of the Meta 1 platform https://www.metavision.com/. ChemPreview can be used to interact with a protein in an intuitive way using natural hand gestures, thereby making it appealing to computational chemists or structural biologists. The ability to manipulate atoms in real world could eventually provide new and more efficient ways of extracting structural knowledge, or designing new molecules in silico. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. PRESAGE: PRivacy-preserving gEnetic testing via SoftwAre Guard Extension.

    PubMed

    Chen, Feng; Wang, Chenghong; Dai, Wenrui; Jiang, Xiaoqian; Mohammed, Noman; Al Aziz, Md Momin; Sadat, Md Nazmus; Sahinalp, Cenk; Lauter, Kristin; Wang, Shuang

    2017-07-26

    Advances in DNA sequencing technologies have prompted a wide range of genomic applications to improve healthcare and facilitate biomedical research. However, privacy and security concerns have emerged as a challenge for utilizing cloud computing to handle sensitive genomic data. We present one of the first implementations of a Software Guard Extension (SGX) based framework for securely outsourced genetic testing, which leverages multiple cryptographic protocols and a minimal perfect hash scheme to enable efficient and secure data storage and computation outsourcing. We compared the performance of the proposed PRESAGE framework with the state-of-the-art homomorphic encryption scheme, as well as the plaintext implementation. The experimental results demonstrated a significant performance improvement over the homomorphic encryption methods and a small computational overhead in comparison to the plaintext implementation. The proposed PRESAGE provides an alternative solution for secure and efficient genomic data outsourcing in an untrusted cloud by using a hybrid framework that combines secure hardware and multiple crypto protocols.

  2. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline.

    PubMed

    Reid, Jeffrey G; Carroll, Andrew; Veeraraghavan, Narayanan; Dahdouli, Mahmoud; Sundquist, Andreas; English, Adam; Bainbridge, Matthew; White, Simon; Salerno, William; Buhay, Christian; Yu, Fuli; Muzny, Donna; Daly, Richard; Duyk, Geoff; Gibbs, Richard A; Boerwinkle, Eric

    2014-01-29

    Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.

  3. [Development of laboratory sequence analysis software based on WWW and UNIX].

    PubMed

    Huang, Y; Gu, J R

    2001-01-01

    Sequence analysis tools based on WWW and UNIX were developed in our laboratory to meet the needs of molecular genetics research in our laboratory. General principles of computer analysis of DNA and protein sequences were also briefly discussed in this paper.

  4. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    EPA Science Inventory

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries

    G. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.
    1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  5. Fast single-pass alignment and variant calling using sequencing data

    USDA-ARS's Scientific Manuscript database

    Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. New program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts...

  6. The free energy of locking a ring: Changing a deoxyribonucleoside to a locked nucleic acid.

    PubMed

    Xu, You; Villa, Alessandra; Nilsson, Lennart

    2017-06-05

    Locked nucleic acid (LNA), a modified nucleoside which contains a bridging group across the ribose ring, improves the stability of DNA/RNA duplexes significantly, and therefore is of interest in biotechnology and gene therapy applications. In this study, we investigate the free energy change between LNA and DNA nucleosides. The transformation requires the breaking of the bridging group across the ribose ring, a problematic transformation in free energy calculations. To address this, we have developed a 3-step (easy to implement) and a 1-step protocol (more efficient, but more complicated to set up), for single and dual topologies in classical molecular dynamics simulations, using the Bennett Acceptance Ratio method to calculate the free energy. We validate the approach on the solvation free energy difference for the nucleosides thymidine, cytosine, and 5-methyl-cytosine. © 2017 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

  7. Quantum annealing versus classical machine learning applied to a simplified computational biology problem

    NASA Astrophysics Data System (ADS)

    Li, Richard Y.; Di Felice, Rosa; Rohs, Remo; Lidar, Daniel A.

    2018-03-01

    Transcription factors regulate gene expression, but how these proteins recognize and specifically bind to their DNA targets is still debated. Machine learning models are effective means to reveal interaction mechanisms. Here we studied the ability of a quantum machine learning approach to classify and rank binding affinities. Using simplified data sets of a small number of DNA sequences derived from actual binding affinity experiments, we trained a commercially available quantum annealer to classify and rank transcription factor binding. The results were compared to state-of-the-art classical approaches for the same simplified data sets, including simulated annealing, simulated quantum annealing, multiple linear regression, LASSO, and extreme gradient boosting. Despite technological limitations, we find a slight advantage in classification performance and nearly equal ranking performance using the quantum annealer for these fairly small training data sets. Thus, we propose that quantum annealing might be an effective method to implement machine learning for certain computational biology problems.

  8. Hey Buddy can you spare a DNA? New surveillance technologies and the growth of mandatory volunteerism in collecting personal information.

    PubMed

    Marx, Gary T

    2007-01-01

    The new social surveillance can be defined as scrutiny through the use of technical means to extract or create personal or group data, whether from individuals or contexts. Examples include: video cameras; computer matching, profiling and data mining; work, computer and electronic location monitoring; biometrics; DNA analysis; drug tests; brain scans for lie detection; various forms of imaging to reveal what is behind walls and enclosures. There are two problems with the new surveillance technologies. One is that they don't work and the other is that they work too well. If the first, they fail to prevent disasters, bring miscarriages of justice, and waste resources. If the second, they can further inequality and invidious social categorization; they chill liberty. These twin threats are part of the enduring paradox of democratic government that must be strong enough to maintain reasonable order, but not so strong as to become undemocratic.

  9. Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs.

    PubMed

    Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

    2008-05-28

    Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4-15.9 times faster, while Unphased jobs performed 1.1-18.6 times faster compared to the accumulated computation duration. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.

  10. Application of the Linux cluster for exhaustive window haplotype analysis using the FBAT and Unphased programs

    PubMed Central

    Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun

    2008-01-01

    Background: Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Results: Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4–15.9 times faster, while Unphased jobs performed 1.1–18.6 times faster compared to the accumulated computation duration. Conclusion: Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance. PMID:18541045

  11. Progress and challenges in bioinformatics approaches for enhancer identification

    PubMed Central

    Kleftogiannis, Dimitrios; Kalnis, Panos

    2016-01-01

    Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration. PMID:26634919

  12. From the genetic to the computer program: the historicity of 'data' and 'computation' in the investigations on the nematode worm C. elegans (1963-1998).

    PubMed

    García-Sancho, Miguel

    2012-03-01

    This paper argues that the history of the computer, of the practice of computation and of the notions of 'data' and 'programme' are essential for a critical account of the emergence and implications of data-driven research. In order to show this, I focus on the transition that the investigations on the worm C. elegans experienced in the Laboratory of Molecular Biology of Cambridge (UK). Throughout the 1980s, this research programme evolved from a study of the genetic basis of the worm's development and behaviour to a DNA mapping and sequencing initiative. By examining the changing computing technologies which were used at the Laboratory, I demonstrate that by the time of this transition researchers shifted from modelling the worm's genetic programme on a mainframe apparatus to writing minicomputer programs aimed at providing map and sequence data which was then circulated to other groups working on the genetics of C. elegans. The shift in the worm research should thus not be simply explained by the application of computers which transformed the project from a hypothesis-driven to a data-intensive endeavour. The key factor was rather a historically specific technology (in-house and easily programmable minicomputers) which redefined the way of achieving the project's long-standing goal, leading the genetic programme to co-evolve with the practices of data production and distribution. Copyright © 2011 Elsevier Ltd. All rights reserved.

  13. A hybrid stochastic model of folate-mediated one-carbon metabolism: Effect of the common C677T MTHFR variant on de novo thymidylate biosynthesis.

    PubMed

    Misselbeck, Karla; Marchetti, Luca; Field, Martha S; Scotti, Marco; Priami, Corrado; Stover, Patrick J

    2017-04-11

    Folate-mediated one-carbon metabolism (FOCM) is an interconnected network of metabolic pathways, including those required for the de novo synthesis of dTMP and purine nucleotides and for remethylation of homocysteine to methionine. Mouse models of folate-responsive neural tube defects (NTDs) indicate that impaired de novo thymidylate (dTMP) synthesis through changes in SHMT expression is causative in folate-responsive NTDs. We have created a hybrid computational model comprised of ordinary differential equations and stochastic simulation. We investigated whether the de novo dTMP synthesis pathway was sensitive to perturbations in FOCM that are known to be associated with human NTDs. This computational model shows that de novo dTMP synthesis is highly sensitive to the common MTHFR C677T polymorphism and that the effect of the polymorphism on FOCM is greater in folate deficiency. Computational simulations indicate that the MTHFR C677T polymorphism and folate deficiency interact to increase the stochastic behavior of the FOCM network, with the greatest instability observed for reactions catalyzed by serine hydroxymethyltransferase (SHMT). Furthermore, we show that de novo dTMP synthesis does not occur in the cytosol at rates sufficient for DNA replication, supporting empirical data indicating that impaired nuclear de novo dTMP synthesis results in uracil misincorporation into DNA.

  14. Programming chemistry in DNA-addressable bioreactors.

    PubMed

    Fellermann, Harold; Cardelli, Luca

    2014-10-06

    We present a formal calculus, termed the chemtainer calculus, able to capture the complexity of compartmentalized reaction systems such as populations of possibly nested vesicular compartments. Compartments contain molecular cargo as well as surface markers in the form of DNA single strands. These markers serve as compartment addresses and allow for their targeted transport and fusion, thereby enabling reactions of previously separated chemicals. The overall system organization allows for the set-up of programmable chemistry in microfluidic or other automated environments. We introduce a simple sequential programming language whose instructions are motivated by state-of-the-art microfluidic technology. Our approach integrates electronic control, chemical computing and material production in a unified formal framework that is able to mimic the integrated computational and constructive capabilities of the subcellular matrix. We provide a non-deterministic semantics of our programming language that enables us to analytically derive the computational and constructive power of our machinery. This semantics is used to derive the sets of all constructable chemicals and supermolecular structures that emerge from different underlying instruction sets. Because our proofs are constructive, they can be used to automatically infer control programs for the construction of target structures from a limited set of resource molecules. Finally, we present an example of our framework from the area of oligosaccharide synthesis. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  15. Computational analysis of EBNA1 "druggability" suggests novel insights for Epstein-Barr virus inhibitor design

    NASA Astrophysics Data System (ADS)

    Gianti, Eleonora; Messick, Troy E.; Lieberman, Paul M.; Zauhar, Randy J.

    2016-04-01

    The Epstein-Barr Nuclear Antigen 1 (EBNA1) is a critical protein encoded by the Epstein-Barr Virus (EBV). During latent infection, EBNA1 is essential for DNA replication and transcription initiation of viral and cellular genes and is necessary to immortalize primary B-lymphocytes. Nonetheless, the concept of EBNA1 as drug target is novel. Two EBNA1 crystal structures are publicly available and the first small-molecule EBNA1 inhibitors were recently discovered. However, no systematic studies have been reported on the structural details of EBNA1 "druggable" binding sites. We conducted computational identification and structural characterization of EBNA1 binding pockets, likely to accommodate ligand molecules (i.e. "druggable" binding sites). Then, we validated our predictions by docking against a set of compounds previously tested in vitro for EBNA1 inhibition (PubChem AID-2381). Finally, we supported assessments of pocket druggability by performing induced fit docking and molecular dynamics simulations paired with binding affinity predictions by Molecular Mechanics Generalized Born Surface Area calculations for a number of hits belonging to druggable binding sites. Our results establish EBNA1 as a target for drug discovery, and provide the computational evidence that active AID-2381 hits disrupt EBNA1:DNA binding upon interacting at individual sites. Lastly, structural properties of top scoring hits are proposed to support the rational design of the next generation of EBNA1 inhibitors.

  16. Stochastic Effects in Computational Biology of Space Radiation Cancer Risk

    NASA Technical Reports Server (NTRS)

    Cucinotta, Francis A.; Pluth, Janis; Harper, Jane; O'Neill, Peter

    2007-01-01

    Estimating risk from space radiation poses important questions on the radiobiology of protons and heavy ions. We are considering systems biology models to study radiation induced repair foci (RIRF) at low doses, in which less than one track on average traverses the cell, and the subsequent DNA damage processing and signal transduction events. Computational approaches for describing protein regulatory networks coupled to DNA and oxidative damage sites include systems of differential equations, stochastic equations, and Monte-Carlo simulations. We review recent developments in the mathematical description of protein regulatory networks and possible approaches to radiation effects simulation. These include robustness, which states that regulatory networks maintain their functions against external and internal perturbations due to compensating properties of redundancy and molecular feedback controls, and modularity, which leads to general theorems for considering molecules that interact through a regulatory mechanism without exchange of matter leading to a block diagonal reduction of the connecting pathways. Identifying rate-limiting steps, robustness, and modularity in pathways perturbed by radiation damage are shown to be valid techniques for reducing large molecular systems to realistic computer simulations. Other techniques studied are the use of steady-state analysis, and the introduction of composite molecules or rate-constants to represent small collections of reactants. Applications of these techniques to describe spatial and temporal distributions of RIRF and cell populations following low dose irradiation are described.

  17. Methods for modeling cytoskeletal and DNA filaments

    NASA Astrophysics Data System (ADS)

    Andrews, Steven S.

    2014-02-01

    This review summarizes the models that researchers use to represent the conformations and dynamics of cytoskeletal and DNA filaments. It focuses on models that address individual filaments in continuous space. Conformation models include the freely jointed, Gaussian, angle-biased chain (ABC), and wormlike chain (WLC) models, of which the first three bend at discrete joints and the last bends continuously. Predictions from the WLC model generally agree well with experiment. Dynamics models include the Rouse, Zimm, stiff rod, dynamic WLC, and reptation models, of which the first four apply to isolated filaments and the last to entangled filaments. Experiments show that the dynamic WLC and reptation models are most accurate. They also show that biological filaments typically experience strong hydrodynamic coupling and/or constrained motion. Computer simulation methods that address filament dynamics typically compute filament segment velocities from local forces using the Langevin equation and then integrate these velocities with explicit or implicit methods; the former are more versatile and the latter are more efficient. Much remains to be discovered in biological filament modeling. In particular, filament dynamics in living cells are not well understood, and current computational methods are too slow and not sufficiently versatile. Although primarily a review, this paper also presents new statistical calculations for the ABC and WLC models. Additionally, it corrects several discrepancies in the literature about bending and torsional persistence length definitions, and their relations to flexural and torsional rigidities.
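
    As a concrete illustration of the wormlike chain (WLC) conformation model mentioned above, the following minimal sketch evaluates the standard textbook closed form for the mean-square end-to-end distance of a three-dimensional WLC. The function name and the choice of the three-dimensional bending persistence length are assumptions made for this example; they are not taken from the review, which itself discusses differing persistence length definitions.

```python
import math


def wlc_mean_square_end_to_end(contour_length, persistence_length):
    """Mean-square end-to-end distance <R^2> of a 3D wormlike chain.

    Uses the standard closed form
        <R^2> = 2 * Lp * L * (1 - (Lp / L) * (1 - exp(-L / Lp)))
    where L is the contour length and Lp the bending persistence length
    (both in the same length unit). Hypothetical helper for illustration.
    """
    L, Lp = contour_length, persistence_length
    if L <= 0 or Lp <= 0:
        raise ValueError("lengths must be positive")
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x
    return 2.0 * Lp * L * (1.0 - (Lp / L) * (-math.expm1(-L / Lp)))
```

    With a contour length of 1000 nm and the commonly quoted persistence length of roughly 50 nm for double-stranded DNA, the function returns roughly 9.5 × 10^4 nm². In the limit L >> Lp the result approaches the flexible-chain value 2·Lp·L, and for L << Lp it approaches the rigid-rod value L².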

  18. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites.

    PubMed

    Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong

    2015-01-01

    Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HelaS3-ifnα4h cells (interferon treatment on HeLaS3 cells for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors, including IRF, IRF-2, IRF-9, IRF-1 and IRF-3, ICSBP, belong to the interferon regulatory factor family and upregulate the gene expression levels responding to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model are consistent across different criteria for selecting transcription factor binding sites. Our model demonstrated the potential to computationally identify the functional transcription factors in gene regulation.
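
    The position weight matrix scanning step referred to above can be sketched generically as follows. This is an illustrative example only: the count matrix `TOY_COUNTS`, the uniform background of 0.25, the pseudocount, and both function names are hypothetical and are not taken from TRANSFAC or from the authors' model.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    counts: list of dicts, one per motif position, mapping base -> count.
    A uniform background frequency is assumed for simplicity.
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def best_site(sequence, matrix):
    """Slide the matrix along the sequence; return (score, start) of the best window."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical counts for a 4-bp motif resembling "GAAA" (illustration only)
TOY_COUNTS = [
    {"A": 1, "C": 0, "G": 8, "T": 1},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 9, "C": 1, "G": 0, "T": 0},
    {"A": 8, "C": 1, "G": 1, "T": 0},
]
```

    In a workflow like the one described, such a scan would presumably be restricted to sequence inside DNase I hypersensitive sites; that filtering step is not shown here.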

  19. [Analysis of Conformational Features of Watson-Crick Duplex Fragments by Molecular Mechanics and Quantum Mechanics Methods].

    PubMed

    Poltev, V I; Anisimov, V M; Sanchez, C; Deriabina, A; Gonzalez, E; Garcia, D; Rivas, F; Polteva, N A

    2016-01-01

    It is generally accepted that the important characteristic features of the Watson-Crick duplex originate from the molecular structure of its subunits. However, it still remains to elucidate what properties of each subunit are responsible for the significant characteristic features of the DNA structure. The computations of desoxydinucleoside monophosphates complexes with Na-ions using density functional theory revealed a pivotal role of DNA conformational properties of single-chain minimal fragments in the development of unique features of the Watson-Crick duplex. We found that directionality of the sugar-phosphate backbone and the preferable ranges of its torsion angles, combined with the difference between purines and pyrimidines in ring bases, define the dependence of three-dimensional structure of the Watson-Crick duplex on nucleotide base sequence. In this work, we extended these density functional theory computations to the minimal fragments of the DNA duplex, complementary desoxydinucleoside monophosphates complexes with Na-ions. Using several computational methods and various functionals, we performed a search for energy minima of BI-conformation for complementary desoxydinucleoside monophosphates complexes with different nucleoside sequences. Two sequences are optimized using ab initio method at the MP2/6-31++G** level of theory. The analysis of torsion angles, sugar ring puckering and mutual base positions of optimized structures demonstrates that the conformational characteristic features of complementary desoxydinucleoside monophosphates complexes with Na-ions remain within BI ranges and become closer to the corresponding characteristic features of the Watson-Crick duplex crystals. Qualitatively, the main characteristic features of each studied complementary desoxydinucleoside monophosphates complex remain invariant when different computational methods are used, although the quantitative values of some conformational parameters could vary lying within the limits typical for the corresponding family. We observe that popular functionals in density functional theory calculations lead to the overestimated distances between base pairs, while MP2 computations and the newer complex functionals produce the structures that have too close atom-atom contacts. A detailed study of some complementary desoxydinucleoside monophosphate complexes with Na-ions highlights the existence of several energy minima corresponding to BI-conformations, in other words, the complexity of the relief pattern of the potential energy surface of complementary desoxydinucleoside monophosphate complexes. This accounts for variability of conformational parameters of duplex fragments with the same base sequence. Popular molecular mechanics force fields AMBER and CHARMM reproduce most of the conformational characteristics of desoxydinucleoside monophosphates and their complementary complexes with Na-ions but fail to reproduce some details of the dependence of the Watson-Crick duplex conformation on the nucleotide sequence.

  20. Image analysis in cytology: DNA-histogramming versus cervical smear prescreening.

    PubMed

    Bengtsson, E W; Nordin, B

    1993-01-01

    The visual inspection of cellular specimens and histological sections through a light microscope plays an important role in clinical medicine and biomedical research. The human visual system is very good at the recognition of various patterns but less efficient at quantitative assessment of these patterns. Some samples are prepared in great numbers, most notably the screening for cervical cancer, the so-called PAP-smears, which results in hundreds of millions of samples each year, creating a tedious mass inspection task. Numerous attempts have been made over the last 40 years to create systems that solve these two tasks, the quantitative supplement to the human visual system and the automation of mass screening. The most difficult task, the total automation, has received the greatest attention with many large scale projects over the decades. In spite of all these efforts, still no generally accepted automated prescreening device exists on the market. The main reason for this failure is the great pattern recognition capabilities needed to distinguish between cancer cells and all other kinds of objects found in the specimens: cellular clusters, debris, degenerate cells, etc. Improved algorithms, the ever-increasing processing power of computers and progress in biochemical specimen preparation techniques make it likely that eventually useful automated prescreening systems will become available. Meanwhile, much less effort has been put into the development of interactive cell image analysis systems. Still, some such systems have been developed and put into use at thousands of laboratories worldwide. In these the human pattern recognition capability is used to select the fields and objects that are to be analysed while the computational power of the computer is used for the quantitative analysis of cellular DNA content or other relevant markers. 
Numerous studies have shown that the quantitative information about the distribution of cellular DNA content is of prognostic significance in many types of cancer. Several laboratories are therefore putting these techniques into routine clinical use. The more advanced systems can also study many other markers and cellular features, some known to be of clinical interest, others useful in research. The advances in computer technology are making these systems more generally available through decreasing cost, increasing computational power and improved user interfaces. We have been involved in research and development of both automated and interactive cell analysis systems during the last 20 years. Here some experiences and conclusions from this work will be presented as well as some predictions about what can be expected in the near future.

  1. Correlation of DNA content and nucleomorphometric features with World Health Organization grading of meningiomas.

    PubMed

    Grunewald, J P; Röhl, F W; Kirches, E; Dietzmann, K

    1998-02-01

    Many studies dealing with extracranial cancer showed a strong correlation of DNA ploidy to a poor clinical outcome, recurrence, or malignancy. In brain tumors, analysis of DNA content did not always provide significant diagnostic information. In this study, DNA density and karyometric parameters of 50 meningiomas (26 Grade I, 10 Grade II, 14 Grade III) were quantitatively evaluated by digital cell image analyses of Feulgen-stained nuclei. In particular, the densitometric parameter SEXT, which describes nuclear DNA content, as well as the morphometric values LENG (a computer-assisted measurement of nuclear circumference), AREA (a computer-assisted measurement of nuclear area), FCON (a parameter that describes nuclear roundness), and CONC (a parameter describing nuclear contour), evaluated with the software IMAGE C, were correlated to World Health Organization (WHO) grading using univariate and multivariate methods. AREA and LENG values showed significant differences between tumors of Grades I and III. FCON values were unable to distinguish WHO Grade III from Grade I/II but were useful in clearly separating Grade II from Grade I tumors. CONC values detected differences between WHO Grades II and I/III tumors but not between the latter. SEXT values clearly distinguished Grade III from Grade I/II tumors. The 1c, 2c, 2.5c, and 5c exceeding rates showed no predictive values. Only the 6c exceeding rate showed a significant difference between Grades I and III. These results outline the characteristic features of the atypical (Grade II) meningiomas, which make them a recognizable tumor entity distinct from benign and anaplastic meningiomas. The combination of DNA densitometric and morphometric findings seems to be a powerful addition to the histopathologic classification of meningiomas, as suggested by the WHO.

  2. A computational model for histone mark propagation reproduces the distribution of heterochromatin in different human cell types.

    PubMed

    Schwämmle, Veit; Jensen, Ole Nørregaard

    2013-01-01

    Chromatin is a highly compact and dynamic nuclear structure that consists of DNA and associated proteins. The main organizational unit is the nucleosome, which consists of a histone octamer with DNA wrapped around it. Histone proteins are implicated in the regulation of eukaryote genes and they carry numerous reversible post-translational modifications that control DNA-protein interactions and the recruitment of chromatin binding proteins. Heterochromatin, the transcriptionally inactive part of the genome, is densely packed and contains histone H3 that is methylated at Lys 9 (H3K9me). The propagation of H3K9me in nucleosomes along the DNA in chromatin is antagonized by methylation of H3 Lysine 4 (H3K4me) and acetylations of several lysines, which are related to euchromatin and active genes. We show that the related histone modifications form antagonized domains on a coarse scale. These histone marks are assumed to be initiated within distinct nucleation sites in the DNA and to propagate bi-directionally. We propose a simple computer model that simulates the distribution of heterochromatin in human chromosomes. The simulations are in agreement with previously reported experimental observations from two different human cell lines. We reproduced different types of barriers between heterochromatin and euchromatin providing a unified model for their function. The effect of changes in the nucleation site distribution and of propagation rates were studied. The former occurs mainly with the aim of (de-)activation of single genes or gene groups and the latter has the power of controlling the transcriptional programs of entire chromosomes. Generally, the regulatory program of gene transcription is controlled by the distribution of nucleation sites along the DNA string.
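
    The idea of marks that start at nucleation sites and spread bi-directionally until they meet an antagonistic domain can be illustrated with a toy one-dimensional lattice. The sketch below is not the authors' model: the function name, the labels "H" (heterochromatin-like mark) and "E" (euchromatin-like mark), the per-step spreading probabilities, and the rule that marked sites are never overwritten are all simplifying assumptions chosen for illustration.

```python
import random


def propagate_marks(n_sites, nucleation, rates, steps, seed=0):
    """Spread antagonistic marks bidirectionally from nucleation sites.

    n_sites:    number of nucleosome positions on the lattice
    nucleation: dict position -> mark label (e.g. "H" or "E")
    rates:      dict mark label -> per-step probability that a marked site
                spreads its mark to an unmarked neighbour
    Marked sites are never overwritten, so opposing domains stop where
    they meet. When two marks compete for the same site in one step, the
    left neighbour wins (an arbitrary simplification of this sketch).
    """
    rng = random.Random(seed)
    state = [None] * n_sites
    for pos, mark in nucleation.items():
        state[pos] = mark
    for _ in range(steps):
        new_state = list(state)
        for pos, mark in enumerate(state):
            if mark is None:
                continue
            for neighbour in (pos - 1, pos + 1):
                if 0 <= neighbour < n_sites and new_state[neighbour] is None:
                    if rng.random() < rates[mark]:
                        new_state[neighbour] = mark
        state = new_state
    return "".join(m if m is not None else "." for m in state)
```

    Changing the nucleation positions shifts where individual domains begin, whereas changing the rates shifts every boundary between competing domains, which loosely mirrors the distinction drawn in the abstract between local and chromosome-wide control.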

  3. Supercoil Formation During DNA Melting

    NASA Astrophysics Data System (ADS)

    Sayar, Mehmet; Avsaroglu, Baris; Kabakcioglu, Alkan

    2009-03-01

    Supercoil formation plays a key role in determining the structure-function relationship in DNA. Biological and technological processes, such as protein synthesis, polymerase chain reaction, and microarrays rely on separation of the two strands in DNA, which is coupled to the unwinding of the supercoiled structure. This problem has been studied theoretically via Peyrard-Bishop and Poland-Scheraga type models, which include a simple representation of the DNA structural properties. In recent years, computational models, which provide a more realistic representation of the DNA molecule, have been used to study the melting behavior of short DNA chains. Here, we will present a new coarse-grained model of DNA which is capable of simulating sufficiently long DNA chains for studying the supercoil formation during melting, without sacrificing the local structural properties. Our coarse-grained model successfully reproduces the local geometry of the DNA molecule, such as the 3'-5' directionality, major-minor groove structure, and the helical pitch. We will present our initial results on the dynamics of supercoiling during DNA melting.

  4. Sub-Terahertz Spectroscopy of E. coli DNA: Experiment, Statistical Model, and MD Simulations

    NASA Astrophysics Data System (ADS)

    Sizov, I.; Dorofeeva, T.; Khromova, T.; Gelmont, B.; Globus, T.

    2012-06-01

    We will present results of a combined experimental and computational study of sub-THz absorption spectra from Escherichia coli (E. coli) DNA. Measurements were conducted using a Bruker FTIR spectrometer with a liquid helium cooled bolometer and a recently developed frequency domain sensor operating at room temperature, with spectral resolution of 0.25 cm⁻¹ and 0.03 cm⁻¹, respectively. We have earlier demonstrated that molecular dynamics (MD) simulation can be effectively applied for characterizing relatively small biological molecules, such as transfer RNA or the small protein thioredoxin from E. coli, and help to understand and predict their absorption spectra. The large size of DNA macromolecules (~5 million base pairs for E. coli DNA) prevents, however, direct application of MD simulation at the current level of computational capabilities. Therefore, by applying a second order Markov chain approach and Monte-Carlo technique, we have developed a new statistical model to construct DNA sequences from biological cells. These short representative sequences (20-60 base pairs) are built upon the most frequently repeated fragments (2-10 base pairs) in the original DNA. Using this new approach, we constructed DNA sequences for several non-pathogenic strains of E. coli, including a well-known strain BL21, uro-pathogenic strain, CFT073, and deadly EDL933 strain (O157:H7), and used MD simulations to calculate vibrational absorption spectra of these strains. Significant differences between strains are clearly present both in averaged spectra and in all components for particular orientations. The mechanism of interaction of THz radiation with a biological molecule is studied by analyzing dynamics of atoms and correlation of local vibrations in the modeled molecule. Simulated THz vibrational spectra of DNA are compared with experimental results. With the spectral resolution of 0.1 cm⁻¹ or better, which is now available in experiments, easy discrimination between different strains of the same bacteria becomes possible.
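
    A generic second-order Markov chain with Monte Carlo sampling, of the kind named in the abstract, can be sketched as below. This is only an illustration of the general technique: the authors' statistical model, which builds sequences from the most frequently repeated fragments, is not described in enough detail here to reproduce, and the function names and parameters are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_second_order(genome):
    """Count which base follows each pair of preceding bases."""
    table = defaultdict(Counter)
    for i in range(len(genome) - 2):
        table[genome[i:i + 2]][genome[i + 2]] += 1
    return table


def sample_sequence(table, start, length, seed=0):
    """Monte Carlo sampling: each new base depends on the previous two.

    start must be a two-base context; sampling stops early if a context
    is reached that never had a successor in the training sequence.
    """
    rng = random.Random(seed)
    seq = start
    while len(seq) < length:
        followers = table.get(seq[-2:])
        if not followers:
            break
        bases = list(followers)
        weights = [followers[b] for b in bases]
        seq += rng.choices(bases, weights=weights)[0]
    return seq
```

    Every trinucleotide in a sampled sequence occurs somewhere in the training sequence, so short samples retain the local composition of the source while being far smaller than the genome.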

  5. On the topology of chromatin fibres

    PubMed Central

    Barbi, Maria; Mozziconacci, Julien; Victor, Jean-Marc; Wong, Hua; Lavelle, Christophe

    2012-01-01

    The ability of cells to pack, use and duplicate DNA remains one of the most fascinating questions in biology. To understand DNA organization and dynamics, it is important to consider the physical and topological constraints acting on it. In the eukaryotic cell nucleus, DNA is organized by proteins acting as spools on which DNA can be wrapped. These proteins can subsequently interact and form a structure called the chromatin fibre. Using a simple geometric model, we propose a general method for computing topological properties (twist, writhe and linking number) of the DNA embedded in those fibres. The relevance of the method is reviewed through the analysis of magnetic tweezers single molecule experiments that revealed unexpected properties of the chromatin fibre. Possible biological implications of these results are discussed. PMID:24098838

  6. Computational Model for DNA Organization Mediated by Protein Interaction in Prokaryotes

    NASA Astrophysics Data System (ADS)

    Garimella, Karthik; Kharel, Savan

    2016-03-01

    In Escherichia coli, there are several mechanisms that drive chromosomal organization. We know through experiments that the E. coli chromosome is condensed into highly structured regions known as macrodomains (MDs). One of the regions known as the Terminus undergoes DNA-bridging condensation that forms loops between distant DNA sites and it is known to be mediated by a Terminus specific protein, which binds to specific markers within the Terminus region. In the absence of Terminus specific protein, however, the Terminus region is known to not condense nearly as much, which will likely impede several biological processes including DNA replication. In order to understand the molecular basis of protein mediation in vivo several models of Terminus specific segregation have been constructed in silico which model DNA as polymer chains.

  7. Superimposed Code Theoretic Analysis of Deoxyribonucleic Acid (DNA) Codes and DNA Computing

    DTIC Science & Technology

    2010-01-01

    partitioned by font type) of sequences are allowed to be in each position (e.g., Arial = position 0, Comic = position 1, etc. ) and within each collection...movement was modeled by a Brownian motion 3 dimensional random walk. The one dimensional diffusion coefficient D for the ellipsoid shape with 3...temperature, kB is Boltzmann’s constant, and η is the viscosity of the medium. The random walk motion is modeled by assuming the oligo is on a three

  8. Interfacing Neural Network Components and Nucleic Acids

    PubMed Central

    Lissek, Thomas

    2017-01-01

    Translating neural activity into nucleic acid modifications in a controlled manner harbors unique advantages for basic neurobiology and bioengineering. It would allow for a new generation of biological computers that store output in ultra-compact and long-lived DNA and enable the investigation of animal nervous systems at unprecedented scales. Furthermore, by exploiting the ability of DNA to precisely influence neuronal activity and structure, it could be possible to more effectively create cellular therapy approaches for psychiatric diseases that are currently difficult to treat. PMID:29255707

  9. Entropic Profiler – detection of conservation in genomes using information theory

    PubMed Central

    Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana

    2009-01-01

    Background: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings: The Entropic Profiler application here presented is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP very efficiently by resorting to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or resuming a previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion: EP are directly related with the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
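
    The Chaos Game Representation mentioned above maps a DNA sequence to points in the unit square by repeatedly moving halfway toward the corner assigned to each successive base. The sketch below uses one common corner convention (A bottom-left, C top-left, G top-right, T bottom-right) and starts from the centre of the square; both choices, and the function name, are assumptions of this example rather than details of the Entropic Profiler implementation.

```python
# Assumed corner convention for this sketch
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game(sequence, start=(0.5, 0.5)):
    """Return the CGR point reached after each base of the sequence.

    Each step moves the current point halfway toward the corner of the
    base just read, so the final point encodes the whole prefix, with the
    most recent bases determining the coarsest position.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

    Because sequences sharing a suffix of length k land in the same sub-square of side 2^-k, counting points per sub-square gives motif frequencies, which is the link between this representation and motif over- or under-representation.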

  10. 20 Discoveries that Shaped Our Lives: Century of the Sciences.

    ERIC Educational Resources Information Center

    Judson, Horace Freeland

    1984-01-01

    Describes (in separate articles) 20 developments in science, technology, and medicine that were made during the twentieth century and had significant impact on society. They include discoveries related to intelligence tests, plastics, aviation, antibiotics, genetics, evolution, birth control, computers, transistors, DNA, lasers, statistics,…

  11. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    USDA-ARS's Scientific Manuscript database

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  12. The Teaching of Protein Synthesis--A Microcomputer Based Method.

    ERIC Educational Resources Information Center

    Goodridge, Frank

    1983-01-01

    Describes two computer programs (BASIC for 32K Commodore PET) for teaching protein synthesis. The first is an interactive test of base-pairing knowledge, and the second generates random DNA nucleotide sequences, with instructions for substitution, insertion, and deletion printed out for each student. (JN)

  13. Really big data: Processing and analysis of large datasets

    USDA-ARS's Scientific Manuscript database

    Modern animal breeding datasets are large and getting larger, due in part to the recent availability of DNA data for many animals. Computational methods for efficiently storing and analyzing those data are under development. The amount of storage space required for such datasets is increasing rapidl...

  14. A Novel Computational Method to Reduce Leaky Reaction in DNA Strand Displacement.

    PubMed

    Li, Xin; Wang, Xun; Song, Tao; Lu, Wei; Chen, Zhihua; Shi, Xiaolong

    2015-01-01

    The DNA strand displacement technique is widely used in DNA programming, DNA biosensors, and gene analysis. In DNA strand displacement, leaky reactions can cause DNA signals to decay and signal detection to fail. The most commonly used method to avoid leakage is cleaning up after upstream leaky reactions, and it remains a challenge to develop a reliable DNA strand displacement technique with low leakage. In this work, we address the challenge by experimentally evaluating the basic factors affecting leakage in DNA strand displacement, including reaction time, ratio of reactants, and ion concentration. Specifically, fluorescent probes and a hairpin structure reporting DNA strand are designed to detect the output of DNA strand displacement, and thus can evaluate the leakage of DNA strand displacement reactions with different reaction times, ratios of reactants, and ion concentrations. From the obtained data, mathematical models for evaluating leakage are achieved by curve derivation. As a result, we find that long time incubation, high concentration of fuel strand, and inappropriate amount of ion concentration can weaken leaky reactions. This contributes to a method to set proper reaction conditions to reduce leakage in DNA strand displacement.

  15. DNA nanotechnology: a future perspective

    PubMed Central

    2013-01-01

    In addition to its genetic function, DNA is one of the most distinct and smart self-assembling nanomaterials. DNA nanotechnology exploits the predictable self-assembly of DNA oligonucleotides to design and assemble innovative and highly discrete nanostructures. Highly ordered DNA motifs are capable of providing an ultra-fine framework for the next generation of nanofabrications. The majority of these applications are based upon the complementarity of DNA base pairing: adenine with thymine, and guanine with cytosine. DNA provides an intelligent route for the creation of nanoarchitectures with programmable and predictable patterns. DNA strands twist along one helix for a number of bases before switching to the other helix by passing through a crossover junction. The association of two crossovers keeps the helices parallel and holds them tightly together, allowing the assembly of bigger structures. Because of the DNA molecule's unique and novel characteristics, it can easily be applied in a vast variety of multidisciplinary research areas like biomedicine, computer science, nano/optoelectronics, and bionanotechnology. PMID:23497147

  16. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]

    2011-01-18

    A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.

  17. Effects of sequence on DNA wrapping around histones

    NASA Astrophysics Data System (ADS)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  18. Modeling photoionization of aqueous DNA and its components.

    PubMed

    Pluhařová, Eva; Slavíček, Petr; Jungwirth, Pavel

    2015-05-19

    Radiation damage to DNA is usually considered in terms of UVA and UVB radiation. These ultraviolet rays, which are part of the solar spectrum, can indeed cause chemical lesions in DNA, triggered by photoexcitation particularly in the UVB range. Damage can, however, be also caused by higher energy radiation, which can ionize directly the DNA or its immediate surroundings, leading to indirect damage. Thanks to absorption in the atmosphere, the intensity of such ionizing radiation is negligible in the solar spectrum at the surface of Earth. Nevertheless, such an ionizing scenario can become dangerously plausible for astronauts or flight personnel, as well as for persons present at nuclear power plant accidents. On the beneficial side, ionizing radiation is employed as means for destroying the DNA of cancer cells during radiation therapy. Quantitative information about ionization of DNA and its components is important not only for DNA radiation damage, but also for understanding redox properties of DNA in redox sensing or labeling, as well as charge migration along the double helix in nanoelectronics applications. Until recently, the vast majority of experimental and computational data on DNA ionization was pertinent to its components in the gas phase, which is far from its native aqueous environment. The situation has, however, changed for the better due to the advent of photoelectron spectroscopy in liquid microjets and its most recent application to photoionization of aqueous nucleosides, nucleotides, and larger DNA fragments. Here, we present a consistent and efficient computational methodology, which allows to accurately evaluate ionization energies and model photoelectron spectra of aqueous DNA and its individual components. 
    After careful benchmarking, the method based on density functional theory and its time-dependent variant with properly chosen hybrid functionals and polarizable continuum solvent model provides ionization energies with accuracy of 0.2-0.3 eV, allowing for faithful modeling and interpretation of DNA photoionization. The key finding is that the aqueous medium is remarkably efficient in screening the interactions within DNA such that, unlike in the gas phase, ionization of a base, nucleoside, or nucleotide depends only very weakly on the particular DNA context. An exception is the electronic interaction between neighboring bases which can lead to sequence-specific effects, such as a partial delocalization of the cationic hole upon ionization enabled by presence of adjacent bases of the same type.

  19. Remote control of nanoscale devices

    NASA Astrophysics Data System (ADS)

    Högberg, Björn

    2018-01-01

    Processes that occur at the nanometer scale have a tremendous impact on our daily lives. Sophisticated evolved nanomachines operate in each of our cells; we also, as a society, increasingly rely on synthetic nanodevices for communication and computation. Scientists are still only beginning to master this scale, but, recently, DNA nanotechnology (1)—in particular, DNA origami (2)—has emerged as a powerful tool to build structures precise enough to help us do so. On page 296 of this issue, Kopperger et al. (3) show that they are now also able to control the motion of a DNA origami device from the outside by applying electric fields.

  20. Implementation of Arithmetic and Nonarithmetic Functions on a Label-free and DNA-based Platform

    NASA Astrophysics Data System (ADS)

    Wang, Kun; He, Mengqi; Wang, Jin; He, Ronghuan; Wang, Jianhua

    2016-10-01

    A series of complex logic gates were constructed based on graphene oxide and DNA-templated silver nanoclusters to perform both arithmetic and nonarithmetic functions. For the purpose of satisfying the requirements of progressive computational complexity and cost-effectiveness, a label-free and universal platform was developed by integration of various functions, including half adder, half subtractor, multiplexer and demultiplexer. The label-free system avoided laborious modification of biomolecules. The designed DNA-based logic gates can be implemented with readout of near-infrared fluorescence, and exhibit great potential applications in the field of bioimaging as well as disease diagnosis.
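    The four functions named in this abstract are standard Boolean circuits. The sketch below shows only their input/output behaviour on bits; it does not model the graphene oxide or silver-nanocluster chemistry, and the function names are illustrative rather than taken from the paper.

```python
from typing import Tuple


def half_adder(a: int, b: int) -> Tuple[int, int]:
    """Return (sum, carry) for two input bits: sum is XOR, carry is AND."""
    return a ^ b, a & b


def half_subtractor(a: int, b: int) -> Tuple[int, int]:
    """Return (difference, borrow) for a - b: difference is XOR, borrow is (NOT a) AND b."""
    return a ^ b, (1 - a) & b


def multiplexer(d0: int, d1: int, select: int) -> int:
    """2:1 multiplexer: forward d1 when select is 1, otherwise d0."""
    return d1 if select else d0


def demultiplexer(d: int, select: int) -> Tuple[int, int]:
    """1:2 demultiplexer: route d to output 1 when select is 1, otherwise to output 0."""
    return (0, d) if select else (d, 0)


if __name__ == "__main__":
    # Print the truth tables of the two arithmetic gates.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b), half_subtractor(a, b))
```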

  1. Identification of unique repeated patterns, location of mutation in DNA finger printing using artificial intelligence technique.

    PubMed

    Mukunthan, B; Nagaveni, N

    2014-01-01

    In genetic engineering, the conventional techniques and algorithms employed by forensic scientists to identify individuals on the basis of their DNA profiles involve complex computational steps and mathematical formulae, and identifying the location of a mutation in a genomic sequence in the laboratory is still an exigent task. This novel approach provides the ability to solve problems that do not have an algorithmic solution or whose available solutions are too complex to be found. The blend of bioinformatics and neural network techniques results in an efficient DNA pattern analysis algorithm with utmost prediction accuracy.

  2. DNA-Enabled Integrated Molecular Systems for Computation and Sensing

    DTIC Science & Technology

    2014-05-21

    nanostructures to create nanophotonic networks that undergo nonradiative, near-field energy transfer. This process is known as resonance energy transfer (RET...promoting A to A*. The A* species can then decay nonradiatively or emit a photon of energy hν2. When chromophores are too far away, they cannot efficiently

  3. Function-Based Algorithms for Biological Sequences

    ERIC Educational Resources Information Center

    Mohanty, Pragyan Sheela P.

    2015-01-01

    Two problems at two different abstraction levels of computational biology are studied. At the molecular level, efficient pattern matching algorithms in DNA sequences are presented. For gene order data, an efficient data structure is presented capable of storing all gene re-orderings in a systematic manner. A common characteristic of presented…

  4. EVALUATION OF DNA CHIPS (MICROARRAYS) FOR DETERMINING VIRULENCE FACTOR ACTIVITY RELATIONSHIPS (VFARS)

    EPA Science Inventory

    Computational toxicology is a rapid approach to screening for toxic effects and looking for common outcomes that can result in predictive models. The long term project will result in the development of a database of mRNA responses to known water-borne pathogens. An understanding...

  5. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  6. GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads

    PubMed Central

    Manconi, Andrea; Orro, Alessandro; Manca, Emanuele; Armano, Giuliano; Milanesi, Luciano

    2014-01-01

    Cytosine DNA methylation is an epigenetic mark implicated in several biological processes. Bisulfite treatment of DNA is acknowledged as the gold standard technique to study methylation. This technique introduces changes in the genomic DNA by converting cytosines to uracils while 5-methylcytosines remain nonreactive. During PCR amplification 5-methylcytosines are amplified as cytosine, whereas uracils and thymines as thymine. To detect the methylation levels, reads treated with the bisulfite must be aligned against a reference genome. Mapping these reads to a reference genome represents a significant computational challenge mainly due to the increased search space and the loss of information introduced by the treatment. To deal with this computational challenge we devised GPU-BSM, a tool based on modern Graphics Processing Units. Graphics Processing Units are hardware accelerators that are increasingly being used successfully to accelerate general-purpose scientific applications. GPU-BSM is a tool able to map bisulfite-treated reads from whole genome bisulfite sequencing and reduced representation bisulfite sequencing, and to estimate methylation levels, with the goal of detecting methylation. Due to the massive parallelization obtained by exploiting graphics cards, GPU-BSM aligns bisulfite-treated reads faster than other cutting-edge solutions, while outperforming most of them in terms of unique mapped reads. PMID:24842718
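    The conversion rule described in this abstract (unmethylated C reads out as T after bisulfite treatment and PCR, while 5-methylcytosine stays C) can be illustrated with a small in-silico sketch. This is not GPU-BSM's mapping algorithm: it handles only the forward strand, assumes the read is already placed on the reference at a known offset, and all function and parameter names are hypothetical.

```python
from typing import Dict, FrozenSet


def bisulfite_convert(read: str, methylated: FrozenSet[int] = frozenset()) -> str:
    """Simulate bisulfite treatment plus PCR on a forward-strand read.

    Cytosines at positions listed in `methylated` stay 'C';
    every other cytosine is read out as 'T'.
    """
    out = []
    for i, base in enumerate(read.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def methylation_calls(reference: str, converted_read: str, offset: int = 0) -> Dict[int, bool]:
    """Call methylation at each reference cytosine covered by the read.

    True means the read still shows 'C' (methylated);
    False means it shows 'T' (unmethylated). Other bases are ignored.
    """
    calls = {}
    for i, base in enumerate(converted_read):
        ref_pos = offset + i
        if reference[ref_pos] == "C":
            if base == "C":
                calls[ref_pos] = True
            elif base == "T":
                calls[ref_pos] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"
    read = bisulfite_convert(reference, methylated=frozenset({4}))
    print(read)                                # ATGTCTGA
    print(methylation_calls(reference, read))  # {1: False, 4: True, 5: False}
```

    Because a converted T can match either a reference T or a reference C, a mapper has to tolerate this ambiguity, which is one way to see the enlarged search space the abstract mentions.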

  7. A neonate with reduced cytomegalovirus DNA copy number and marked improvement of hearing in the treatment of congenital cytomegalovirus infection.

    PubMed

    Hayakawa, Jun; Kawakami, Yasuhiko; Takeda, Sachiyo; Ozawa, Hiroshi; Fukazawa, Ryuji; Takase, Masato; Fukunaga, Yoshitaka

    2012-01-01

    Congenital cytomegalovirus (CMV) infection can cause severe permanent disabilities. A mother who is seronegative before conception but acquires infection during pregnancy is a risk factor for congenital infection. We describe a neonate in whom congenital CMV infection was diagnosed at birth and confirmed with DNA quantitation by means of the polymerase chain reaction, was accompanied by cerebral ventriculomegaly and severe hearing loss, and was treated with ganciclovir/valganciclovir for 6 weeks. Initially, cerebral ventriculomegaly and calcification were also found with computed tomography, and severe hearing loss was detected with auditory brainstem response testing. After treatment, CMV DNA decreased in copy number and became undetectable. No marked side effects occurred after treatment. Surprisingly, 1 year after treatment, neurological and motor development was equivalent to that in a healthy infant. Audiometry indicated that auditory ability would improve with rehabilitation, speech and language therapy, and cochlear implantation. Single-photon emission computed tomography showed marked improvement 6 months after treatment. This case provides compelling evidence that a reliable diagnosis of congenital CMV infections coupled with a prompt and appropriate treatment program can prevent permanent disability. It is, therefore, important to establish a more effective strategy for the management of congenital CMV infection.

  8. Three-dimensional DNA image cytometry by optical projection tomographic microscopy for early cancer diagnosis.

    PubMed

    Agarwal, Nitin; Biancardi, Alberto M; Patten, Florence W; Reeves, Anthony P; Seibel, Eric J

    2014-04-01

    Aneuploidy is typically assessed by flow cytometry (FCM) and image cytometry (ICM). We used optical projection tomographic microscopy (OPTM) for assessing cellular DNA content using absorption and fluorescence stains. OPTM combines some of the attributes of both FCM and ICM and generates isometric high-resolution three-dimensional (3-D) images of single cells. Although the depth of field of the microscope objective was in the submicron range, it was extended by scanning the objective's focal plane. The extended depth of field image is similar to a projection in a conventional x-ray computed tomography. These projections were later reconstructed using computed tomography methods to form a 3-D image. We also present an automated method for 3-D nuclear segmentation. Nuclei of chicken, trout, and triploid trout erythrocyte were used to calibrate OPTM. Ratios of integrated optical densities extracted from 50 images of each standard were compared to ratios of DNA indices from FCM. A comparison of mean square errors with thionin, hematoxylin, Feulgen, and SYTOX green was done. Feulgen technique was preferred as it showed highest stoichiometry, least variance, and preserved nuclear morphology in 3-D. The addition of this quantitative biomarker could further strengthen existing classifiers and improve early diagnosis of cancer using 3-D microscopy.

  9. Functional analysis of rare variants in mismatch repair proteins augments results from computation-based predictive methods

    PubMed Central

    Arora, Sanjeevani; Huwe, Peter J.; Sikder, Rahmat; Shah, Manali; Browne, Amanda J.; Lesh, Randy; Nicolas, Emmanuelle; Deshpande, Sanat; Hall, Michael J.; Dunbrack, Roland L.; Golemis, Erica A.

    2017-01-01

    The cancer-predisposing Lynch Syndrome (LS) arises from germline mutations in DNA mismatch repair (MMR) genes, predominantly MLH1, MSH2, MSH6, and PMS2. A major challenge for clinical diagnosis of LS is the frequent identification of variants of uncertain significance (VUS) in these genes, as it is often difficult to determine variant pathogenicity, particularly for missense variants. Generic programs such as SIFT and PolyPhen-2, and MMR gene-specific programs such as PON-MMR and MAPP-MMR, are often used to predict deleterious or neutral effects of VUS in MMR genes. We evaluated the performance of multiple predictive programs in the context of functional biologic data for 15 VUS in MLH1, MSH2, and PMS2. Using cell line models, we characterized VUS predicted to range from neutral to pathogenic on mRNA and protein expression, basal cellular viability, viability following treatment with a panel of DNA-damaging agents, and functionality in DNA damage response (DDR) signaling, benchmarking to wild-type MMR proteins. Our results suggest that the MMR gene-specific classifiers do not always align with the experimental phenotypes related to DDR. Our study highlights the importance of complementary experimental and computational assessment to develop future predictors for the assessment of VUS. PMID:28494185

  10. Optogenetics and computer vision for Caenorhabditis elegans neuroscience and other biophysical applications

    NASA Astrophysics Data System (ADS)

    Leifer, Andrew Michael

    2011-07-01

    This work presents optogenetics and real-time computer vision techniques to non-invasively manipulate and monitor neural activity with high spatiotemporal resolution in awake behaving Caenorhabditis elegans. These methods were employed to dissect the nematode's mechanosensory and motor circuits and to elucidate the neural control of wave propagation during forward locomotion. Additionally, similar computer vision methods were used to automatically detect and decode fluorescing DNA origami nanobarcodes, a new class of fluorescent reporter constructs. An optogenetic instrument capable of real-time light delivery with high spatiotemporal resolution to specified targets in freely moving C. elegans, the first such instrument of its kind, was developed. The instrument was used to probe the nematode's mechanosensory circuit, demonstrating that stimulation of a single mechanosensory neuron suffices to induce reversals. The instrument was also used to probe the motor circuit, demonstrating that inhibition of regions of cholinergic motor neurons blocks undulatory wave propagation and that muscle contractions can persist even without inputs from the motor neurons. The motor circuit was further probed using optogenetics and microfluidic techniques. Undulatory wave propagation during forward locomotion was observed to depend on stretch-sensitive signaling mediated by cholinergic motor neurons. Specifically, posterior body segments are compelled, through stretch-sensitive feedback, to bend in the same direction as anterior segments. This is the first explicit demonstration of such feedback and serves as a foundation for understanding motor circuits in other organisms. A real-time tracking system was developed to record intracellular calcium transients in single neurons while simultaneously monitoring macroscopic behavior of freely moving C. elegans. This was used to study the worm's stereotyped reversal behavior, the omega turn. 
    Calcium transients corresponding to temporal features of the omega turn were observed in interneurons AVA and AVB. Optics and computer vision techniques similar to those developed for the C. elegans experiments were also used to detect DNA origami nanorod barcodes. An optimal Bayesian multiple hypothesis test was deployed to unambiguously classify each barcode as a member of one of 216 distinct barcode species. Overall, this set of experiments demonstrates the powerful role that optogenetics and computer vision can play in behavioral neuroscience and quantitative biophysics.

  11. A novel computational approach "BP-STOCH" to study ligand binding to finite lattice.

    PubMed

    Beshnova, Daria A; Bereznyak, Ekaterina G; Shestopalova, Anna V; Evstigneev, Maxim P

    2011-03-01

    We report a novel computational algorithm, "BP-STOCH", for studying single-type ligand binding with biopolymers of finite length, such as DNA oligonucleotides or oligopeptides. It is based on the idea of representing any type of ligand-biopolymer complex as a binary number, where "0" and "1" bits stand for vacant and engaged monomers of the biopolymer, respectively. Cycling over all binary numbers from the lowest, 0, up to the highest, 2^N - 1, sequentially generates all possible configurations of vacant/engaged monomers, which, after proper filtering, results in a full set of possible types of complexes in solution between the ligand and the N-site lattice. The principal advantage of the BP-STOCH algorithm is the possibility to incorporate into this cycle any conditions on computation of the concentrations and observed experimental parameters of the complexes in solution, and programmatic access to each monomer of the biopolymer within each binding site of every binding configuration. The latter is equivalent to unlimited extension of the basic reaction scheme and allows the BP-STOCH algorithm to be used as an alternative to conventional computational approaches.
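    The enumerate-then-filter idea can be sketched as follows. The abstract does not state the filtering rules, so the filter here is an assumption made for illustration: each ligand is taken to cover `ligand_size` contiguous monomers, so a bit pattern is kept only if every maximal run of engaged monomers has a length divisible by `ligand_size`. The function names are hypothetical and this is not the published BP-STOCH code.

```python
from typing import Iterator, List


def runs_of_ones(config: int, n_sites: int) -> Iterator[int]:
    """Yield the lengths of maximal runs of engaged ('1') monomers in a configuration."""
    run = 0
    for i in range(n_sites):
        if (config >> i) & 1:
            run += 1
        else:
            if run:
                yield run
            run = 0
    if run:
        yield run


def valid_configurations(n_sites: int, ligand_size: int) -> List[int]:
    """Cycle over 0 .. 2^N - 1 and keep configurations compatible with the ligand size.

    Assumed filter: every run of engaged monomers must be a whole
    number of ligands long (run length divisible by ligand_size).
    """
    valid = []
    for config in range(2 ** n_sites):
        if all(r % ligand_size == 0 for r in runs_of_ones(config, n_sites)):
            valid.append(config)
    return valid


if __name__ == "__main__":
    # A 4-site lattice with a ligand covering 2 monomers.
    for config in valid_configurations(4, 2):
        print(format(config, "04b"))
```

    Note that adjacent ligands merge into a single run under this encoding, so a pattern records which monomers are engaged rather than where individual ligand boundaries fall.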

  12. A Survey of Computational Intelligence Techniques in Protein Function Prediction

    PubMed Central

    Tiwari, Arvind Kumar; Srivastava, Rajeev

    2014-01-01

    In the past, knowledge of unknown proteins has grown massively with the advancement of high throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. In the past, homology based approaches were used to predict protein function, but they failed when a new protein was different from the previous one. Therefore, to alleviate the problems associated with traditional homology based approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function prediction using sequence, structure, protein-protein interaction network, and gene expression data, used in wide areas of applications such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers to solve these problems by using computational intelligence techniques with appropriate datasets to improve the prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction. PMID:25574395

  13. Efficient alignment-free DNA barcode analytics

    PubMed Central

    Kuksa, Pavel; Pavlovic, Vladimir

    2009-01-01

    Background: In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility for accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results: New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion: Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
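    A spectrum representation of the kind described here counts the short fragments (k-mers) in a sequence, so two sequences can be compared without aligning them. The sketch below is a generic illustration under stated assumptions: the choice of k = 3, the use of cosine similarity, and the function names are not taken from the paper, which may use different features and kernels.

```python
from collections import Counter
from math import sqrt


def spectrum(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in the sequence.

    The Counter is a sparse form of a fixed-length vector with 4**k entries.
    """
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two k-mer count vectors (0.0 if either is empty)."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    s1 = spectrum("ACGTACGTAC")
    s2 = spectrum("ACGTACGAAC")
    print(dict(s1))
    print(round(cosine_similarity(s1, s2), 3))
```

    Building the spectrum takes time linear in the sequence length, which is one reason alignment-free comparison can be fast.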

  14. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  15. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations

    PubMed Central

    Wang, Junbai; Batmanov, Kirill

    2015-01-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972

  16. An end-to-end workflow for engineering of biological networks from high-level specifications.

    PubMed

    Beal, Jacob; Weiss, Ron; Densmore, Douglas; Adler, Aaron; Appleton, Evan; Babb, Jonathan; Bhatia, Swapnil; Davidsohn, Noah; Haddock, Traci; Loyall, Joseph; Schantz, Richard; Vasilev, Viktor; Yaman, Fusun

    2012-08-17

    We present a workflow for the design and production of biological networks from high-level program specifications. The workflow is based on a sequence of intermediate models that incrementally translate high-level specifications into DNA samples that implement them. We identify algorithms for translating between adjacent models and implement them as a set of software tools, organized into a four-stage toolchain: Specification, Compilation, Part Assignment, and Assembly. The specification stage begins with a Boolean logic computation specified in the Proto programming language. The compilation stage uses a library of network motifs and cellular platforms, also specified in Proto, to transform the program into an optimized Abstract Genetic Regulatory Network (AGRN) that implements the programmed behavior. The part assignment stage assigns DNA parts to the AGRN, drawing the parts from a database for the target cellular platform, to create a DNA sequence implementing the AGRN. Finally, the assembly stage computes an optimized assembly plan to create the DNA sequence from available part samples, yielding a protocol for producing a sample of engineered plasmids with robotics assistance. Our workflow is the first to automate the production of biological networks from a high-level program specification. Furthermore, the workflow's modular design allows the same program to be realized on different cellular platforms simply by swapping workflow configurations. We validated our workflow by specifying a small-molecule sensor-reporter program and verifying the resulting plasmids in both HEK 293 mammalian cells and in E. coli bacterial cells.

  17. Decision Tree Algorithm-Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging.

    PubMed

    Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2018-01-01

    DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as a straightforward species tag, the identification usage of barcode sequences is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies suggest that SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than the full length of DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were computed to set up the decision rule to assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the DNA barcode for the rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
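    Splitting sequences into two groups level by level can be sketched with a toy binary tree over SNP alleles. The locus-selection rule below (take the first locus that still varies) is a simplifying assumption; the abstract does not say how the decision tree algorithm in the study chooses loci, and the species names and alleles are made up. A binary tree that separates n species has n - 1 decision nodes, which is consistent with the 37 nodes reported for 38 species.

```python
from typing import Dict, Sequence, Union

Tree = Union[str, tuple]


def build_tree(profiles: Dict[str, str], loci: Sequence[int]) -> Tree:
    """Recursively split species into two groups on one SNP locus per node.

    profiles maps a species name to its alleles, one character per locus.
    A leaf is a species name; an internal node is (locus, allele, left, right),
    where `left` holds the species carrying `allele` at `locus`.
    """
    if len(profiles) == 1:
        return next(iter(profiles))
    for locus in loci:
        alleles = sorted({p[locus] for p in profiles.values()})
        if len(alleles) > 1:
            pivot = alleles[0]
            left = {s: p for s, p in profiles.items() if p[locus] == pivot}
            right = {s: p for s, p in profiles.items() if p[locus] != pivot}
            return (locus, pivot, build_tree(left, loci), build_tree(right, loci))
    raise ValueError("some species share identical SNP profiles")


def count_decision_nodes(tree: Tree) -> int:
    """Count internal (decision) nodes in the tree."""
    if isinstance(tree, str):
        return 0
    return 1 + count_decision_nodes(tree[2]) + count_decision_nodes(tree[3])


if __name__ == "__main__":
    toy = {"sp1": "AC", "sp2": "AT", "sp3": "GC", "sp4": "GT"}  # made-up alleles
    tree = build_tree(toy, loci=range(2))
    print(tree)
    print(count_decision_nodes(tree))
```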

  18. Use of laptop computers connected to internet through Wi-Fi decreases human sperm motility and increases sperm DNA fragmentation.

    PubMed

    Avendaño, Conrado; Mata, Ariela; Sanchez Sarmiento, César A; Doncel, Gustavo F

    2012-01-01

    To evaluate the effects of laptop computers connected to local area networks wirelessly (Wi-Fi) on human spermatozoa. Prospective in vitro study. Center for reproductive medicine. Semen samples from 29 healthy donors. Motile sperm were selected by swim up. Each sperm suspension was divided into two aliquots. One sperm aliquot (experimental) from each patient was exposed to an internet-connected laptop by Wi-Fi for 4 hours, whereas the second aliquot (unexposed) was used as control, incubated under identical conditions without being exposed to the laptop. Evaluation of sperm motility, viability, and DNA fragmentation. Donor sperm samples, mostly normozoospermic, exposed ex vivo during 4 hours to a wireless internet-connected laptop showed a significant decrease in progressive sperm motility and an increase in sperm DNA fragmentation. Levels of dead sperm showed no significant differences between the two groups. To our knowledge, this is the first study to evaluate the direct impact of laptop use on human spermatozoa. Ex vivo exposure of human spermatozoa to a wireless internet-connected laptop decreased motility and induced DNA fragmentation by a nonthermal effect. We speculate that keeping a laptop connected wirelessly to the internet on the lap near the testes may result in decreased male fertility. Further in vitro and in vivo studies are needed to prove this contention. Copyright © 2012 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  19. Photoswitching of DNA Hybridization Using a Molecular Motor.

    PubMed

    Lubbe, Anouk S; Liu, Qing; Smith, Sanne J; de Vries, Jan Willem; Kistemaker, Jos C M; de Vries, Alex H; Faustino, Ignacio; Meng, Zhuojun; Szymanski, Wiktor; Herrmann, Andreas; Feringa, Ben L

    2018-04-18

    Reversible control over the functionality of biological systems via external triggers may be used in future medicine to reduce the need for invasive procedures. Additionally, externally regulated biomacromolecules are now considered as particularly attractive tools in nanoscience and the design of smart materials, due to their highly programmable nature and complex functionality. Incorporation of photoswitches into biomolecules, such as peptides, antibiotics, and nucleic acids, has generated exciting results in the past few years. Molecular motors offer the potential for new and more precise methods of photoregulation, due to their multistate switching cycle, unidirectionality of rotation, and helicity inversion during the rotational steps. Aided by computational studies, we designed and synthesized a photoswitchable DNA hairpin, in which a molecular motor serves as the bridgehead unit. After it was determined that motor function was not affected by the rigid arms of the linker, solid-phase synthesis was employed to incorporate the motor into an 8-base-pair self-complementary DNA strand. With the photoswitchable bridgehead in place, hairpin formation was unimpaired, while the motor part of this advanced biohybrid system retains excellent photochemical properties. Rotation of the motor generates large changes in structure, and as a consequence the duplex stability of the oligonucleotide could be regulated by UV light irradiation. Additionally, Molecular Dynamics computations were employed to rationalize the observed behavior of the motor-DNA hybrid. The results presented herein establish molecular motors as powerful multistate switches for application in biological environments.

  20. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    PubMed

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
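
    As a rough illustration of the third module's task, the sketch below finds perfect tandem repeats with a regular expression. It is not SSR_pipeline's search algorithm; the function name, parameters, and defaults are assumptions chosen for the example, and compound or imperfect repeats are not handled.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier prefers the shortest motif, so ACACACAC is
    reported as (AC)4 rather than (ACAC)2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Hypothetical read containing an (AC)5 and a (GAT)4 repeat.
read = "GGCT" + "AC" * 5 + "CC" + "GAT" * 4 + "GCC"
print(find_microsatellites(read))
```

    Raising `max_motif` to 25 would cover the motif range mentioned in the abstract, at the cost of more regular-expression backtracking on long reads.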

  1. Nanoscale Bio-engineering Solutions for Space Exploration: The Nanopore Sequencer

    NASA Technical Reports Server (NTRS)

    Stolc, Viktor; Cozmuta, Ioana

    2004-01-01

    Characterization of biological systems at the molecular level and extraction of essential information for nano-engineering design to guide the nano-fabrication of solid-state sensors and molecular identification devices is a computational challenge. The alpha hemolysin protein ion channel is used as a model system for structural analysis of nucleic acids like DNA. Applied voltage draws a DNA strand and surrounding ionic solution through the biological nanopore. The subunits in the DNA strand block ion flow by differing amounts. Atomistic scale simulations are employed using NASA supercomputers to study DNA translocation, with the aim to enhance single DNA subunit identification. Compared to protein channels, solid-state nanopores offer a better temporal control of the translocation of DNA and the possibility to easily tune its chemistry to increase the signal resolution. Potential applications for NASA missions, besides real-time genome sequencing include astronaut health, life detection and decoding of various genomes.

  2. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    PubMed

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is a computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or the user can upload sequences of interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, and sequence-specific information including the total number of nucleotide bases and the A, T, G, and C base contents with their respective percentages, along with a sequence cleaner. RDNAnalyzer was developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer, Random DNA Analyser; GUI, graphical user interface; XAML, Extensible Application Markup Language.
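
    For orientation, here is a minimal sketch of the textbook Nussinov recurrence, which maximizes the number of nested base pairs, together with a reverse complement helper. It is written in Python rather than the tool's C#, and it does not reproduce RDNAnalyzer's extensions; the `min_loop` parameter and the restriction to Watson-Crick A-T and G-C pairs are assumptions made for this example.

```python
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (textbook Nussinov DP).

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])  # leave i or j unpaired
            if (seq[i], seq[j]) in PAIRS:
                best = max(best, dp[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):  # split into two independent parts
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best
    return dp[0][n - 1]


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, T."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


print(nussinov_max_pairs("GGGAAATCCC"))  # hairpin with a three-pair stem
print(reverse_complement("ATGC"))
```

    The triple loop runs in O(n^3) time and O(n^2) memory, which is adequate for short sequences; recovering the actual pairings would require an additional traceback over the `dp` table.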

  3. Functional specificity of a Hox protein mediated by the recognition of minor groove structure.

    PubMed

    Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S

    2007-11-02

    The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.

  4. A coarse-grained DNA model for the prediction of current signals in DNA translocation experiments

    NASA Astrophysics Data System (ADS)

    Weik, Florian; Kesselheim, Stefan; Holm, Christian

    2016-11-01

    We present an implicit solvent coarse-grained double-stranded DNA (dsDNA) model confined to an infinite cylindrical pore that reproduces the experimentally observed current modulations of a KCl solution at various concentrations. Our model extends previous coarse-grained and mean-field approaches by incorporating a position-dependent friction term on the ions, which Kesselheim et al. [Phys. Rev. Lett. 112, 018101 (2014)] identified as an essential ingredient to correctly reproduce the experimental data of Smeets et al. [Nano Lett. 6, 89 (2006)]. Our approach reduces the computational effort by orders of magnitude compared with all-atom simulations and serves as a promising starting point for modeling the entire translocation process of dsDNA. We achieve a consistent description of the system's electrokinetics by using explicitly parameterized ions, a friction term between the DNA beads and the ions, and a lattice-Boltzmann model for the solvent.

  5. A study of deoxyribonucleotide metabolism and its relation to DNA synthesis. Supercomputer simulation and model-system analysis.

    PubMed

    Heinmets, F; Leary, R H

    1991-06-01

    A model system (1) was established to analyze purine and pyrimidine metabolism. This system has been expanded to include macrosimulation of DNA synthesis and the study of its regulation by terminal deoxynucleoside triphosphates (dNTPs) via a complex set of interactions. Computer experiments reveal that our model exhibits adequate and reasonable sensitivity in terms of dNTP pool levels and rates of DNA synthesis when inputs to the system are varied. These simulation experiments reveal that in order to achieve maximum DNA synthesis (in terms of purine metabolism), a proper balance is required in guanine and adenine input into this metabolic system. Excessive inputs will become inhibitory to DNA synthesis. In addition, studies are carried out on rates of DNA synthesis when various parameters are changed quantitatively. The current system is formulated by 110 differential equations.

  6. Nanoscale Bioengineering Solutions for Space Exploration the Nanopore Sequencer

    NASA Technical Reports Server (NTRS)

    Ioana, Cozmuta; Viktor, Stoic

    2005-01-01

    Characterization of biological systems at the molecular level and extraction of essential information for nano-engineering design to guide the nano-fabrication of solid-state sensors and molecular identification devices is a computational challenge. The alpha hemolysin protein ion channel is used as a model system for structural analysis of nucleic acids like DNA. Applied voltage draws a DNA strand and surrounding ionic solution through the biological nanopore. The subunits in the DNA strand block ion flow by differing amounts. Atomistic scale simulations are employed using NASA supercomputers to study DNA translocation, with the aim to enhance single DNA subunit identification. Compared to protein channels, solid-state nanopores offer a better temporal control of the translocation of DNA and the possibility to easily tune its chemistry to increase the signal resolution. Potential applications for NASA missions, besides real-time genome sequencing include astronaut health, life detection and decoding of various genomes. http://phenomrph.arc.nasa.gov/index.php

  7. Programming Self-Assembly of DNA Origami Honeycomb Two-Dimensional Lattices and Plasmonic Metamaterials.

    PubMed

    Wang, Pengfei; Gaitanaros, Stavros; Lee, Seungwoo; Bathe, Mark; Shih, William M; Ke, Yonggang

    2016-06-22

    Scaffolded DNA origami has proven to be a versatile method for generating functional nanostructures with prescribed sub-100 nm shapes. Programming DNA-origami tiles to form large-scale 2D lattices that span hundreds of nanometers to the micrometer scale could provide an enabling platform for diverse applications ranging from metamaterials to surface-based biophysical assays. Toward this end, here we design a family of hexagonal DNA-origami tiles using computer-aided design and demonstrate successful self-assembly of micrometer-scale 2D honeycomb lattices and tubes by controlling their geometric and mechanical properties including their interconnecting strands. Our results offer insight into programmed self-assembly of low-defect supra-molecular DNA-origami 2D lattices and tubes. In addition, we demonstrate that these DNA-origami hexagon tiles and honeycomb lattices are versatile platforms for assembling optical metamaterials via programmable spatial arrangement of gold nanoparticles (AuNPs) into cluster and superlattice geometries.

  8. Antibody-controlled actuation of DNA-based molecular circuits.

    PubMed

    Engelen, Wouter; Meijer, Lenny H H; Somers, Bram; de Greef, Tom F A; Merkx, Maarten

    2017-02-17

    DNA-based molecular circuits allow autonomous signal processing, but their actuation has relied mostly on RNA/DNA-based inputs, limiting their application in synthetic biology, biomedicine and molecular diagnostics. Here we introduce a generic method to translate the presence of an antibody into a unique DNA strand, enabling the use of antibodies as specific inputs for DNA-based molecular computing. Our approach, antibody-templated strand exchange (ATSE), uses the characteristic bivalent architecture of antibodies to promote DNA-strand exchange reactions both thermodynamically and kinetically. Detailed characterization of the ATSE reaction allowed the establishment of a comprehensive model that describes the kinetics and thermodynamics of ATSE as a function of toehold length, antibody-epitope affinity and concentration. ATSE enables the introduction of complex signal processing in antibody-based diagnostics, as demonstrated here by constructing molecular circuits for multiplex antibody detection, integration of multiple antibody inputs using logic gates and actuation of enzymes and DNAzymes for signal amplification.

  9. Antibody-controlled actuation of DNA-based molecular circuits

    NASA Astrophysics Data System (ADS)

    Engelen, Wouter; Meijer, Lenny H. H.; Somers, Bram; de Greef, Tom F. A.; Merkx, Maarten

    2017-02-01

    DNA-based molecular circuits allow autonomous signal processing, but their actuation has relied mostly on RNA/DNA-based inputs, limiting their application in synthetic biology, biomedicine and molecular diagnostics. Here we introduce a generic method to translate the presence of an antibody into a unique DNA strand, enabling the use of antibodies as specific inputs for DNA-based molecular computing. Our approach, antibody-templated strand exchange (ATSE), uses the characteristic bivalent architecture of antibodies to promote DNA-strand exchange reactions both thermodynamically and kinetically. Detailed characterization of the ATSE reaction allowed the establishment of a comprehensive model that describes the kinetics and thermodynamics of ATSE as a function of toehold length, antibody-epitope affinity and concentration. ATSE enables the introduction of complex signal processing in antibody-based diagnostics, as demonstrated here by constructing molecular circuits for multiplex antibody detection, integration of multiple antibody inputs using logic gates and actuation of enzymes and DNAzymes for signal amplification.

  10. Computer-assisted design for scaling up systems based on DNA reaction networks.

    PubMed

    Aubert, Nathanaël; Mosca, Clément; Fujii, Teruo; Hagiya, Masami; Rondelez, Yannick

    2014-04-06

    In the past few years, there have been many exciting advances in the field of molecular programming, reaching a point where implementation of non-trivial systems, such as neural networks or switchable bistable networks, is a reality. Such systems require nonlinearity, be it through signal amplification, digitalization or the generation of autonomous dynamics such as oscillations. The biochemistry of DNA systems provides such mechanisms, but assembling them in a constructive manner is still a difficult and sometimes counterintuitive process. Moreover, realistic prediction of the actual evolution of concentrations over time requires a number of side reactions, such as leaks, cross-talks or competitive interactions, to be taken into account. In this case, the design of a system targeting a given function takes much trial and error before the correct architecture can be found. To speed up this process, we have created DNA Artificial Circuits Computer-Assisted Design (DACCAD), a computer-assisted design software that supports the construction of systems for the DNA toolbox. DACCAD is ultimately aimed to design actual in vitro implementations, which is made possible by building on the experimental knowledge available on the DNA toolbox. We illustrate its effectiveness by designing various systems, from Montagne et al.'s Oligator or Padirac et al.'s bistable system to new and complex networks, including a two-bit counter or a frequency divider as well as an example of very large system encoding the game Mastermind. In the process, we highlight a variety of behaviours, such as enzymatic saturation and load effect, which would be hard to handle or even predict with a simpler model. We also show that those mechanisms, while generally seen as detrimental, can be used in a positive way, as functional part of a design. Additionally, the number of parameters included in these simulations can be large, especially in the case of complex systems. For this reason, we included the possibility to use CMA-ES, a state-of-the-art optimization algorithm that will automatically evolve parameters chosen by the user to try to match a specified behaviour. Finally, because all possible functionality cannot be captured by a single software, DACCAD includes the possibility to export a system in the synthetic biology markup language, a widely used language for describing biological reaction systems. DACCAD can be downloaded online at http://www.yannick-rondelez.com/downloads/.

  11. XLS (c9orf142) is a new component of mammalian DNA double-stranded break repair.

    PubMed

    Craxton, A; Somers, J; Munnur, D; Jukes-Jones, R; Cain, K; Malewicz, M

    2015-06-01

    Repair of double-stranded DNA breaks (DSBs) in mammalian cells primarily occurs by the non-homologous end-joining (NHEJ) pathway, which requires seven core proteins (Ku70/Ku86, DNA-PKcs (DNA-dependent protein kinase catalytic subunit), Artemis, XRCC4-like factor (XLF), XRCC4 and DNA ligase IV). Here we show using combined affinity purification and mass spectrometry that DNA-PKcs co-purifies with all known core NHEJ factors. Furthermore, we have identified a novel evolutionarily conserved protein associated with DNA-PKcs: c9orf142. Computer-based modelling of c9orf142 predicted a structure very similar to XRCC4, hence we have named c9orf142 XLS (XRCC4-like small protein). Depletion of c9orf142/XLS in cells impaired DSB repair, consistent with a defect in NHEJ. Furthermore, c9orf142/XLS interacted with other core NHEJ factors. These results demonstrate the existence of a new component of the NHEJ DNA repair pathway in mammalian cells.

  12. Computer-aided design of DNA origami structures.

    PubMed

    Selnihhin, Denis; Andersen, Ebbe Sloth

    2015-01-01

    The DNA origami method enables the creation of complex nanoscale objects that can be used to organize molecular components and to function as reconfigurable mechanical devices. Of relevance to synthetic biology, DNA origami structures can be delivered to cells where they can perform complicated sense-and-act tasks, and can be used as scaffolds to organize enzymes for enhanced synthesis. The design of DNA origami structures is a complicated matter and is most efficiently done using dedicated software packages. This chapter describes a procedure for designing DNA origami structures using a combination of state-of-the-art software tools. First, we introduce the basic method for calculating crossover positions between DNA helices and the standard crossover patterns for flat, square, and honeycomb DNA origami lattices. Second, we provide a step-by-step tutorial for the design of a simple DNA origami biosensor device, from schematic idea to blueprint creation and to 3D modeling and animation, and explain how careful modeling can facilitate later experimentation in the laboratory.
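
    The idea behind calculating crossover positions can be sketched with simple helix geometry. The numbers below are assumptions based on widely used origami design conventions rather than on this chapter: 21 bp per 2 helical turns with neighbours 120 degrees apart for the honeycomb lattice, and 32 bp per 3 turns with neighbours 90 degrees apart for the square lattice. The sketch ignores strand polarity, antiparallel helix orientation, and the distinction between scaffold and staple crossovers, all of which real design tools track.

```python
from fractions import Fraction


def crossover_positions(bp_per_period, turns_per_period, neighbor_angles, length):
    """Map each neighbour direction (degrees) to base indices facing it.

    Uses exact fractions so that angle comparisons are not affected by
    floating-point rounding.
    """
    twist = Fraction(360 * turns_per_period, bp_per_period)  # degrees per bp
    table = {angle: [] for angle in neighbor_angles}
    for i in range(length):
        angle = (twist * i) % 360
        if angle in table:
            table[angle].append(i)
    return table


# Honeycomb lattice: three neighbours, one candidate crossover every 7 bp.
print(crossover_positions(21, 2, (0, 120, 240), 43))

# Square lattice: four neighbours, one candidate crossover every 8 bp.
print(crossover_positions(32, 3, (0, 90, 180, 270), 33))
```

    In both cases the candidate positions repeat with the lattice period (21 bp or 32 bp), which is why origami blueprints show a regular crossover pattern along each helix.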

  13. Context influences on TALE–DNA binding revealed by quantitative profiling

    PubMed Central

    Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

    2015-01-01

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805

  14. Context influences on TALE-DNA binding revealed by quantitative profiling.

    PubMed

    Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

    2015-06-11

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

  15. ProbeDesigner: for the design of probesets for branched DNA (bDNA) signal amplification assays.

    PubMed

    Bushnell, S; Budde, J; Catino, T; Cole, J; Derti, A; Kelso, R; Collins, M L; Molino, G; Sheridan, P; Monahan, J; Urdea, M

    1999-05-01

    The sensitivity and specificity of branched DNA (bDNA) assays are derived in part through the judicious design of the capture and label extender probes. To minimize non-specific hybridization (NSH) events, which elevate assay background, candidate probes must be computer screened for complementarity with generic sequences present in the assay. We present a software application which allows for rapid and flexible design of bDNA probesets for novel targets. It includes an algorithm for estimating the magnitude of NSH contribution to background, a mechanism for removing probes with elevated contributions, a methodology for the simultaneous design of probesets for multiple targets, and a graphical user interface which guides the user through the design steps. The program is available as a commercial package through the Pharmaceutical Drug Discovery program at Chiron Diagnostics.

  16. Genetic Constructor: An Online DNA Design Platform.

    PubMed

    Bates, Maxwell; Lachoff, Joe; Meech, Duncan; Zulkower, Valentin; Moisy, Anaïs; Luo, Yisha; Tekotte, Hille; Franziska Scheitz, Cornelia Johanna; Khilari, Rupal; Mazzoldi, Florencio; Chandran, Deepak; Groban, Eli

    2017-12-15

    Genetic Constructor is a cloud Computer Aided Design (CAD) application developed to support synthetic biologists from design intent through DNA fabrication and experiment iteration. The platform allows users to design, manage, and navigate complex DNA constructs and libraries, using a new visual language that focuses on functional parts abstracted from sequence. Features like combinatorial libraries and automated primer design allow the user to separate design from construction by focusing on functional intent, and design constraints aid iterative refinement of designs. A plugin architecture enables contributions from scientists and coders to leverage existing powerful software and connect to DNA foundries. The software is easily accessible and platform agnostic, free for academics, and available in an open-source community edition. Genetic Constructor seeks to democratize DNA design, manufacture, and access to tools and services from the synthetic biology community.

  17. Synthetic Molecular Machines for Active Self-Assembly: Prototype Algorithms, Designs, and Experimental Study

    NASA Astrophysics Data System (ADS)

    Dabby, Nadine L.

    Computer science and electrical engineering have been the great success story of the twentieth century. The neat modularity and mapping of a language onto circuits has led to robots on Mars, desktop computers and smartphones. But these devices are not yet able to do some of the things that life takes for granted: repair a scratch, reproduce, regenerate, or grow exponentially fast--all while remaining functional. This thesis explores and develops algorithms, molecular implementations, and theoretical proofs in the context of "active self-assembly" of molecular systems. The long-term vision of active self-assembly is the theoretical and physical implementation of materials that are composed of reconfigurable units with the programmability and adaptability of biology's numerous molecular machines. En route to this goal, we must first find a way to overcome the memory limitations of molecular systems, and to discover the limits of complexity that can be achieved with individual molecules. One of the main thrusts in molecular programming is to use computer science as a tool for figuring out what can be achieved. While molecular systems that are Turing-complete have been demonstrated [Winfree, 1996], these systems still cannot achieve some of the feats biology has achieved. One might think that because a system is Turing-complete, capable of computing "anything," that it can do any arbitrary task. But while it can simulate any digital computational problem, there are many behaviors that are not "computations" in a classical sense, and cannot be directly implemented. Examples include exponential growth and molecular motion relative to a surface. Passive self-assembly systems cannot implement these behaviors because (a) molecular motion relative to a surface requires a source of fuel that is external to the system, and (b) passive systems are too slow to assemble exponentially-fast-growing structures. We call these behaviors "energetically incomplete" programmable behaviors. This class of behaviors includes any behavior where a passive physical system simply does not have enough physical energy to perform the specified tasks in the requisite amount of time. As we will demonstrate and prove, a sufficiently expressive implementation of an "active" molecular self-assembly approach can achieve these behaviors. Using an external source of fuel solves part of the problem, so the system is not "energetically incomplete." But the programmable system also needs to have sufficient expressive power to achieve the specified behaviors. Perhaps surprisingly, some of these systems do not even require Turing completeness to be sufficiently expressive. Building on a large variety of work by other scientists in the fields of DNA nanotechnology, chemistry and reconfigurable robotics, this thesis introduces several research contributions in the context of active self-assembly. We show that simple primitives such as insertion and deletion are able to generate complex and interesting results such as the growth of a linear polymer in logarithmic time and the ability of a linear polymer to treadmill. To this end we developed a formal model for active self-assembly that is directly implementable with DNA molecules. We show that this model is computationally equivalent to a machine capable of producing strings that are stronger than regular languages and, at most, as strong as context-free grammars. This is a great advance in the theory of active self-assembly as prior models were either entirely theoretical or only implementable in the context of macro-scale robotics. We developed a chain reaction method for the autonomous exponential growth of a linear DNA polymer. Our method is based on the insertion of molecules into the assembly, which generates two new insertion sites for every initial one employed. The building of a line in logarithmic time is a first step toward building a shape in logarithmic time. We demonstrate the first construction of a synthetic linear polymer that grows exponentially fast via insertion. We show that monomer molecules are converted into the polymer in logarithmic time via spectrofluorimetry and gel electrophoresis experiments. We also demonstrate the division of these polymers via the addition of a single DNA complex that competes with the insertion mechanism. This shows the growth of a population of polymers in logarithmic time. We characterize the DNA insertion mechanism that we utilize in Chapter 4. We experimentally demonstrate that we can control the kinetics of this reaction over at least seven orders of magnitude, by programming the sequences of DNA that initiate the reaction. In addition, we review co-authored work on programming molecular robots using prescriptive landscapes of DNA origami; this was the first microscopic demonstration of programming a molecular robot to walk on a 2-dimensional surface. We developed a snapshot method for imaging these random walking molecular robots and a CAPTCHA-like analysis method for difficult-to-interpret imaging data.
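
    The claim that insertion yields a polymer of a given length in logarithmic time can be checked with an idealized counting model. The sketch assumes synchronous rounds in which every open insertion site accepts exactly one monomer and each insertion leaves two new sites, as stated in the abstract; real kinetics are stochastic, and the function names are hypothetical.

```python
def rounds_by_insertion(target_length, initial_sites=1):
    """Rounds needed when every insertion creates two new insertion sites."""
    length, sites, rounds = 0, initial_sites, 0
    while length < target_length:
        length += sites  # every open site accepts one monomer this round
        sites *= 2       # each insertion leaves two new sites behind
        rounds += 1
    return rounds


def rounds_by_end_addition(target_length):
    """Rounds needed when monomers can only be added at one end."""
    return target_length


for n in (10, 1000, 1_000_000):
    print(n, rounds_by_insertion(n), rounds_by_end_addition(n))
```

    After r rounds the idealized polymer holds 2^r - 1 monomers, so reaching length n takes about log2(n) rounds, compared with n rounds for growth at a single end.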

  18. Molecular complexes of some anthraquinone anti-cancer drugs: experimental and computational study

    NASA Astrophysics Data System (ADS)

    El-Gogary, Tarek M.

    2003-03-01

    It is known that anti-cancer drugs target DNA in the cell. The mechanism of interaction of anti-cancer drugs with DNA is not fully understood. It is thought that the forces of interaction have some contribution from charge-transfer (CT) binding. The ability of some anthraquinone (AQ) anti-cancer drugs to form CT complexes with well-known electron donor molecules was investigated by NMR. NMR spectroscopy indicated the formation of CT complexes between 1,4-bis{[2-(dimethylamino)ethyl]amino}-5,8-dihydroxyanthracene-9,10-dione (AQ4) and its des-hydroxylated equivalent 1,4-bis{[2-(dimethylamino)ethyl]amino}anthracene-9,10-dione (AQ4H) as electron acceptors and pyrene (PY) and hexamethylbenzene (HMB) as electron donors. Association constants of the formed CT complexes were determined from the NMR data. AQ4 showed weaker electron-accepting power than AQ4H, which could be easily explained on the basis of the electron-donating nature of the two hydroxyl groups. AQ4 and AQ4H have higher stability constants with PY than with HMB. This reflects the weaker interaction of the AQs with the latter, which is a direct effect of the six bulky methyl groups. Electronic absorption spectroscopy of the studied system was performed in chloroform and showed the absence of new absorption bands. The extent of interaction between AQs and donors has been computed using molecular mechanics and quantum mechanics. The computed values were compared with the experimental results of association constants.

  19. Study of Reversible Logic Synthesis with Application in SOC: A Review

    NASA Astrophysics Data System (ADS)

    Sharma, Chinmay; Pahuja, Hitesh; Dadhwal, Mandeep; Singh, Balwinder

    2017-08-01

    The prime concern in today’s SOC designs is power dissipation, which increases with technology scaling. Reversible logic has very high potential for reducing power dissipation in these designs. It finds application in recent research fields such as DNA computing, quantum computing, ultra-low power CMOS design and nanotechnology. Reversible circuits can be easily designed using conventional CMOS technology at the cost of a garbage output, which maintains the reversibility. The purpose of this paper is to provide an overview of the developments that have occurred to date in this concept and how the new reversible logic gates are used to design logic functions.
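
The role of garbage outputs can be illustrated with the Toffoli gate, a standard reversible gate. This is only a sketch and is not taken from the reviewed paper; the helper name `reversible_and` is hypothetical. The gate maps (a, b, c) to (a, b, c XOR (a AND b)), so with c fixed at 0 the third output is a AND b, while the two pass-through outputs are the garbage that keeps the mapping one-to-one.

```python
from itertools import product
from typing import Tuple

def toffoli(a: int, b: int, c: int) -> Tuple[int, int, int]:
    """Toffoli gate: flips c only when both a and b are 1."""
    return a, b, c ^ (a & b)

# A gate is reversible when distinct inputs always give distinct outputs.
inputs = list(product((0, 1), repeat=3))
outputs = [toffoli(*bits) for bits in inputs]
is_reversible = len(set(outputs)) == len(inputs)

def reversible_and(a: int, b: int) -> Tuple[int, Tuple[int, int]]:
    """Compute a AND b reversibly; returns (result, garbage outputs)."""
    garbage_a, garbage_b, result = toffoli(a, b, 0)
    return result, (garbage_a, garbage_b)

print(is_reversible)
for a, b in product((0, 1), repeat=2):
    print(a, b, reversible_and(a, b))
```

Applying the gate twice returns the original inputs, which is the property an ordinary two-input AND gate lacks.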

  20. Highly Parallel Computing Architectures by using Arrays of Quantum-dot Cellular Automata (QCA): Opportunities, Challenges, and Recent Results

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Toomarian, Benny N.

    2000-01-01

    There has been significant improvement in the performance of VLSI devices, in terms of size, power consumption, and speed, in recent years, and this trend may continue for the near future. However, it is well known that there are major obstacles, i.e., the physical limits of feature size reduction and the ever-increasing cost of foundries, that would prevent the long-term continuation of this trend. This has motivated the exploration of some fundamentally new technologies that are not dependent on the conventional feature size approach. Such technologies are expected to enable scaling to continue to the ultimate level, i.e., molecular and atomistic size. Quantum computing, quantum dot-based computing, DNA-based computing, biologically inspired computing, etc., are examples of such new technologies. In particular, quantum dot-based computing using Quantum-dot Cellular Automata (QCA) has recently been intensely investigated as a promising new technology capable of offering significant improvement over conventional VLSI in terms of reduction of feature size (and hence increase in integration level), reduction of power consumption, and increase of switching speed. Quantum dot-based computing and memory in general, and QCA specifically, are intriguing to NASA due to their high packing density (10^11 to 10^12 per square cm), low power consumption (no transfer of current), and potentially higher radiation tolerance. Under the Revolutionary Computing Technology (RTC) Program at the NASA/JPL Center for Integrated Space Microelectronics (CISM), we have been investigating the potential applications of QCA for the space program. To this end, exploiting the intrinsic features of QCA, we have designed novel QCA-based circuits for coplanar (i.e., single-layer) and compact implementation of a class of data permutation matrices, a class of interconnection networks, and a bit-serial processor. Building upon these circuits, we have developed novel algorithms and QCA-based architectures for highly parallel and systolic computation of signal/image processing applications, such as FFT, Wavelet, and Walsh-Hadamard Transforms.
