j5 DNA assembly design automation.
Hillson, Nathan J
2014-01-01
Modern standardized methodologies, described in detail in the previous chapters of this book, have enabled the software-automated design of optimized DNA construction protocols. This chapter describes how to design (combinatorial) scar-less DNA assembly protocols using the web-based software j5. j5 assists biomedical and biotechnological researchers construct DNA by automating the design of optimized protocols for flanking homology sequence as well as type IIS endonuclease-mediated DNA assembly methodologies. Unlike any other software tool available today, j5 designs scar-less combinatorial DNA assembly protocols, performs a cost-benefit analysis to identify which portions of an assembly process would be less expensive to outsource to a DNA synthesis service provider, and designs hierarchical DNA assembly strategies to mitigate anticipated poor assembly junction sequence performance. Software integrated with j5 add significant value to the j5 design process through graphical user-interface enhancement and downstream liquid-handling robotic laboratory automation.
Molecular mechanisms of floral organ specification by MADS domain proteins.
Yan, Wenhao; Chen, Dijun; Kaufmann, Kerstin
2016-02-01
Flower development is a model system to understand organ specification in plants. The identities of different types of floral organs are specified by homeotic MADS transcription factors that interact in a combinatorial fashion. Systematic identification of DNA-binding sites and target genes of these key regulators show that they have shared and unique sets of target genes. DNA binding by MADS proteins is not based on 'simple' recognition of a specific DNA sequence, but depends on DNA structure and combinatorial interactions. Homeotic MADS proteins regulate gene expression via alternative mechanisms, one of which may be to modulate chromatin structure and accessibility in their target gene promoters. Copyright © 2015 Elsevier Ltd. All rights reserved.
2018-01-01
Many implementations of pooled screens in mammalian cells rely on linking an element of interest to a barcode, with the latter subsequently quantitated by next generation sequencing. However, substantial uncoupling between these paired elements during lentiviral production has been reported, especially as the distance between elements increases. We detail that PCR amplification is another major source of uncoupling, and becomes more pronounced with increased amounts of DNA template molecules and PCR cycles. To lessen uncoupling in systems that use paired elements for detection, we recommend minimizing the distance between elements, using low and equal template DNA inputs for plasmid and genomic DNA during PCR, and minimizing the number of PCR cycles. We also present a vector design for conducting combinatorial CRISPR screens that enables accurate barcode-based detection with a single short sequencing read and minimal uncoupling. PMID:29799876
DNA-Encoded Solid-Phase Synthesis: Encoding Language Design and Complex Oligomer Library Synthesis.
MacConnell, Andrew B; McEnaney, Patrick J; Cavett, Valerie J; Paegel, Brian M
2015-09-14
The promise of exploiting combinatorial synthesis for small molecule discovery remains unfulfilled due primarily to the "structure elucidation problem": the back-end mass spectrometric analysis that significantly restricts one-bead-one-compound (OBOC) library complexity. The very molecular features that confer binding potency and specificity, such as stereochemistry, regiochemistry, and scaffold rigidity, are conspicuously absent from most libraries because isomerism introduces mass redundancy and diverse scaffolds yield uninterpretable MS fragmentation. Here we present DNA-encoded solid-phase synthesis (DESPS), comprising parallel compound synthesis in organic solvent and aqueous enzymatic ligation of unprotected encoding dsDNA oligonucleotides. Computational encoding language design yielded 148 thermodynamically optimized sequences with Hamming string distance ≥ 3 and total read length <100 bases for facile sequencing. Ligation is efficient (70% yield), specific, and directional over 6 encoding positions. A series of isomers served as a testbed for DESPS's utility in split-and-pool diversification. Single-bead quantitative PCR detected 9 × 10(4) molecules/bead and sequencing allowed for elucidation of each compound's synthetic history. We applied DESPS to the combinatorial synthesis of a 75,645-member OBOC library containing scaffold, stereochemical and regiochemical diversity using mixed-scale resin (160-μm quality control beads and 10-μm screening beads). Tandem DNA sequencing/MALDI-TOF MS analysis of 19 quality control beads showed excellent agreement (<1 ppt) between DNA sequence-predicted mass and the observed mass. DESPS synergistically unites the advantages of solid-phase synthesis and DNA encoding, enabling single-bead structural elucidation of complex compounds and synthesis using reactions normally considered incompatible with unprotected DNA. The widespread availability of inexpensive oligonucleotide synthesis, enzymes, DNA sequencing, and PCR make implementation of DESPS straightforward, and may prompt the chemistry community to revisit the synthesis of more complex and diverse libraries.
Su, Zhangli
2016-01-01
Combinatorial patterns of histone modifications are key indicators of different chromatin states. Most of the current approaches rely on the usage of antibodies to analyze combinatorial histone modifications. Here we detail an antibody-free method named MARCC (Matrix-Assisted Reader Chromatin Capture) to enrich combinatorial histone modifications. The combinatorial patterns are enriched on native nucleosomes extracted from cultured mammalian cells and prepared by micrococcal nuclease digestion. Such enrichment is achieved by recombinant chromatin-interacting protein modules, or so-called reader domains, which can bind in a combinatorial modification-dependent manner. The enriched chromatin can be quantified by western blotting or mass spectrometry for the co-existence of histone modifications, while the associated DNA content can be analyzed by qPCR or next-generation sequencing. Altogether, MARCC provides a reproducible, efficient and customizable solution to enrich and analyze combinatorial histone modifications. PMID:26131849
An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening
2017-01-01
DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing. PMID:28199790
An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening.
MacConnell, Andrew B; Price, Alexander K; Paegel, Brian M
2017-03-13
DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing.
Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele
2016-03-15
Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. futro@us.ibm.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
BioPartsBuilder: a synthetic biology tool for combinatorial assembly of biological parts.
Yang, Kun; Stracquadanio, Giovanni; Luo, Jingchuan; Boeke, Jef D; Bader, Joel S
2016-03-15
Combinatorial assembly of DNA elements is an efficient method for building large-scale synthetic pathways from standardized, reusable components. These methods are particularly useful because they enable assembly of multiple DNA fragments in one reaction, at the cost of requiring that each fragment satisfies design constraints. We developed BioPartsBuilder as a biologist-friendly web tool to design biological parts that are compatible with DNA combinatorial assembly methods, such as Golden Gate and related methods. It retrieves biological sequences, enforces compliance with assembly design standards and provides a fabrication plan for each fragment. BioPartsBuilder is accessible at http://public.biopartsbuilder.org and an Amazon Web Services image is available from the AWS Market Place (AMI ID: ami-508acf38). Source code is released under the MIT license, and available for download at https://github.com/baderzone/biopartsbuilder joel.bader@jhu.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis
2008-11-01
Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.
Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.
2013-01-01
For the vast majority of species – including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach denovo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundred millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
Combinatorial pooling enables selective sequencing of the barley gene space.
Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J
2013-04-01
For the vast majority of species - including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach denovo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundred millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
Improved Modeling of Side-Chain–Base Interactions and Plasticity in Protein–DNA Interface Design
Thyme, Summer B.; Baker, David; Bradley, Philip
2012-01-01
Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed “motifs”) was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein–DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent. PMID:22426128
Improved modeling of side-chain--base interactions and plasticity in protein--DNA interface design.
Thyme, Summer B; Baker, David; Bradley, Philip
2012-06-08
Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed "motifs") was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein-DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent. Published by Elsevier Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torella, JP; Lienert, F; Boehm, CR
2014-08-07
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies-for example, repeated terminator and insulator sequences-that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked withmore » UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.« less
Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.
2016-01-01
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
Abécassis, V; Pompon, D; Truan, G
2000-10-15
The design of a family shuffling strategy (CLERY: Combinatorial Libraries Enhanced by Recombination in Yeast) associating PCR-based and in vivo recombination and expression in yeast is described. This strategy was tested using human cytochrome P450 CYP1A1 and CYP1A2 as templates, which share 74% nucleotide sequence identity. Construction of highly shuffled libraries of mosaic structures and reduction of parental gene contamination were two major goals. Library characterization involved multiprobe hybridization on DNA macro-arrays. The statistical analysis of randomly selected clones revealed a high proportion of chimeric genes (86%) and a homogeneous representation of the parental contribution among the sequences (55.8 +/- 2.5% for parental sequence 1A2). A microtiter plate screening system was designed to achieve colorimetric detection of polycyclic hydrocarbon hydroxylation by transformed yeast cells. Full sequences of five randomly picked and five functionally selected clones were analyzed. Results confirmed the shuffling efficiency and allowed calculation of the average length of sequence exchange and mutation rates. The efficient and statistically representative generation of mosaic structures by this type of family shuffling in a yeast expression system constitutes a novel and promising tool for structure-function studies and tuning enzymatic activities of multicomponent eucaryote complexes involving non-soluble enzymes.
Dokarry, Melissa; Laurendon, Caroline; O'Maille, Paul E
2012-01-01
Structure-based combinatorial protein engineering (SCOPE) is a homology-independent recombination method to create multiple crossover gene libraries by assembling defined combinations of structural elements ranging from single mutations to domains of protein structure. SCOPE was originally inspired by DNA shuffling, which mimics recombination during meiosis, where mutations from parental genes are "shuffled" to create novel combinations in the resulting progeny. DNA shuffling utilizes sequence identity between parental genes to mediate template-switching events (the annealing and extension of one parental gene fragment on another) in PCR reassembly reactions to generate crossovers and hence recombination between parental genes. In light of the conservation of protein structure and degeneracy of sequence, SCOPE was developed to enable the "shuffling" of distantly related genes with no requirement for sequence identity. The central principle involves the use of oligonucleotides to encode for crossover regions to choreograph template-switching events during PCR assembly of gene fragments to create chimeric genes. This approach was initially developed to create libraries of hybrid DNA polymerases from distantly related parents, and later developed to create a combinatorial mutant library of sesquiterpene synthases to explore the catalytic landscapes underlying the functional divergence of related enzymes. This chapter presents a simplified protocol of SCOPE that can be integrated with different mutagenesis techniques and is suitable for automation by liquid-handling robots. Two examples are presented to illustrate the application of SCOPE to create gene libraries using plant sesquiterpene synthases as the model system. In the first example, we outline how to create an active-site library as a series of complex mixtures of diverse mutants. In the second example, we outline how to create a focused library as an array of individual clones to distil minimal combinations of functionally important mutations. Through these examples, the principles of the technique are illustrated and the suitability of automating various aspects of the procedure for given applications are discussed. Copyright © 2012 Elsevier Inc. All rights reserved.
Nonparametric Combinatorial Sequence Models
NASA Astrophysics Data System (ADS)
Wauthier, Fabian L.; Jordan, Michael I.; Jojic, Nebojsa
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This paper presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three sequence datasets which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution induced by the prior. By integrating out the posterior our method compares favorably to leading binding predictors.
The Evolution of DNA-Templated Synthesis as a Tool for Materials Discovery.
O'Reilly, Rachel K; Turberfield, Andrew J; Wilks, Thomas R
2017-10-17
Precise control over reactivity and molecular structure is a fundamental goal of the chemical sciences. Billions of years of evolution by natural selection have resulted in chemical systems capable of information storage, self-replication, catalysis, capture and production of light, and even cognition. In all these cases, control over molecular structure is required to achieve a particular function: without structural control, function may be impaired, unpredictable, or impossible. The search for molecules with a desired function is often achieved by synthesizing a combinatorial library, which contains many or all possible combinations of a set of chemical building blocks (BBs), and then screening this library to identify "successful" structures. The largest libraries made by conventional synthesis are currently of the order of 10 8 distinct molecules. To put this in context, there are 10 13 ways of arranging the 21 proteinogenic amino acids in chains up to 10 units long. Given that we know that a number of these compounds have potent biological activity, it would be highly desirable to be able to search them all to identify leads for new drug molecules. Large libraries of oligonucleotides can be synthesized combinatorially and translated into peptides using systems based on biological replication such as mRNA display, with selected molecules identified by DNA sequencing; but these methods are limited to BBs that are compatible with cellular machinery. In order to search the vast tracts of chemical space beyond nucleic acids and natural peptides, an alternative approach is required. DNA-templated synthesis (DTS) could enable us to meet this challenge. DTS controls chemical product formation by using the specificity of DNA hybridization to bring selected reactants into close proximity, and is capable of the programmed synthesis of many distinct products in the same reaction vessel. By making use of dynamic, programmable DNA processes, it is possible to engineer a system that can translate instructions coded as a sequence of DNA bases into a chemical structure-a process analogous to the action of the ribosome in living organisms but with the potential to create a much more chemically diverse set of products. It is also possible to ensure that each product molecule is tagged with its identifying DNA sequence. Compound libraries synthesized in this way can be exposed to selection against suitable targets, enriching successful molecules. The encoding DNA can then be amplified using the polymerase chain reaction and decoded by DNA sequencing. More importantly, the DNA instruction sequences can be mutated and reused during multiple rounds of amplification, translation, and selection. In other words, DTS could be used as the foundation for a system of synthetic molecular evolution, which could allow us to efficiently search a vast chemical space. This has huge potential to revolutionize materials discovery-imagine being able to evolve molecules for light harvesting, or catalysts for CO 2 fixation. The field of DTS has developed to the point where a wide variety of reactions can be performed on a DNA template. Complex architectures and autonomous "DNA robots" have been implemented for the controlled assembly of BBs, and these mechanisms have in turn enabled the one-pot synthesis of large combinatorial libraries. Indeed, DTS libraries are being exploited by pharmaceutical companies and have already found their way into drug lead discovery programs. This Account explores the processes involved in DTS and highlights the challenges that remain in creating a general system for molecular discovery by evolution.
The Evolution of DNA-Templated Synthesis as a Tool for Materials Discovery
2017-01-01
Conspectus Precise control over reactivity and molecular structure is a fundamental goal of the chemical sciences. Billions of years of evolution by natural selection have resulted in chemical systems capable of information storage, self-replication, catalysis, capture and production of light, and even cognition. In all these cases, control over molecular structure is required to achieve a particular function: without structural control, function may be impaired, unpredictable, or impossible. The search for molecules with a desired function is often achieved by synthesizing a combinatorial library, which contains many or all possible combinations of a set of chemical building blocks (BBs), and then screening this library to identify “successful” structures. The largest libraries made by conventional synthesis are currently of the order of 108 distinct molecules. To put this in context, there are 1013 ways of arranging the 21 proteinogenic amino acids in chains up to 10 units long. Given that we know that a number of these compounds have potent biological activity, it would be highly desirable to be able to search them all to identify leads for new drug molecules. Large libraries of oligonucleotides can be synthesized combinatorially and translated into peptides using systems based on biological replication such as mRNA display, with selected molecules identified by DNA sequencing; but these methods are limited to BBs that are compatible with cellular machinery. In order to search the vast tracts of chemical space beyond nucleic acids and natural peptides, an alternative approach is required. DNA-templated synthesis (DTS) could enable us to meet this challenge. DTS controls chemical product formation by using the specificity of DNA hybridization to bring selected reactants into close proximity, and is capable of the programmed synthesis of many distinct products in the same reaction vessel. By making use of dynamic, programmable DNA processes, it is possible to engineer a system that can translate instructions coded as a sequence of DNA bases into a chemical structure—a process analogous to the action of the ribosome in living organisms but with the potential to create a much more chemically diverse set of products. It is also possible to ensure that each product molecule is tagged with its identifying DNA sequence. Compound libraries synthesized in this way can be exposed to selection against suitable targets, enriching successful molecules. The encoding DNA can then be amplified using the polymerase chain reaction and decoded by DNA sequencing. More importantly, the DNA instruction sequences can be mutated and reused during multiple rounds of amplification, translation, and selection. In other words, DTS could be used as the foundation for a system of synthetic molecular evolution, which could allow us to efficiently search a vast chemical space. This has huge potential to revolutionize materials discovery—imagine being able to evolve molecules for light harvesting, or catalysts for CO2 fixation. The field of DTS has developed to the point where a wide variety of reactions can be performed on a DNA template. Complex architectures and autonomous “DNA robots” have been implemented for the controlled assembly of BBs, and these mechanisms have in turn enabled the one-pot synthesis of large combinatorial libraries. Indeed, DTS libraries are being exploited by pharmaceutical companies and have already found their way into drug lead discovery programs. This Account explores the processes involved in DTS and highlights the challenges that remain in creating a general system for molecular discovery by evolution. PMID:28915003
Axe: rapid, competitive sequence read demultiplexing using a trie.
Murray, Kevin D; Borevitz, Justin O
2018-06-01
We describe a rapid algorithm for demultiplexing DNA sequence reads with in-read indices. Axe selects the optimal index present in a sequence read, even in the presence of sequencing errors. The algorithm is able to handle combinatorial indexing, indices of differing length, and several mismatches per index sequence. Axe is implemented in C, and is used as a command-line program on Unix-like systems. Axe is available online at https://github.com/kdmurray91/axe, and is available in Debian/Ubuntu distributions of GNU/Linux as the package axe-demultiplexer. Kevin Murray axe@kdmurray.id.au. Supplementary data are available at Bioinformatics online.
Complexity: an internet resource for analysis of DNA sequence complexity
Orlov, Y. L.; Potapov, V. N.
2004-01-01
The search for DNA regions with low complexity is one of the pivotal tasks of modern structural analysis of complete genomes. The low complexity may be preconditioned by strong inequality in nucleotide content (biased composition), by tandem or dispersed repeats or by palindrome-hairpin structures, as well as by a combination of all these factors. Several numerical measures of textual complexity, including combinatorial and linguistic ones, together with complexity estimation using a modified Lempel–Ziv algorithm, have been implemented in a software tool called ‘Complexity’ (http://wwwmgs.bionet.nsc.ru/mgs/programs/low_complexity/). The software enables a user to search for low-complexity regions in long sequences, e.g. complete bacterial genomes or eukaryotic chromosomes. In addition, it estimates the complexity of groups of aligned sequences. PMID:15215465
Optimized Reaction Conditions for Amide Bond Formation in DNA-Encoded Combinatorial Libraries.
Li, Yizhou; Gabriele, Elena; Samain, Florent; Favalli, Nicholas; Sladojevich, Filippo; Scheuermann, Jörg; Neri, Dario
2016-08-08
DNA-encoded combinatorial libraries are increasingly being used as tools for the discovery of small organic binding molecules to proteins of biological or pharmaceutical interest. In the majority of cases, synthetic procedures for the formation of DNA-encoded combinatorial libraries incorporate at least one step of amide bond formation between amino-modified DNA and a carboxylic acid. We investigated reaction conditions and established a methodology by using 1-ethyl-3-(3-(dimethylamino)propyl)carbodiimide, 1-hydroxy-7-azabenzotriazole and N,N'-diisopropylethylamine (EDC/HOAt/DIPEA) in combination, which provided conversions greater than 75% for 423/543 (78%) of the carboxylic acids tested. These reaction conditions were efficient with a variety of primary and secondary amines, as well as with various types of amino-modified oligonucleotides. The reaction conditions, which also worked efficiently over a broad range of DNA concentrations and reaction scales, should facilitate the synthesis of novel DNA-encoded combinatorial libraries.
Parvari, R; Avivi, A; Lentner, F; Ziv, E; Tel-Or, S; Burstein, Y; Schechter, I
1988-03-01
cDNA clones encoding the variable and constant regions of chicken immunoglobulin (Ig) gamma-chains were obtained from spleen cDNA libraries. Southern blots of kidney DNA show that the variable region sequences of eight cDNA clones reveal the same set of bands corresponding to approximately 30 cross-hybridizing VH genes of one subgroup. Since the VH clones were randomly selected, it is likely that the bulk of chicken H-chains are encoded by a single VH subgroup. Nucleotide sequence determinations of two cDNA clones reveal VH, D, JH and the constant region. The VH segments are closely related to each other (83% homology) as expected for VH or the same subgroup. The JHs are 15 residues long and differ by one amino acid. The Ds differ markedly in sequence (20% homology) and size (10 and 20 residues). These findings strongly indicate multiple (at least two) D genes which by a combinatorial joining mechanism diversify the H-chains, a mechanism which is not operative in the chicken L-chain locus. The most notable among the chicken Igs is the so-called 7S IgG because its H-chain differs in many important aspects from any mammalian IgG. The sequence of the C gamma cDNA reported here resolves this issue. The chicken C gamma is 426 residues long with four CH domains (unlike mammalian C gamma which has three CH domains) and it shows 25% homology to the chicken C mu. The chicken C gamma is most related to the mammalian C epsilon in length, the presence of four CH domains and the distribution of cysteines in the CH1 and CH2 domains. We propose that the unique chicken C gamma is the ancestor of the mammalian C epsilon and C gamma subclasses, and discuss the evolution of the H-chain locus from that of chicken with presumably three genes (mu, gamma, alpha) to the mammalian loci with 8-10 H-chain genes.
Müller, Patrick; Rößler, Jens; Schwarz-Finsterle, Jutta; Schmitt, Eberhard; Hausmann, Michael
2016-07-01
Recently, advantages concerning targeting specificity of PCR constructed oligonucleotide FISH probes in contrast to established FISH probes, e.g. BAC clones, have been demonstrated. These techniques, however, are still using labelling protocols with DNA denaturing steps applying harsh heat treatment with or without further denaturing chemical agents. COMBO-FISH (COMBinatorial Oligonucleotide FISH) allows the design of specific oligonucleotide probe combinations in silico. Thus, being independent from primer libraries or PCR laboratory conditions, the probe sequences extracted by computer sequence data base search can also be synthesized as single stranded PNA-probes (Peptide Nucleic Acid probes) or TINA-DNA (Twisted Intercalating Nucleic Acids). Gene targets can be specifically labelled with at least about 20 probes obtaining visibly background free specimens. By using appropriately designed triplex forming oligonucleotides, the denaturing procedures can completely be omitted. These results reveal a significant step towards oligonucleotide-FISH maintaining the 3d-nanostructure and even the viability of the cell target. The method is demonstrated with the detection of Her2/neu and GRB7 genes, which are indicators in breast cancer diagnosis and therapy. Copyright © 2016. Published by Elsevier Inc.
Effective DNA Inhibitors of Cathepsin G by In Vitro Selection
Gatto, Barbara; Vianini, Elena; Lucatello, Lorena; Sissi, Claudia; Moltrasio, Danilo; Pescador, Rodolfo; Porta, Roberto; Palumbo, Manlio
2008-01-01
Cathepsin G (CatG) is a chymotrypsin-like protease released upon degranulation of neutrophils. In several inflammatory and ischaemic diseases the impaired balance between CatG and its physiological inhibitors leads to tissue destruction and platelet aggregation. Inhibitors of CatG are suitable for the treatment of inflammatory diseases and procoagulant conditions. DNA released upon the death of neutrophils at injury sites binds CatG. Moreover, short DNA fragments are more inhibitory than genomic DNA. Defibrotide, a single stranded polydeoxyribonucleotide with antithrombotic effect is also a potent CatG inhibitor. Given the above experimental evidences we employed a selection protocol to assess whether DNA inhibition of CatG may be ascribed to specific sequences present in defibrotide DNA. A Selex protocol was applied to identify the single-stranded DNA sequences exhibiting the highest affinity for CatG, the diversity of a combinatorial pool of oligodeoxyribonucleotides being a good representation of the complexity found in defibrotide. Biophysical and biochemical studies confirmed that the selected sequences bind tightly to the target enzyme and also efficiently inhibit its catalytic activity. Sequence analysis carried out to unveil a motif responsible for CatG recognition showed a recurrence of alternating TG repeats in the selected CatG binders, adopting an extended conformation that grants maximal interaction with the highly charged protein surface. This unprecedented finding is validated by our results showing high affinity and inhibition of CatG by specific DNA sequences of variable length designed to maximally reduce pairing/folding interactions. PMID:19325843
Self-Assembled Combinatorial Nanoarrays for Multiplex Biosensing
2010-02-05
origami tiles containing two lines of apt-A (green dots) and two lines of apt-B thrombin (blue dots). The neighboring lines of apt-A and apt-B are...included helping to verify the positions of the lines in the AFM images, b, e, AFM height images of the DNA origami tiles (lOnM) with 60 nM thrombin... origami method we designed a rectangular- shaped DNA tile (Fig. 10a) that had a dimension of 60 x 90 nm. Stem-loops with apt-A and apt-B sequences were
A Versatile Microfluidic Device for Automating Synthetic Biology.
Shih, Steve C C; Goyal, Garima; Kim, Peter W; Koutsoubelis, Nicolas; Keasling, Jay D; Adams, Paul D; Hillson, Nathan J; Singh, Anup K
2015-10-16
New microbes are being engineered that contain the genetic circuitry, metabolic pathways, and other cellular functions required for a wide range of applications such as producing biofuels, biobased chemicals, and pharmaceuticals. Although currently available tools are useful in improving the synthetic biology process, further improvements in physical automation would help to lower the barrier of entry into this field. We present an innovative microfluidic platform for assembling DNA fragments with 10× lower volumes (compared to that of current microfluidic platforms) and with integrated region-specific temperature control and on-chip transformation. Integration of these steps minimizes the loss of reagents and products compared to that with conventional methods, which require multiple pipetting steps. For assembling DNA fragments, we implemented three commonly used DNA assembly protocols on our microfluidic device: Golden Gate assembly, Gibson assembly, and yeast assembly (i.e., TAR cloning, DNA Assembler). We demonstrate the utility of these methods by assembling two combinatorial libraries of 16 plasmids each. Each DNA plasmid is transformed into Escherichia coli or Saccharomyces cerevisiae using on-chip electroporation and further sequenced to verify the assembly. We anticipate that this platform will enable new research that can integrate this automated microfluidic platform to generate large combinatorial libraries of plasmids and will help to expedite the overall synthetic biology process.
DNA-based identification of Brassica vegetable species for the juice industry.
Etoh, Kazumi; Niijima, Noritaka; Yokoshita, Masahiko; Fukuoka, Shin-Ichi
2003-10-01
Since kale (Brassica oleracea var. acephala), a cruciferous vegetable with a high level of vitamins and functional compounds beneficial to health and wellness, has become widely used in the juice industry, a precise method for quality control of vegetable species is necessary. We describe here a DNA-based identification method to distinguish kale from cabbage (Brassica oleracea var. capitata), a closely related species, which can be inadvertently mixed with kale during the manufacturing process. Using genomic DNA from these vegetables and combinatory sets of nucleotide primers, we screened for random amplified polymorphic DNA (RAPD) fragments and found three cabbage-specific fragments. These RAPD fragments, with lengths of 1.4, 0.5, and 1.5 kb, were purified, subcloned, and sequenced. Based on sequence-tagged sites (STS), we designed sets of primers to detect cabbage-specific identification (CAI) DNA markers. Utilizing the CAI markers, we successfully distinguished more than 10 different local cabbage accessions from 20 kale accessions, and identified kale juices experimentally spiked with different amounts of cabbage.
Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development
Kazemian, Majid; Pham, Hannah; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh
2013-01-01
Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein–protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action. PMID:23847101
Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development.
Kazemian, Majid; Pham, Hannah; Wolfe, Scot A; Brodsky, Michael H; Sinha, Saurabh
2013-09-01
Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein-protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action.
Combinatorial events of insertion sequences and ICE in Gram-negative bacteria.
Toleman, Mark A; Walsh, Timothy R
2011-09-01
The emergence of antibiotic and antimicrobial resistance in Gram-negative bacteria is incremental and linked to genetic elements that function in a so-called 'one-ended transposition' manner, including ISEcp1, ISCR elements and Tn3-like transposons. The power of these elements lies in their inability to consistently recognize one of their own terminal sequences, while recognizing more genetically distant surrogate sequences. This has the effect of mobilizing the DNA sequence found adjacent to their initial location. In general, resistance in Gram-negatives is closely linked to a few one-off events. These include the capture of the class 1 integron by a Tn5090-like transposon; the formation of the 3' conserved segment (3'-CS); and the fusion of the ISCR1 element to the 3'-CS. The structures formed by these rare events have been massively amplified and disseminated in Gram-negative bacteria, but hitherto, are rarely found in Gram-positives. Such events dominate current resistance gene acquisition and are instrumental in the construction of large resistance gene islands on chromosomes and plasmids. Similar combinatorial events appear to have occurred between conjugative plasmids and phages constructing hybrid elements called integrative and conjugative elements or conjugative transposons. These elements are beginning to be closely linked to some of the more powerful resistance mechanisms such as the extended spectrum β-lactamases, metallo- and AmpC type β-lactamases. Antibiotic resistance in Gram-negative bacteria is dominated by unusual combinatorial mistakes of Insertion sequences and gene fusions which have been selected and amplified by antibiotic pressure enabling the formation of extended resistance islands. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.
Wang, Wenqin; Messing, Joachim
2011-01-01
Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA
Wang, Wenqin; Messing, Joachim
2011-01-01
Background Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. Methods We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804
Tron, Adriana E.; Bertoncini, Carlos W.; Palena, Claudia M.; Chan, Raquel L.; Gonzalez, Daniel H.
2001-01-01
Four groups of plant homeodomain proteins contain a dimerization motif closely linked to the homeodomain. We here show that two sunflower homeodomain proteins, Hahb-4 and HAHR1, which belong to the Hd-Zip I and GL2/Hd-Zip IV groups, respectively, show different binding preferences at a defined position of a pseudopalindromic DNA-binding site used as a target. HAHR1 shows a preference for the sequence 5′-CATT(A/T)AATG-3′, rather than 5′-CAAT(A/T)ATTG-3′, recognized by Hahb-4. To analyze the molecular basis of this behavior, we have constructed a set of mutants with exchanged residues (Phe→Ile and Ile→Phe) at position 47 of the homeodomain, together with chimeric proteins between HAHR1 and Hahb-4. The results obtained indicate that Phe47, but not Ile47, allows binding to 5′-CATT(A/T)AATG-3′. However, the preference for this sequence is determined, in addition, by amino acids located C-terminal to residue 53 of the HAHR1 homeodomain. A double mutant of Hahb-4 (Ile47→Phe/Ala54→Thr) shows the same binding behavior as HAHR1, suggesting that combinatorial interactions of amino acid residues at positions 47 and 54 of the homeodomain are involved in establishing the affinity and selectivity of plant dimeric homeodomain proteins with different DNA target sequences. PMID:11726696
Genetic Constructor: An Online DNA Design Platform.
Bates, Maxwell; Lachoff, Joe; Meech, Duncan; Zulkower, Valentin; Moisy, Anaïs; Luo, Yisha; Tekotte, Hille; Franziska Scheitz, Cornelia Johanna; Khilari, Rupal; Mazzoldi, Florencio; Chandran, Deepak; Groban, Eli
2017-12-15
Genetic Constructor is a cloud Computer Aided Design (CAD) application developed to support synthetic biologists from design intent through DNA fabrication and experiment iteration. The platform allows users to design, manage, and navigate complex DNA constructs and libraries, using a new visual language that focuses on functional parts abstracted from sequence. Features like combinatorial libraries and automated primer design allow the user to separate design from construction by focusing on functional intent, and design constraints aid iterative refinement of designs. A plugin architecture enables contributions from scientists and coders to leverage existing powerful software and connect to DNA foundries. The software is easily accessible and platform agnostic, free for academics, and available in an open-source community edition. Genetic Constructor seeks to democratize DNA design, manufacture, and access to tools and services from the synthetic biology community.
DNA Assembly Techniques for Next Generation Combinatorial Biosynthesis of Natural Products
Cobb, Ryan E.; Ning, Jonathan C.; Zhao, Huimin
2013-01-01
Natural product scaffolds remain important leads for pharmaceutical development. However, transforming a natural product into a drug entity often requires derivatization to enhance the compound’s therapeutic properties. A powerful method by which to perform this derivatization is combinatorial biosynthesis, the manipulation of the genes in the corresponding pathway to divert synthesis towards novel derivatives. While these manipulations have traditionally been carried out via restriction digestion/ligation-based cloning, the shortcomings of such techniques limit their throughput and thus the scope of corresponding combinatorial biosynthesis experiments. In the burgeoning field of synthetic biology, the demand for facile DNA assembly techniques has promoted the development of a host of novel DNA assembly strategies. Here we describe the advantages of these recently-developed tools for rapid, efficient synthesis of large DNA constructs. We also discuss their potential to facilitate the simultaneous assembly of complete libraries of natural product biosynthetic pathways, ushering in the next generation of combinatorial biosynthesis. PMID:24127070
Caetano-Anollés, G; Gresshoff, P M
1996-06-01
DNA amplification fingerprinting (DAF) with mini-hairpins harboring arbitrary "core" sequences at their 3' termini were used to fingerprint a variety of templates, including PCR products and whole genomes, to establish genetic relationships between plant tax at the interspecific and intraspecific level, and to identify closely related fungal isolates and plant accessions. No correlation was observed between the sequence of the arbitrary core, the stability of the mini-hairpin structure and DAF efficiency. Mini-hairpin primers with short arbitrary cores and primers complementary to simple sequence repeats present in microsatellites were also used to generate arbitrary signatures from amplification profiles (ASAP). The ASAP strategy is a dual-step amplification procedure that uses at least one primer in each fingerprinting stage. ASAP was able to reproducibly amplify DAF products (representing about 10-15 kb of sequence) following careful optimization of amplification parameters such as primer and template concentration. Avoidance of primer sequences partially complementary to DAF product termini was necessary in order to produce distinct fingerprints. This allowed the combinatorial use of oligomers in nucleic acid screening, with numerous ASAP fingerprinting reactions based on a limited number of primer sequences. Mini-hairpin primers and ASAP analysis significantly increased detection of polymorphic DNA, separating closely related bermudagrass (Cynodon) cultivars and detecting putatively linked markers in bulked segregant analysis of the soybean (Glycine max) supernodulation (nitrate-tolerant symbiosis) locus.
Construction of a scFv Library with Synthetic, Non-combinatorial CDR Diversity.
Bai, Xuelian; Shim, Hyunbo
2017-01-01
Many large synthetic antibody libraries have been designed, constructed, and successfully generated high-quality antibodies suitable for various demanding applications. While synthetic antibody libraries have many advantages such as optimized framework sequences and a broader sequence landscape than natural antibodies, their sequence diversities typically are generated by random combinatorial synthetic processes which cause the incorporation of many undesired CDR sequences. Here, we describe the construction of a synthetic scFv library using oligonucleotide mixtures that contain predefined, non-combinatorially synthesized CDR sequences. Each CDR is first inserted to a master scFv framework sequence and the resulting single-CDR libraries are subjected to a round of proofread panning. The proofread CDR sequences are assembled to produce the final scFv library with six diversified CDRs.
DNA-Encoded Dynamic Combinatorial Chemical Libraries.
Reddavide, Francesco V; Lin, Weilin; Lehnert, Sarah; Zhang, Yixin
2015-06-26
Dynamic combinatorial chemistry (DCC) explores the thermodynamic equilibrium of reversible reactions. Its application in the discovery of protein binders is largely limited by difficulties in the analysis of complex reaction mixtures. DNA-encoded chemical library (DECL) technology allows the selection of binders from a mixture of up to billions of different compounds; however, experimental results often show low a signal-to-noise ratio and poor correlation between enrichment factor and binding affinity. Herein we describe the design and application of DNA-encoded dynamic combinatorial chemical libraries (EDCCLs). Our experiments have shown that the EDCCL approach can be used not only to convert monovalent binders into high-affinity bivalent binders, but also to cause remarkably enhanced enrichment of potent bivalent binders by driving their in situ synthesis. We also demonstrate the application of EDCCLs in DNA-templated chemical reactions. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Algorithms for optimizing cross-overs in DNA shuffling.
He, Lu; Friedman, Alan M; Bailey-Kellogg, Chris
2012-03-21
DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally-expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing "runs" of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and β (15%), and beta-lactamases of varying identity (26-47%). Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.
Space pruning monotonic search for the non-unique probe selection problem.
Pappalardo, Elisa; Ozkok, Beyza Ahlatcioglu; Pardalos, Panos M
2014-01-01
Identification of targets, generally viruses or bacteria, in a biological sample is a relevant problem in medicine. Biologists can use hybridisation experiments to determine whether a specific DNA fragment, that represents the virus, is presented in a DNA solution. A probe is a segment of DNA or RNA, labelled with a radioactive isotope, dye or enzyme, used to find a specific target sequence on a DNA molecule by hybridisation. Selecting unique probes through hybridisation experiments is a difficult task, especially when targets have a high degree of similarity, for instance in a case of closely related viruses. After preliminary experiments, performed by a canonical Monte Carlo method with Heuristic Reduction (MCHR), a new combinatorial optimisation approach, the Space Pruning Monotonic Search (SPMS) method, is introduced. The experiments show that SPMS provides high quality solutions and outperforms the current state-of-the-art algorithms.
Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue
2013-01-01
We are witnessing rapid progress in the development of methodologies for building the combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). There are a few tools available to do these jobs but most of them are not easy to use and not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R codes of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB enables two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) are the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulated by the advance of next-generation sequencing techniques, cGRNB will be very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets. The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
BASIC: A Simple and Accurate Modular DNA Assembly Method.
Storch, Marko; Casini, Arturo; Mackrow, Ben; Ellis, Tom; Baldwin, Geoff S
2017-01-01
Biopart Assembly Standard for Idempotent Cloning (BASIC) is a simple, accurate, and robust DNA assembly method. The method is based on linker-mediated DNA assembly and provides highly accurate DNA assembly with 99 % correct assemblies for four parts and 90 % correct assemblies for seven parts [1]. The BASIC standard defines a single entry vector for all parts flanked by the same prefix and suffix sequences and its idempotent nature means that the assembled construct is returned in the same format. Once a part has been adapted into the BASIC format it can be placed at any position within a BASIC assembly without the need for reformatting. This allows laboratories to grow comprehensive and universal part libraries and to share them efficiently. The modularity within the BASIC framework is further extended by the possibility of encoding ribosomal binding sites (RBS) and peptide linker sequences directly on the linkers used for assembly. This makes BASIC a highly versatile library construction method for combinatorial part assembly including the construction of promoter, RBS, gene variant, and protein-tag libraries. In comparison with other DNA assembly standards and methods, BASIC offers a simple robust protocol; it relies on a single entry vector, provides for easy hierarchical assembly, and is highly accurate for up to seven parts per assembly round [2].
A label-free, fluorescence based assay for microarray
NASA Astrophysics Data System (ADS)
Niu, Sanjun
DNA chip technology has drawn tremendous attention since it emerged in the mid 90's as a method that expedites gene sequencing by over 100-fold. DNA chip, also called DNA microarray, is a combinatorial technology in which different single-stranded DNA (ssDNA) molecules of known sequences are immobilized at specific spots. The immobilized ssDNA strands are called probes. In application, the chip is exposed to a solution containing ssDNA of unknown sequence, called targets, which are labeled with fluorescent dyes. Due to specific molecular recognition among the base pairs in the DNA, the binding or hybridization occurs only when the probe and target sequences are complementary. The nucleotide sequence of the target is determined by imaging the fluorescence from the spots. The uncertainty of background in signal detection and statistical error in data analysis, primarily due to the error in the DNA amplification process and statistical distribution of the tags in the target DNA, have become the fundamental barriers in bringing the technology into application for clinical diagnostics. Furthermore, the dye and tagging process are expensive, making the cost of DNA chips inhibitive for clinical testing. These limitations and challenges make it difficult to implement DNA chip methods as a diagnostic tool in a pathology laboratory. The objective of this dissertation research is to provide an alternative approach that will address the above challenges. In this research, a label-free assay is designed and studied. Polystyrene (PS), a commonly used polymeric material, serves as the fluorescence agent. Probe ssDNA is covalently immobilized on polystyrene thin film that is supported by a reflecting substrate. When this chip is exposed to excitation light, fluorescence light intensity from PS is detected as the signal. Since the optical constants and conformations of ssDNA and dsDNA (double stranded DNA) are different, the measured fluorescence from PS changes for the same intensity of excitation light. The fluorescence contrast is used to quantify the amount of probe-target hybridization. A mathematical model that considers multiple reflections and scattering is developed to explain the mechanism of the fluorescence contrast which depends on the thickness of the PS film. Scattering is the dominant factor that contributes to the contrast. The potential of this assay to detect single nucleotide polymorphism is also tested.
Recombinant organisms for production of industrial products
Adrio, Jose-Luis
2010-01-01
A revolution in industrial microbiology was sparked by the discoveries of ther double-stranded structure of DNA and the development of recombinant DNA technology. Traditional industrial microbiology was merged with molecular biology to yield improved recombinant processes for the industrial production of primary and secondary metabolites, protein biopharmaceuticals and industrial enzymes. Novel genetic techniques such as metabolic engineering, combinatorial biosynthesis and molecular breeding techniques and their modifications are contributing greatly to the development of improved industrial processes. In addition, functional genomics, proteomics and metabolomics are being exploited for the discovery of novel valuable small molecules for medicine as well as enzymes for catalysis. The sequencing of industrial microbal genomes is being carried out which bodes well for future process improvement and discovery of new industrial products. PMID:21326937
Discovery of DNA repair inhibitors by combinatorial library profiling
Moeller, Benjamin J.; Sidman, Richard L.; Pasqualini, Renata; Arap, Wadih
2011-01-01
Small molecule inhibitors of DNA repair are emerging as potent and selective anti-cancer therapies, but the sheer magnitude of the protein networks involved in DNA repair processes poses obstacles to discovery of effective candidate drugs. To address this challenge, we used a subtractive combinatorial selection approach to identify a panel of peptide ligands that bind DNA repair complexes. Supporting the concept that these ligands have therapeutic potential, we show that one selected peptide specifically binds and non-competitively inactivates DNA-PKcs, a protein kinase critical in double-strand DNA break repair. In doing so, this ligand sensitizes BRCA-deficient tumor cells to genotoxic therapy. Our findings establish a platform for large-scale parallel screening for ligand-directed DNA repair inhibitors, with immediate applicability to cancer therapy. PMID:21343400
Selection and identification of a DNA aptamer targeted to Vibrio parahemolyticus.
Duan, Nuo; Wu, Shijia; Chen, Xiujuan; Huang, Yukun; Wang, Zhouping
2012-04-25
A whole-bacterium systemic evolution of ligands by exponential enrichment (SELEX) method was applied to a combinatorial library of FAM-labeled single-stranded DNA molecules to identify DNA aptamers demonstrating specific binding to Vibrio parahemolyticus . FAM-labeled aptamer sequences with high binding affinity to V. parahemolyticus were identified by flow cytometric analysis. Aptamer A3P, which showed a particularly high binding affinity in preliminary studies, was chosen for further characterization. This aptamer displayed a dissociation constant (K(d)) of 16.88 ± 1.92 nM. Binding assays to assess the specificity of aptamer A3P showed a high binding affinity (76%) for V. parahemolyticus and a low apparent binding affinity (4%) for other bacteria. Whole-bacterium SELEX is a promising technique for the design of aptamer-based molecular probes for microbial pathogens that does not require the labor-intensive steps of isolating and purifying complex markers or targets.
Disentangling the many layers of eukaryotic transcriptional regulation.
Lelli, Katherine M; Slattery, Matthew; Mann, Richard S
2012-01-01
Regulation of gene expression in eukaryotes is an extremely complex process. In this review, we break down several critical steps, emphasizing new data and techniques that have expanded current gene regulatory models. We begin at the level of DNA sequence where cis-regulatory modules (CRMs) provide important regulatory information in the form of transcription factor (TF) binding sites. In this respect, CRMs function as instructional platforms for the assembly of gene regulatory complexes. We discuss multiple mechanisms controlling complex assembly, including cooperative DNA binding, combinatorial codes, and CRM architecture. The second section of this review places CRM assembly in the context of nucleosomes and condensed chromatin. We discuss how DNA accessibility and histone modifications contribute to TF function. Lastly, new advances in chromosomal mapping techniques have provided increased understanding of intra- and interchromosomal interactions. We discuss how these topological maps influence gene regulatory models.
Marsic, Damien; Govindasamy, Lakshmanan; Currlin, Seth; Markusic, David M; Tseng, Yu-Shan; Herzog, Roland W; Agbandje-McKenna, Mavis; Zolotukhin, Sergei
2014-01-01
Methodologies to improve existing adeno-associated virus (AAV) vectors for gene therapy include either rational approaches or directed evolution to derive capsid variants characterized by superior transduction efficiencies in targeted tissues. Here, we integrated both approaches in one unified design strategy of “virtual family shuffling” to derive a combinatorial capsid library whereby only variable regions on the surface of the capsid are modified. Individual sublibraries were first assembled in order to preselect compatible amino acid residues within restricted surface-exposed regions to minimize the generation of dead-end variants. Subsequently, the successful families were interbred to derive a combined library of ~8 × 105 complexity. Next-generation sequencing of the packaged viral DNA revealed capsid surface areas susceptible to directed evolution, thus providing guidance for future designs. We demonstrated the utility of the library by deriving an AAV2-based vector characterized by a 20-fold higher transduction efficiency in murine liver, now equivalent to that of AAV8. PMID:25048217
Modeling discrete combinatorial systems as alphabetic bipartite networks: theory and applications.
Choudhury, Monojit; Ganguly, Niloy; Maiti, Abyayananda; Mukherjee, Animesh; Brusch, Lutz; Deutsch, Andreas; Peruani, Fernando
2010-03-01
Genes and human languages are discrete combinatorial systems (DCSs), in which the basic building blocks are finite sets of elementary units: nucleotides or codons in a DNA sequence, and letters or words in a language. Different combinations of these finite units give rise to potentially infinite numbers of genes or sentences. This type of DCSs can be represented as an alphabetic bipartite network (ABN) where there are two kinds of nodes, one type represents the elementary units while the other type represents their combinations. Here, we extend and generalize recent analytical findings for ABNs derived in [Peruani, Europhys. Lett. 79, 28001 (2007)] and empirically investigate two real world systems in terms of ABNs, the codon gene and the phoneme-language network. The one-mode projections onto the elementary basic units are also studied theoretically as well as in real world ABNs. We propose the use of ABNs as a means for inferring the mechanisms underlying the growth of real world DCSs.
Recombinant organisms for production of industrial products.
Adrio, Jose-Luis; Demain, Arnold L
2010-01-01
A revolution in industrial microbiology was sparked by the discoveries of ther double-stranded structure of DNA and the development of recombinant DNA technology. Traditional industrial microbiology was merged with molecular biology to yield improved recombinant processes for the industrial production of primary and secondary metabolites, protein biopharmaceuticals and industrial enzymes. Novel genetic techniques such as metabolic engineering, combinatorial biosynthesis and molecular breeding techniques and their modifications are contributing greatly to the development of improved industrial processes. In addition, functional genomics, proteomics and metabolomics are being exploited for the discovery of novel valuable small molecules for medicine as well as enzymes for catalysis. The sequencing of industrial microbal genomes is being carried out which bodes well for future process improvement and discovery of new industrial products. © 2010 Landes Bioscience
Zhang, Yuhuan; Liu, Wei; Zhang, Wentao; Yu, Shaoxuan; Yue, Xiaoyue; Zhu, Wenxin; Zhang, Daohong; Wang, Yanru; Wang, Jianlong
2015-10-15
Herein, the structure of two DNA strands which are complementary except fourteen T-T and C-C mismatches was programmed for the design of the combinatorial logic operation by utilizing the different protective capacities of single chain DNA, part-hybridized DNA and completed-hybridized DNA on unmodified gold nanoparticles. In the presence of either Hg(2+) or Ag(+), the T-Hg(2+)-T or C-Ag(+)-C coordination chemistry could lead to the formation of part-hybridized DNA which keeps gold nanoparticles from clumping after the addition of 40 μL 0.2M NaClO4 solution, but the protection would be screened by 120 μL 0.2M NaClO4 solution. While the coexistence of Hg(2+), Ag(+) caused the formation of completed-hybridized DNA and the protection for gold nanoparticles lost in either 40 μL or 120 μL NaClO4 solutions. Benefiting from sharing of the same inputs of Hg(2+) and Ag(+), OR and AND logic gates were easily integrated into a simple colorimetric combinatorial logic operation in one system, which make it possible to execute logic gates in parallel to mimic arithmetic calculations on a binary digit. Furthermore, two other logic gates including INHIBIT1 and INHIBIT2 were realized to integrated with OR logic gate both for simultaneous qualitative discrimination and quantitative determination of Hg(2+) and Ag(+). Results indicate that the developed logic system based on the different protective capacities of DNA structure on gold nanoparticles provides a new pathway for the design of the combinatorial logic operation in one system and presents a useful strategy for development of advanced sensors, which may have potential applications in multiplex chemical analysis and molecular-scale computer design. Copyright © 2015 Elsevier B.V. All rights reserved.
Pydna: a simulation and documentation tool for DNA assembly strategies using python.
Pereira, Filipa; Azevedo, Flávio; Carvalho, Ângela; Ribeiro, Gabriela F; Budde, Mark W; Johansson, Björn
2015-05-02
Recent advances in synthetic biology have provided tools to efficiently construct complex DNA molecules which are an important part of many molecular biology and biotechnology projects. The planning of such constructs has traditionally been done manually using a DNA sequence editor which becomes error-prone as scale and complexity of the construction increase. A human-readable formal description of cloning and assembly strategies, which also allows for automatic computer simulation and verification, would therefore be a valuable tool. We have developed pydna, an extensible, free and open source Python library for simulating basic molecular biology DNA unit operations such as restriction digestion, ligation, PCR, primer design, Gibson assembly and homologous recombination. A cloning strategy expressed as a pydna script provides a description that is complete, unambiguous and stable. Execution of the script automatically yields the sequence of the final molecule(s) and that of any intermediate constructs. Pydna has been designed to be understandable for biologists with limited programming skills by providing interfaces that are semantically similar to the description of molecular biology unit operations found in literature. Pydna simplifies both the planning and sharing of cloning strategies and is especially useful for complex or combinatorial DNA molecule construction. An important difference compared to existing tools with similar goals is the use of Python instead of a specifically constructed language, providing a simulation environment that is more flexible and extensible by the user.
Litovchick, Alexander; Clark, Matthew A; Keefe, Anthony D
2014-01-01
The affinity-mediated selection of large libraries of DNA-encoded small molecules is increasingly being used to initiate drug discovery programs. We present universal methods for the encoding of such libraries using the chemical ligation of oligonucleotides. These methods may be used to record the chemical history of individual library members during combinatorial synthesis processes. We demonstrate three different chemical ligation methods as examples of information recording processes (writing) for such libraries and two different cDNA-generation methods as examples of information retrieval processes (reading) from such libraries. The example writing methods include uncatalyzed and Cu(I)-catalyzed alkyne-azide cycloadditions and a novel photochemical thymidine-psoralen cycloaddition. The first reading method “relay primer-dependent bypass” utilizes a relay primer that hybridizes across a chemical ligation junction embedded in a fixed-sequence and is extended at its 3′-terminus prior to ligation to adjacent oligonucleotides. The second reading method “repeat-dependent bypass” utilizes chemical ligation junctions that are flanked by repeated sequences. The upstream repeat is copied prior to a rearrangement event during which the 3′-terminus of the cDNA hybridizes to the downstream repeat and polymerization continues. In principle these reading methods may be used with any ligation chemistry and offer universal strategies for the encoding (writing) and interpretation (reading) of DNA-encoded chemical libraries. PMID:25483841
Zhu, Yingjie; Xu, Jiang; Sun, Chao; Zhou, Shiguo; Xu, Haibin; Nelson, David R.; Qian, Jun; Song, Jingyuan; Luo, Hongmei; Xiang, Li; Li, Ying; Xu, Zhichao; Ji, Aijia; Wang, Lizhi; Lu, Shanfa; Hayward, Alice; Sun, Wei; Li, Xiwen; Schwartz, David C.; Wang, Yitao; Chen, Shilin
2015-01-01
Fungi have evolved powerful genomic and chemical defense systems to protect themselves against genetic destabilization and other organisms. However, the precise molecular basis involved in fungal defense remain largely unknown in Basidiomycetes. Here the complete genome sequence, as well as DNA methylation patterns and small RNA transcriptomes, was analyzed to provide a holistic overview of secondary metabolism and defense processes in the model medicinal fungus, Ganoderma sinense. We reported the 48.96 Mb genome sequence of G. sinense, consisting of 12 chromosomes and encoding 15,688 genes. More than thirty gene clusters involved in the biosynthesis of secondary metabolites, as well as a large array of genes responsible for their transport and regulation were highlighted. In addition, components of genome defense mechanisms, namely repeat-induced point mutation (RIP), DNA methylation and small RNA-mediated gene silencing, were revealed in G. sinense. Systematic bioinformatic investigation of the genome and methylome suggested that RIP and DNA methylation combinatorially maintain G. sinense genome stability by inactivating invasive genetic material and transposable elements. The elucidation of the G. sinense genome and epigenome provides an unparalleled opportunity to advance our understanding of secondary metabolism and fungal defense mechanisms. PMID:26046933
Nishimoto, Ryu; Tani, Jun
2004-09-01
This study shows how sensory-action sequences of imitating finite state machines (FSMs) can be learned by utilizing the deterministic dynamics of recurrent neural networks (RNNs). Our experiments indicated that each possible combinatorial sequence can be recalled by specifying its respective initial state value and also that fractal structures appear in this initial state mapping after the learning converges. We also observed that the sequences of mimicking FSMs are encoded utilizing the transient regions rather than the invariant sets of the evolved dynamical systems of the RNNs.
Wang, Yen-Ling
2014-01-01
Checkpoint kinase 2 (Chk2) has a great effect on DNA-damage and plays an important role in response to DNA double-strand breaks and related lesions. In this study, we will concentrate on Chk2 and the purpose is to find the potential inhibitors by the pharmacophore hypotheses (PhModels), combinatorial fusion, and virtual screening techniques. Applying combinatorial fusion into PhModels and virtual screening techniques is a novel design strategy for drug design. We used combinatorial fusion to analyze the prediction results and then obtained the best correlation coefficient of the testing set (r test) with the value 0.816 by combining the BesttrainBesttest and FasttrainFasttest prediction results. The potential inhibitors were selected from NCI database by screening according to BesttrainBesttest + FasttrainFasttest prediction results and molecular docking with CDOCKER docking program. Finally, the selected compounds have high interaction energy between a ligand and a receptor. Through these approaches, 23 potential inhibitors for Chk2 are retrieved for further study. PMID:24864236
Chetta, M.; Drmanac, A.; Santacroce, R.; Grandone, E.; Surrey, S.; Fortina, P.; Margaglione, M.
2008-01-01
BACKGROUND: Standard methods of mutation detection are time consuming in Hemophilia A (HA) rendering their application unavailable in some analysis such as prenatal diagnosis. OBJECTIVES: To evaluate the feasibility of combinatorial sequencing-by-hybridization (cSBH) as an alternative and reliable tool for mutation detection in FVIII gene. PATIENTS/METHODS: We have applied a new method of cSBH that uses two different colors for detection of multiple point mutations in the FVIII gene. The 26 exons encompassing the HA gene were analyzed in 7 newly diagnosed Italian patients and in 19 previously characterized individuals with FVIII deficiency. RESULTS: Data show that, when solution-phase TAMRA and QUASAR labeled 5-mer oligonucleotide sets mixed with unlabeled target PCR templates are co-hybridized in the presence of DNA ligase to universal 6-mer oligonucleotide probe-based arrays, a number of mutations can be successfully detected. The technique was reliable also in identifying a mutant FVIII allele in an obligate heterozygote. A novel missense mutation (Leu1843Thr) in exon 16 and three novel neutral polymorphisms are presented with an updated protocol for 2-color cSBH. CONCLUSIONS: cSBH is a reliable tool for mutation detection in FVIII gene and may represent a complementary method for the genetic screening of HA patients. PMID:20300295
Designed Electroresponsive Biomaterials: Sequence-Controlled Behavior
2010-06-29
protein of the M13 . Traditional phage and yeast display methodologies indicate that peptide sequences with high affinities for electrode materials...drug delivery. The original vision for this work was to employ combinatorial tools such as phage and yeast display under electrical selection pressure...and drug delivery. The original vision for this work was to employ combinatorial tools such as phage and yeast display under electrical selection
Smaczniak, Cezary; Muiño, Jose M; Chen, Dijun; Angenent, Gerco C; Kaufmann, Kerstin
2017-08-01
Floral organ identities in plants are specified by the combinatorial action of homeotic master regulatory transcription factors. However, how these factors achieve their regulatory specificities is still largely unclear. Genome-wide in vivo DNA binding data show that homeotic MADS domain proteins recognize partly distinct genomic regions, suggesting that DNA binding specificity contributes to functional differences of homeotic protein complexes. We used in vitro systematic evolution of ligands by exponential enrichment followed by high-throughput DNA sequencing (SELEX-seq) on several floral MADS domain protein homo- and heterodimers to measure their DNA binding specificities. We show that specification of reproductive organs is associated with distinct binding preferences of a complex formed by SEPALLATA3 and AGAMOUS. Binding specificity is further modulated by different binding site spacing preferences. Combination of SELEX-seq and genome-wide DNA binding data allows differentiation between targets in specification of reproductive versus perianth organs in the flower. We validate the importance of DNA binding specificity for organ-specific gene regulation by modulating promoter activity through targeted mutagenesis. Our study shows that intrafamily protein interactions affect DNA binding specificity of floral MADS domain proteins. Differential DNA binding of MADS domain protein complexes plays a role in the specificity of target gene regulation. © 2017 American Society of Plant Biologists. All rights reserved.
Zhou, Qian-Mei; Chen, Qi-Long; Du, Jia; Wang, Xiu-Feng; Lu, Yi-Yu; Zhang, Hui; Su, Shi-Bing
2014-01-01
In order to explore the synergistic mechanisms of combinatorial treatment using curcumin and mitomycin C (MMC) for breast cancer, MCF-7 breast cancer xenografts were conducted to observe the synergistic effect of combinatorial treatment using curcumin and MMC at various dosages. The synergistic mechanisms of combinatorial treatment using curcumin and MMC on the inhibition of tumor growth were explored by differential gene expression profile, gene ontology (GO), ingenuity pathway analysis (IPA) and Signal–Net network analysis. The expression levels of selected genes identified by cDNA microarray expression profiling were validated by quantitative RT-PCR (qRT-PCR) and Western blot analysis. Effect of combinatorial treatment on the inhibition of cell growth was observed by MTT assay. Apoptosis was detected by flow cytometric analysis and Hoechst 33258 staining. The combinatorial treatment of 100 mg/kg curcumin and 1.5 mg/kg MMC revealed synergistic inhibition on tumor growth. Among 1501 differentially expressed genes, the expression of 25 genes exhibited an obvious change and a significant difference in 27 signal pathways was observed (p < 0.05). In addition, Mapk1 (ERK) and Mapk14 (MAPK p38) had more cross-interactions with other genes and revealed an increase in expression by 8.14- and 11.84-fold, respectively during the combinatorial treatment by curcumin and MMC when compared with the control. Moreover, curcumin can synergistically improve tumoricidal effect of MMC in another human breast cancer MDA-MB-231 cells. Apoptosis was significantly induced by the combinatorial treatment (p < 0.05) and significantly inhibited by ERK inhibitor (PD98059) in MCF-7 cells (p < 0.05). The synergistic effect of combinatorial treatment by curcumin and MMC on the induction of apoptosis in breast cancer cells may be via the ERK pathway. PMID:25226537
Analytical validation of a psychiatric pharmacogenomic test.
Jablonski, Michael R; King, Nina; Wang, Yongbao; Winner, Joel G; Watterson, Lucas R; Gunselman, Sandra; Dechairo, Bryan M
2018-05-01
The aim of this study was to validate the analytical performance of a combinatorial pharmacogenomics test designed to aid in the appropriate medication selection for neuropsychiatric conditions. Genomic DNA was isolated from buccal swabs. Twelve genes (65 variants/alleles) associated with psychotropic medication metabolism, side effects, and mechanisms of actions were evaluated by bead array, MALDI-TOF mass spectrometry, and/or capillary electrophoresis methods (GeneSight Psychotropic, Assurex Health, Inc.). The combinatorial pharmacogenomics test has a dynamic range of 2.5-20 ng/μl of input genomic DNA, with comparable performance for all assays included in the test. Both the precision and accuracy of the test were >99.9%, with individual gene components between 99.4 and 100%. This study demonstrates that the combinatorial pharmacogenomics test is robust and reproducible, making it suitable for clinical use.
Poisson Statistics of Combinatorial Library Sampling Predict False Discovery Rates of Screening
2017-01-01
Microfluidic droplet-based screening of DNA-encoded one-bead-one-compound combinatorial libraries is a miniaturized, potentially widely distributable approach to small molecule discovery. In these screens, a microfluidic circuit distributes library beads into droplets of activity assay reagent, photochemically cleaves the compound from the bead, then incubates and sorts the droplets based on assay result for subsequent DNA sequencing-based hit compound structure elucidation. Pilot experimental studies revealed that Poisson statistics describe nearly all aspects of such screens, prompting the development of simulations to understand system behavior. Monte Carlo screening simulation data showed that increasing mean library sampling (ε), mean droplet occupancy, or library hit rate all increase the false discovery rate (FDR). Compounds identified as hits on k > 1 beads (the replicate k class) were much more likely to be authentic hits than singletons (k = 1), in agreement with previous findings. Here, we explain this observation by deriving an equation for authenticity, which reduces to the product of a library sampling bias term (exponential in k) and a sampling saturation term (exponential in ε) setting a threshold that the k-dependent bias must overcome. The equation thus quantitatively describes why each hit structure’s FDR is based on its k class, and further predicts the feasibility of intentionally populating droplets with multiple library beads, assaying the micromixtures for function, and identifying the active members by statistical deconvolution. PMID:28682059
Synthesis and cell-free cloning of DNA libraries using programmable microfluidics
Yehezkel, Tuval Ben; Rival, Arnaud; Raz, Ofir; Cohen, Rafael; Marx, Zipora; Camara, Miguel; Dubern, Jean-Frédéric; Koch, Birgit; Heeb, Stephan; Krasnogor, Natalio; Delattre, Cyril; Shapiro, Ehud
2016-01-01
Microfluidics may revolutionize our ability to write synthetic DNA by addressing several fundamental limitations associated with generating novel genetic constructs. Here we report the first de novo synthesis and cell-free cloning of custom DNA libraries in sub-microliter reaction droplets using programmable digital microfluidics. Specifically, we developed Programmable Order Polymerization (POP), Microfluidic Combinatorial Assembly of DNA (M-CAD) and Microfluidic In-vitro Cloning (MIC) and applied them to de novo synthesis, combinatorial assembly and cell-free cloning of genes, respectively. Proof-of-concept for these methods was demonstrated by programming an autonomous microfluidic system to construct and clone libraries of yeast ribosome binding sites and bacterial Azurine, which were then retrieved in individual droplets and validated. The ability to rapidly and robustly generate designer DNA molecules in an autonomous manner should have wide application in biological research and development. PMID:26481354
Lin, Chun-Yuan; Wang, Yen-Ling
2014-01-01
Checkpoint kinase 2 (Chk2) has a great effect on DNA-damage and plays an important role in response to DNA double-strand breaks and related lesions. In this study, we will concentrate on Chk2 and the purpose is to find the potential inhibitors by the pharmacophore hypotheses (PhModels), combinatorial fusion, and virtual screening techniques. Applying combinatorial fusion into PhModels and virtual screening techniques is a novel design strategy for drug design. We used combinatorial fusion to analyze the prediction results and then obtained the best correlation coefficient of the testing set (r test) with the value 0.816 by combining the Best(train)Best(test) and Fast(train)Fast(test) prediction results. The potential inhibitors were selected from NCI database by screening according to Best(train)Best(test) + Fast(train)Fast(test) prediction results and molecular docking with CDOCKER docking program. Finally, the selected compounds have high interaction energy between a ligand and a receptor. Through these approaches, 23 potential inhibitors for Chk2 are retrieved for further study.
SpDamID: Marking DNA Bound by Protein Complexes Identifies Notch-Dimer Responsive Enhancers
Hass, Matthew R.; Liow, Hien-haw; Chen, Xiaoting; Sharma, Ankur; Inoue, Yukiko U.; Inoue, Takayoshi; Reeb, Ashley; Martens, Andrew; Fulbright, Mary; Raju, Saravanan; Stevens, Michael; Boyle, Scott; Park, Joo-Seop; Weirauch, Matthew T.; Brent, Michael; Kopan, Raphael
2015-01-01
SUMMARY We developed Split DamID (SpDamID), a protein complementation version of DamID, to mark genomic DNA bound in vivo by interacting or juxtapositioned transcription factors. Inactive halves of DAM (DNA Adenine Methyltransferase) were fused to protein pairs to be queried Interaction or proximity enabled DAM reconstitution and methylation of adenine in GATC. Inducible SpDamID was used to analyze Notch-mediated transcriptional activation. We demonstrate that Notch complexes label RBP sites broadly across the genome, and show that a subset of these complexes that recruit MAML and p300 undergo changes in chromatin accessibility in response to Notch signaling. SpDamID differentiates between monomeric and dimeric binding thereby allowing for identification of half-site motifs used by Notch dimers. Motif enrichment of Notch enhancers coupled with SpDamID reveals co-targeting of regulatory sequences by Notch and Runx1. SpDamID represents a sensitive and powerful tool that enables dynamic analysis of combinatorial protein-DNA transactions at a genome-wide level. PMID:26257285
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillson, Nathan
j5 automates and optimizes the design of the molecular biological process of cloning/constructing DNA. j5 enables users to benefit from (combinatorial) multi-part scar-less SLIC, Gibson, CPEC, Golden Gate assembly, or variants thereof, for which automation software does not currently exist, without the intense labor currently associated with the process. j5 inputs a list of the DNA sequences to be assembled, along with a Genbank, FASTA, jbei-seq, or SBOL v1.1 format sequence file for each DNA source. Given the list of DNA sequences to be assembled, j5 first determines the cost-minimizing assembly strategy for each part (direct synthesis, PCR/SOE, or oligo-embedding),more » designs DNA oligos with Primer3, adds flanking homology sequences (SLIC, Gibson, and CPEC; optimized with Primer3 for CPEC) or optimized overhang sequences (Golden Gate) to the oligos and direct synthesis pieces, and utilizes BLAST to check against oligo mis-priming and assembly piece incompatibility events. After identifying DNA oligos that are already contained within a local collection for reuse, the program estimates the total cost of direct synthesis and new oligos to be ordered. In the instance that j5 identifies putative assembly piece incompatibilities (multiple pieces with high flanking sequence homology), the program suggests hierarchical subassemblies where possible. The program outputs a comma-separated value (CSV) file, viewable via Excel or other spreadsheet software, that contains assembly design information (such as the PCR/SOE reactions to perform, their anticipated sizes and sequences, etc.) as well as a properly annotated genbank file containing the sequence resulting from the assembly, and appends the local oligo library with the oligos to be ordered j5 condenses multiple independent assembly projects into 96-well format for high-throughput liquid-handling robotics platforms, and generates configuration files for the PR-PR biology-friendly robot programming language. j5 thus provides a new way to design DNA assembly procedures much more productively and efficiently, not only in terms of time, but also in terms of cost. To a large extent, however, j5 does not allow people to do something that could not be done before by hand given enough time and effort. An exception to this is that, since the very act of using j5 to design the DNA assembly process standardizes the experimental details and workflow, j5 enables a single person to concurrently perform the independent DNA construction tasks of an entire group of researchers. Currently, this is not readily possible, since separate researchers employ disparate design strategies and workflows, and furthermore, their designs and workflows are very infrequently fully captured in an electronic format which is conducive to automation.« less
Application of combinatorial biocatalysis for a unique ring expansion of dihydroxymethylzearalenone
USDA-ARS?s Scientific Manuscript database
Combinatorial biocatalysis was applied to generate a diverse set of dihydroxymethylzearalenone derivatives with modified ring structure. In one chemoenzymatic reaction sequence, dihydroxymethylzearalenone was first subjected to a unique enzyme-catalyzed oxidative ring opening reaction that creates ...
Shapiro, James A
2016-06-08
The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess "Read-Write Genomes" they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification.
Shapiro, James A.
2016-01-01
The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess “Read–Write Genomes” they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification. PMID:27338490
Jiménez-Moreno, Ester; Montalvillo-Jiménez, Laura; Santana, Andrés G; Gómez, Ana M; Jiménez-Osés, Gonzalo; Corzana, Francisco; Bastida, Agatha; Jiménez-Barbero, Jesús; Cañada, Francisco Javier; Gómez-Pinto, Irene; González, Carlos; Asensio, Juan Luis
2016-05-25
Development of strong and selective binders from promiscuous lead compounds represents one of the most expensive and time-consuming tasks in drug discovery. We herein present a novel fragment-based combinatorial strategy for the optimization of multivalent polyamine scaffolds as DNA/RNA ligands. Our protocol provides a quick access to a large variety of regioisomer libraries that can be tested for selective recognition by combining microdialysis assays with simple isotope labeling and NMR experiments. To illustrate our approach, 20 small libraries comprising 100 novel kanamycin-B derivatives have been prepared and evaluated for selective binding to the ribosomal decoding A-Site sequence. Contrary to the common view of NMR as a low-throughput technique, we demonstrate that our NMR methodology represents a valuable alternative for the detection and quantification of complex mixtures, even integrated by highly similar or structurally related derivatives, a common situation in the context of a lead optimization process. Furthermore, this study provides valuable clues about the structural requirements for selective A-site recognition.
Subbiah, Ishwaria M.; Tsimberidou, Apostolia; Subbiah, Vivek; Janku, Filip; Roy-Chowdhuri, Sinchita; Hong, David S.
2017-01-01
Background Advanced carcinoma of unknown primary (CUP) has limited effective therapeutic options given the phenotypic and genotypic diversity. To identify future novel therapeutic strategies we conducted an exploratory analysis of next-generation sequencing (NGS) of relapsed, refractory CUP. Methods We identified patients in our phase I clinic where archival tissue was available for a targeted NGS CLIA-certified assay. Results Of 17 patients tested, 15 (88%) demonstrated genomic alterations (median 2 aberrations; range 0–8, total 59 alterations). Nine (53%) patients had altered cell signaling including the PI3K/AKT/MTOR (n=5, 29%) and MAPK pathways (n=3,18%); 7 (41%) patients demonstrated ≥1 alterations in tumor suppressor genes (TP53 in 5 patients), 8 (47%) had impaired epigenetic regulation and DNA methylation, 8 (47%) had aberrant cell cycle regulation, commonly in the cyclin dependent kinases. Ten (59%) patients had alterations in transcriptional regulators. Concurrent mutations affecting cell cycle regulation were noted to occur with aberrant epigenetic regulation (n=6, 35%) and MAPK/PI3K pathway (n=5, 29%). Conclusion Every patient had a unique molecular profile with no two patients demonstrating an identical panel of mutations. We identify two emerging novel combinatorial strategies targeting impaired cell cycle arrest, first with epigenetic modifiers and, second, with MAPK/PI3K pathway inhibition. PMID:28781987
Synthetic biology advances and applications in the biotechnology industry: a perspective.
Katz, Leonard; Chen, Yvonne Y; Gonzalez, Ramon; Peterson, Todd C; Zhao, Huimin; Baltz, Richard H
2018-06-18
Synthetic biology is a logical extension of what has been called recombinant DNA (rDNA) technology or genetic engineering since the 1970s. As rDNA technology has been the driver for the development of a thriving biotechnology industry today, starting with the commercialization of biosynthetic human insulin in the early 1980s, synthetic biology has the potential to take the industry to new heights in the coming years. Synthetic biology advances have been driven by dramatic cost reductions in DNA sequencing and DNA synthesis; by the development of sophisticated tools for genome editing, such as CRISPR/Cas9; and by advances in informatics, computational tools, and infrastructure to facilitate and scale analysis and design. Synthetic biology approaches have already been applied to the metabolic engineering of microorganisms for the production of industrially important chemicals and for the engineering of human cells to treat medical disorders. It also shows great promise to accelerate the discovery and development of novel secondary metabolites from microorganisms through traditional, engineered, and combinatorial biosynthesis. We anticipate that synthetic biology will continue to have broadening impacts on the biotechnology industry to address ongoing issues of human health, world food supply, renewable energy, and industrial chemicals and enzymes.
The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus
Matsuda, Fumihiko; Ishii, Kazuo; Bourvagnet, Patrice; Kuma, Kei-ichi; Hayashida, Hidenori; Miyata, Takashi; Honjo, Tasuku
1998-01-01
The complete nucleotide sequence of the 957-kb DNA of the human immunoglobulin heavy chain variable (VH) region locus was determined and 43 novel VH segments were identified. The region contains 123 VH segments classifiable into seven different families, of which 79 are pseudogenes. Of the 44 VH segments with an open reading frame, 39 are expressed as heavy chain proteins and 1 as mRNA, while the remaining 4 are not found in immunoglobulin cDNAs. Combinatorial diversity of VH region was calculated to be ∼6,000. Conservation of the promoter and recombination signal sequences was observed to be higher in functional VH segments than in pseudogenes. Phylogenetic analysis of 114 VH segments clearly showed clustering of the VH segments of each family. However, an independent branch in the tree contained a single VH, V4-44.1P, sharing similar levels of homology to human VH families and to those of other vertebrates. Comparison between different copies of homologous units that appear repeatedly across the locus clearly demonstrates that dynamic DNA reorganization of the locus took place at least eight times between 133 and 10 million years ago. One nonimmunoglobulin gene of unknown function was identified in the intergenic region. PMID:9841928
Design of a DNA chip for detection of unknown genetically modified organisms (GMOs).
Nesvold, Håvard; Kristoffersen, Anja Bråthen; Holst-Jensen, Arne; Berdal, Knut G
2005-05-01
Unknown genetically modified organisms (GMOs) have not undergone a risk evaluation, and hence might pose a danger to health and environment. There are, today, no methods for detecting unknown GMOs. In this paper we propose a novel method intended as a first step in an approach for detecting unknown genetically modified (GM) material in a single plant. A model is designed where biological and combinatorial reduction rules are applied to a set of DNA chip probes containing all possible sequences of uniform length n, creating probes capable of detecting unknown GMOs. The model is theoretically tested for Arabidopsis thaliana Columbia, and the probabilities for detecting inserts and receiving false positives are assessed for various parameters for this organism. From a theoretical standpoint, the model looks very promising but should be tested further in the laboratory. The model and algorithms will be available upon request to the corresponding author.
Dynamic peptide libraries for the discovery of supramolecular nanomaterials
NASA Astrophysics Data System (ADS)
Pappas, Charalampos G.; Shafi, Ramim; Sasselli, Ivan R.; Siccardi, Henry; Wang, Tong; Narang, Vishal; Abzalimov, Rinat; Wijerathne, Nadeesha; Ulijn, Rein V.
2016-11-01
Sequence-specific polymers, such as oligonucleotides and peptides, can be used as building blocks for functional supramolecular nanomaterials. The design and selection of suitable self-assembling sequences is, however, challenging because of the vast combinatorial space available. Here we report a methodology that allows the peptide sequence space to be searched for self-assembling structures. In this approach, unprotected homo- and heterodipeptides (including aromatic, aliphatic, polar and charged amino acids) are subjected to continuous enzymatic condensation, hydrolysis and sequence exchange to create a dynamic combinatorial peptide library. The free-energy change associated with the assembly process itself gives rise to selective amplification of self-assembling candidates. By changing the environmental conditions during the selection process, different sequences and consequent nanoscale morphologies are selected.
Dynamic peptide libraries for the discovery of supramolecular nanomaterials.
Pappas, Charalampos G; Shafi, Ramim; Sasselli, Ivan R; Siccardi, Henry; Wang, Tong; Narang, Vishal; Abzalimov, Rinat; Wijerathne, Nadeesha; Ulijn, Rein V
2016-11-01
Sequence-specific polymers, such as oligonucleotides and peptides, can be used as building blocks for functional supramolecular nanomaterials. The design and selection of suitable self-assembling sequences is, however, challenging because of the vast combinatorial space available. Here we report a methodology that allows the peptide sequence space to be searched for self-assembling structures. In this approach, unprotected homo- and heterodipeptides (including aromatic, aliphatic, polar and charged amino acids) are subjected to continuous enzymatic condensation, hydrolysis and sequence exchange to create a dynamic combinatorial peptide library. The free-energy change associated with the assembly process itself gives rise to selective amplification of self-assembling candidates. By changing the environmental conditions during the selection process, different sequences and consequent nanoscale morphologies are selected.
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.
Bauer, Markus; Klau, Gunnar W; Reinert, Knut
2007-07-27
The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.
Mokhir, A A; Connors, W H; Richert, C
2001-09-01
A total of 16 oligodeoxyribonucleotides of general sequence 5'-TCTTCTZTCTTTCT-3', where Z denotes an N-acyl-N-(2-hydroxyethyl)glycine residue, were prepared via solid phase synthesis. The ability of these oligonucleotides to form triplexes with the duplex 5'-AGAAGATAGAAAGA-HEG-TCTTTCTATCTTCT-3', where HEG is a hexaethylene glycol linker, was tested. In these triplexes, an 'interrupting' T:A base pair faces the Z residue in the third strand. Among the acyl moieties of Z tested, an anthraquinone carboxylic acid residue linked via a glycinyl group gave the most stable triplex, whose UV melting point was 8.4 degrees C higher than that of the triplex with 5'-TCTTCTGTCTTTCT-3' as the third strand. The results from exploratory nuclease selection experiments suggest that a combinatorial search for strands capable of recognizing mixed sequences by triple helix formation is feasible.
Castanotto, Daniela; Sakurai, Kumi; Lingeman, Robert; Li, Haitang; Shively, Louise; Aagaard, Lars; Soifer, Harris; Gatignol, Anne; Riggs, Arthur; Rossi, John J.
2007-01-01
Despite the great potential of RNAi, ectopic expression of shRNA or siRNAs holds the inherent risk of competition for critical RNAi components, thus altering the regulatory functions of some cellular microRNAs. In addition, specific siRNA sequences can potentially hinder incorporation of other siRNAs when used in a combinatorial approach. We show that both synthetic siRNAs and expressed shRNAs compete against each other and with the endogenous microRNAs for transport and for incorporation into the RNA induced silencing complex (RISC). The same siRNA sequences do not display competition when expressed from a microRNA backbone. We also show that TAR RNA binding protein (TRBP) is one of the sensors for selection and incorporation of the guide sequence of interfering RNAs. These findings reveal that combinatorial siRNA approaches can be problematic and have important implications for the methodology of expression and use of therapeutic interfering RNAs. PMID:17660190
Combining multiple ChIP-seq peak detection systems using combinatorial fusion.
Schweikert, Christina; Brown, Stuart; Tang, Zuojian; Smith, Phillip R; Hsu, D Frank
2012-01-01
Due to the recent rapid development in ChIP-seq technologies, which uses high-throughput next-generation DNA sequencing to identify the targets of Chromatin Immunoprecipitation, there is an increasing amount of sequencing data being generated that provides us with greater opportunity to analyze genome-wide protein-DNA interactions. In particular, we are interested in evaluating and enhancing computational and statistical techniques for locating protein binding sites. Many peak detection systems have been developed; in this study, we utilize the following six: CisGenome, MACS, PeakSeq, QuEST, SISSRs, and TRLocator. We define two methods to merge and rescore the regions of two peak detection systems and analyze the performance based on average precision and coverage of transcription start sites. The results indicate that ChIP-seq peak detection can be improved by fusion using score or rank combination. Our method of combination and fusion analysis would provide a means for generic assessment of available technologies and systems and assist researchers in choosing an appropriate system (or fusion method) for analyzing ChIP-seq data. This analysis offers an alternate approach for increasing true positive rates, while decreasing false positive rates and hence improving the ChIP-seq peak identification process.
Massively multiplex single-cell Hi-C
Ramani, Vijay; Deng, Xinxian; Qiu, Ruolan; Gunderson, Kevin L; Steemers, Frank J; Disteche, Christine M; Noble, William S; Duan, Zhijun; Shendure, Jay
2016-01-01
We present single-cell combinatorial indexed Hi-C (sciHi-C), which applies the concept of combinatorial cellular indexing to chromosome conformation capture. In this proof-of-concept, we generate and sequence six sciHi-C libraries comprising a total of 10,696 single cells. We use sciHi-C data to separate cells by karytoypic and cell-cycle state differences and identify cell-to-cell heterogeneity in mammalian chromosomal conformation. Our results demonstrate that combinatorial indexing is a generalizable strategy for single-cell genomics. PMID:28135255
DeviceEditor visual biological CAD canvas
2012-01-01
Background Biological Computer Aided Design (bioCAD) assists the de novo design and selection of existing genetic components to achieve a desired biological activity, as part of an integrated design-build-test cycle. To meet the emerging needs of Synthetic Biology, bioCAD tools must address the increasing prevalence of combinatorial library design, design rule specification, and scar-less multi-part DNA assembly. Results We report the development and deployment of web-based bioCAD software, DeviceEditor, which provides a graphical design environment that mimics the intuitive visual whiteboard design process practiced in biological laboratories. The key innovations of DeviceEditor include visual combinatorial library design, direct integration with scar-less multi-part DNA assembly design automation, and a graphical user interface for the creation and modification of design specification rules. We demonstrate how biological designs are rendered on the DeviceEditor canvas, and we present effective visualizations of genetic component ordering and combinatorial variations within complex designs. Conclusions DeviceEditor liberates researchers from DNA base-pair manipulation, and enables users to create successful prototypes using standardized, functional, and visual abstractions. Open and documented software interfaces support further integration of DeviceEditor with other bioCAD tools and software platforms. DeviceEditor saves researcher time and institutional resources through correct-by-construction design, the automation of tedious tasks, design reuse, and the minimization of DNA assembly costs. PMID:22373390
Comprehensive human transcription factor binding site map for combinatory binding motifs discovery.
Müller-Molina, Arnoldo J; Schöler, Hans R; Araúzo-Bravo, Marcos J
2012-01-01
To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%-20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory "DNA words." From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%-far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of "DNA words," newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters.
Comprehensive Human Transcription Factor Binding Site Map for Combinatory Binding Motifs Discovery
Müller-Molina, Arnoldo J.; Schöler, Hans R.; Araúzo-Bravo, Marcos J.
2012-01-01
To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%–20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory “DNA words.” From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%—far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of “DNA words,” newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters. PMID:23209563
Dossani, Zain Y.; Reider Apel, Amanda; Szmidt-Middleton, Heather; ...
2017-10-30
Despite the need for inducible promoters in strain development efforts, the majority of engineering in Saccharomyces cerevisiae continues to rely on a few constitutively active or inducible promoters. Building on advances that use the modular nature of both transcription factors and promoter regions, we have built a library of hybrid promoters that are regulated by a synthetic transcription factor. The hybrid promoters consist of native S. cerevisiae promoters, in which the operator regions have been replaced with sequences that are recognized by the bacterial LexA DNA binding protein. Correspondingly, the synthetic transcription factor (TF) consists of the DNA binding domainmore » of the LexA protein, fused with the human estrogen binding domain and the viral activator domain, VP16. The resulting system with a bacterial DNA binding domain avoids the transcription of native S. cerevisiae genes, and the hybrid promoters can be induced using estradiol, a compound with no detectable impact on S. cerevisiae physiology. Using combinations of one, two or three operator sequence repeats and a set of native S. cerevisiae promoters, we obtained a series of hybrid promoters that can be induced to different levels, using the same synthetic TF and a given estradiol. Finally, this set of promoters, in combination with our synthetic TF, has the potential to regulate numerous genes or pathways simultaneously, to multiple desired levels, in a single strain.« less
Dossani, Zain Y.; Reider Apel, Amanda; Szmidt‐Middleton, Heather; Hillson, Nathan J.; Deutsch, Samuel; Keasling, Jay D.
2017-01-01
Abstract Despite the need for inducible promoters in strain development efforts, the majority of engineering in Saccharomyces cerevisiae continues to rely on a few constitutively active or inducible promoters. Building on advances that use the modular nature of both transcription factors and promoter regions, we have built a library of hybrid promoters that are regulated by a synthetic transcription factor. The hybrid promoters consist of native S. cerevisiae promoters, in which the operator regions have been replaced with sequences that are recognized by the bacterial LexA DNA binding protein. Correspondingly, the synthetic transcription factor (TF) consists of the DNA binding domain of the LexA protein, fused with the human estrogen binding domain and the viral activator domain, VP16. The resulting system with a bacterial DNA binding domain avoids the transcription of native S. cerevisiae genes, and the hybrid promoters can be induced using estradiol, a compound with no detectable impact on S. cerevisiae physiology. Using combinations of one, two or three operator sequence repeats and a set of native S. cerevisiae promoters, we obtained a series of hybrid promoters that can be induced to different levels, using the same synthetic TF and a given estradiol. This set of promoters, in combination with our synthetic TF, has the potential to regulate numerous genes or pathways simultaneously, to multiple desired levels, in a single strain. PMID:29084380
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dossani, Zain Y.; Reider Apel, Amanda; Szmidt-Middleton, Heather
Despite the need for inducible promoters in strain development efforts, the majority of engineering in Saccharomyces cerevisiae continues to rely on a few constitutively active or inducible promoters. Building on advances that use the modular nature of both transcription factors and promoter regions, we have built a library of hybrid promoters that are regulated by a synthetic transcription factor. The hybrid promoters consist of native S. cerevisiae promoters, in which the operator regions have been replaced with sequences that are recognized by the bacterial LexA DNA binding protein. Correspondingly, the synthetic transcription factor (TF) consists of the DNA binding domainmore » of the LexA protein, fused with the human estrogen binding domain and the viral activator domain, VP16. The resulting system with a bacterial DNA binding domain avoids the transcription of native S. cerevisiae genes, and the hybrid promoters can be induced using estradiol, a compound with no detectable impact on S. cerevisiae physiology. Using combinations of one, two or three operator sequence repeats and a set of native S. cerevisiae promoters, we obtained a series of hybrid promoters that can be induced to different levels, using the same synthetic TF and a given estradiol. Finally, this set of promoters, in combination with our synthetic TF, has the potential to regulate numerous genes or pathways simultaneously, to multiple desired levels, in a single strain.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Müller, Patrick; Rößler, Jens; Schwarz-Finsterle, Jutta
Recently, advantages concerning targeting specificity of PCR constructed oligonucleotide FISH probes in contrast to established FISH probes, e.g. BAC clones, have been demonstrated. These techniques, however, are still using labelling protocols with DNA denaturing steps applying harsh heat treatment with or without further denaturing chemical agents. COMBO-FISH (COMBinatorial Oligonucleotide FISH) allows the design of specific oligonucleotide probe combinations in silico. Thus, being independent from primer libraries or PCR laboratory conditions, the probe sequences extracted by computer sequence data base search can also be synthesized as single stranded PNA-probes (Peptide Nucleic Acid probes). Gene targets can be specifically labelled with atmore » least about 20 PNA-probes obtaining visibly background free specimens. By using appropriately designed triplex forming oligonucleotides, the denaturing procedures can completely be omitted. These results reveal a significant step towards oligonucleotide-FISH maintaining the 3D-nanostructure and even the viability of the cell target. The method is demonstrated with the detection of Her2/neu and GRB7 genes, which are indicators in breast cancer diagnosis and therapy. - Highlights: • Denaturation free protocols preserve 3D architecture of chromosomes and nuclei. • Labelling sets are determined in silico for duplex and triplex binding. • Probes are produced chemically with freely chosen backbones and base variants. • Peptide nucleic acid backbones reduce hindering charge interactions. • Intercalating side chains stabilize binding of short oligonucleotides.« less
Search and Discovery Strategies for Biotechnology: the Paradigm Shift
Bull, Alan T.; Ward, Alan C.; Goodfellow, Michael
2000-01-01
Profound changes are occurring in the strategies that biotechnology-based industries are deploying in the search for exploitable biology and to discover new products and develop new or improved processes. The advances that have been made in the past decade in areas such as combinatorial chemistry, combinatorial biosynthesis, metabolic pathway engineering, gene shuffling, and directed evolution of proteins have caused some companies to consider withdrawing from natural product screening. In this review we examine the paradigm shift from traditional biology to bioinformatics that is revolutionizing exploitable biology. We conclude that the reinvigorated means of detecting novel organisms, novel chemical structures, and novel biocatalytic activities will ensure that natural products will continue to be a primary resource for biotechnology. The paradigm shift has been driven by a convergence of complementary technologies, exemplified by DNA sequencing and amplification, genome sequencing and annotation, proteome analysis, and phenotypic inventorying, resulting in the establishment of huge databases that can be mined in order to generate useful knowledge such as the identity and characterization of organisms and the identity of biotechnology targets. Concurrently there have been major advances in understanding the extent of microbial diversity, how uncultured organisms might be grown, and how expression of the metabolic potential of microorganisms can be maximized. The integration of information from complementary databases presents a significant challenge. Such integration should facilitate answers to complex questions involving sequence, biochemical, physiological, taxonomic, and ecological information of the sort posed in exploitable biology. The paradigm shift which we discuss is not absolute in the sense that it will replace established microbiology; rather, it reinforces our view that innovative microbiology is essential for releasing the potential of microbial diversity for biotechnology penetration throughout industry. Various of these issues are considered with reference to deep-sea microbiology and biotechnology. PMID:10974127
cDREM: inferring dynamic combinatorial gene regulation.
Wise, Aaron; Bar-Joseph, Ziv
2015-04-01
Genes are often combinatorially regulated by multiple transcription factors (TFs). Such combinatorial regulation plays an important role in development and facilitates the ability of cells to respond to different stresses. While a number of approaches have utilized sequence and ChIP-based datasets to study combinational regulation, these have often ignored the combinational logic and the dynamics associated with such regulation. Here we present cDREM, a new method for reconstructing dynamic models of combinatorial regulation. cDREM integrates time series gene expression data with (static) protein interaction data. The method is based on a hidden Markov model and utilizes the sparse group Lasso to identify small subsets of combinatorially active TFs, their time of activation, and the logical function they implement. We tested cDREM on yeast and human data sets. Using yeast we show that the predicted combinatorial sets agree with other high throughput genomic datasets and improve upon prior methods developed to infer combinatorial regulation. Applying cDREM to study human response to flu, we were able to identify several combinatorial TF sets, some of which were known to regulate immune response while others represent novel combinations of important TFs.
Sequencing thousands of single-cell genomes with combinatorial indexing.
Vitak, Sarah A; Torkenczy, Kristof A; Rosenkrantz, Jimi L; Fields, Andrew J; Christiansen, Lena; Wong, Melissa H; Carbone, Lucia; Steemers, Frank J; Adey, Andrew
2017-03-01
Single-cell genome sequencing has proven valuable for the detection of somatic variation, particularly in the context of tumor evolution. Current technologies suffer from high library construction costs, which restrict the number of cells that can be assessed and thus impose limitations on the ability to measure heterogeneity within a tissue. Here, we present single-cell combinatorial indexed sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single-cell libraries for detection of somatic copy-number variants. We constructed libraries for 16,698 single cells from a combination of cultured cell lines, primate frontal cortex tissue and two human adenocarcinomas, and obtained a detailed assessment of subclonal variation within a pancreatic tumor.
ERIC Educational Resources Information Center
Fuller, Amelia A.
2016-01-01
A five-week, research-based experiment suitable for second-semester introductory organic laboratory students is described. Each student designs, prepares, and analyzes a combinatorial array of six aromatic oligoamides. Molecules are prepared on solid phase via a six-step synthetic sequence, and purities and identities are determined by analysis of…
Gurevich-Messina, Juan M; Giudicessi, Silvana L; Martínez-Ceron, María C; Acosta, Gerardo; Erra-Balsells, Rosa; Cascone, Osvaldo; Albericio, Fernando; Camperi, Silvia A
2015-01-01
Short cyclic peptides have a great interest in therapeutic, diagnostic and affinity chromatography applications. The screening of 'one-bead-one-peptide' combinatorial libraries combined with mass spectrometry (MS) is an excellent tool to find peptides with affinity for any target protein. The fragmentation patterns of cyclic peptides are quite more complex than those of their linear counterparts, and the elucidation of the resulting tandem mass spectra is rather more difficult. Here, we propose a simple protocol for combinatorial cyclic libraries synthesis and ring opening before MS analysis. In this strategy, 4-hydroxymethylbenzoic acid, which forms a benzyl ester with the first amino acid, was used as the linker. A glycolamidic ester group was incorporated after the combinatorial positions by adding glycolic acid. The library synthesis protocol consisted in the following: (i) incorporation of Fmoc-Asp[2-phenylisopropyl (OPp)]-OH to Ala-Gly-oxymethylbenzamide-ChemMatrix, (ii) synthesis of the combinatorial library, (iii) assembly of a glycolic acid, (iv) couple of an Ala residue in the N-terminal, (v) removal of OPp, (vi) peptide cyclisation through side chain Asp and N-Ala amino terminus and (vii) removal of side chain protecting groups. In order to simultaneously open the ring and release each peptide, benzyl and glycolamidic esters were cleaved with ammonia. Peptide sequences could be deduced from the tandem mass spectra of each single bead evaluated. The strategy herein proposed is suitable for the preparation of one-bead-one-cyclic depsipeptide libraries that can be easily open for its sequencing by matrix-assisted laser desorption/ionisation MS. It employs techniques and reagents frequently used in a broad range of laboratories without special expertise in organic synthesis. Copyright © 2014 European Peptide Society and John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
CHEN, JOANNA; SIMIRENKO, LISA; TAPASWI, MANJIRI
The DIVA software interfaces a process in which researchers design their DNA with a web-based graphical user interface, submit their designs to a central queue, and a few weeks later receive their sequence-verified clonal constructs. Each researcher independently designs the DNA to be constructed with a web-based BioCAD tool, and presses a button to submit their designs to a central queue. Researchers have web-based access to their DNA design queues, and can track the progress of their submitted designs as they progress from "evaluation", to "waiting for reagents", to "in progress", to "complete". Researchers access their completed constructs through themore » central DNA repository. Along the way, all DNA construction success/failure rates are captured in a central database. Once a design has been submitted to the queue, a small number of dedicated staff evaluate the design for feasibility and provide feedback to the responsible researcher if the design is either unreasonable (e.g., encompasses a combinatorial library of a billion constructs) or small design changes could significantly facilitate the downstream implementation process. The dedicated staff then use DNA assembly design automation software to optimize the DNA construction process for the design, leveraging existing parts from the DNA repository where possible and ordering synthetic DNA where necessary. SynTrack software manages the physical locations and availability of the various requisite reagents and process inputs (e.g., DNA templates). Once all requisite process inputs are available, the design progresses from "waiting for reagents" to "in progress" in the design queue. Human-readable and machine-parseable DNA construction protocols output by the DNA assembly design automation software are then executed by the dedicated staff exploiting lab automation devices wherever possible. Since the all employed DNA construction methods are sequence-agnostic, standardized (utilize the same enzymatic master mixes and reaction conditions), completely independent DNA construction tasks can be aggregated into the same multi-well plates and pursued in parallel. The resulting sets of cloned constructs can then be screened by high-throughput next-gen sequencing platforms for sequence correctness. A combination of long read-length (e.g., PacBio) and paired-end read platforms (e.g., Illumina) would be exploited depending the particular task at hand (e.g., PacBio might be sufficient to screen a set of pooled constructs with significant gene divergence). Post sequence verification, designs for which at least one correct clone was identified will progress to a "complete" status, while designs for which no correct clones wereidentified will progress to a "failure" status. Depending on the failure mode (e.g., no transformants), and how many prior attempts/variations of assembly protocol have been already made for a given design, subsequent attempts may be made or the design can progress to a "permanent failure" state. All success and failure rate information will be captured during the process, including at which stage a given clonal construction procedure failed (e.g., no PCR product) and what the exact failure was (e.g. assembly piece 2 missing). This success/failure rate data can be leveraged to refine the DNA assembly design process.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rodi, D. J.; Soares, A. S.; Makowski, L.
Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0{+-}1.6% of the random dodecapeptides and 7.9{+-}2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usagemore » patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a {beta}-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.« less
Zou, J; Saven, J G
2000-02-11
A self-consistent theory is presented that can be used to estimate the number and composition of sequences satisfying a predetermined set of constraints. The theory is formulated so as to examine the features of sequences having a particular value of Delta=E(f)-
Dissection of combinatorial control by the Met4 transcriptional complex.
Lee, Traci A; Jorgensen, Paul; Bognar, Andrew L; Peyraud, Caroline; Thomas, Dominique; Tyers, Mike
2010-02-01
Met4 is the transcriptional activator of the sulfur metabolic network in Saccharomyces cerevisiae. Lacking DNA-binding ability, Met4 must interact with proteins called Met4 cofactors to target promoters for transcription. Two types of DNA-binding cofactors (Cbf1 and Met31/Met32) recruit Met4 to promoters and one cofactor (Met28) stabilizes the DNA-bound Met4 complexes. To dissect this combinatorial system, we systematically deleted each category of cofactor(s) and analyzed Met4-activated transcription on a genome-wide scale. We defined a core regulon for Met4, consisting of 45 target genes. Deletion of both Met31 and Met32 eliminated activation of the core regulon, whereas loss of Met28 or Cbf1 interfered with only a subset of targets that map to distinct sectors of the sulfur metabolic network. These transcriptional dependencies roughly correlated with the presence of Cbf1 promoter motifs. Quantitative analysis of in vivo promoter binding properties indicated varying levels of cooperativity and interdependency exists between members of this combinatorial system. Cbf1 was the only cofactor to remain fully bound to target promoters under all conditions, whereas other factors exhibited different degrees of regulated binding in a promoter-specific fashion. Taken together, Met4 cofactors use a variety of mechanisms to allow differential transcription of target genes in response to various cues.
Blakskjaer, Peter; Heitner, Tara; Hansen, Nils Jakob Vest
2015-06-01
DNA-encoded small-molecule library (DEL) technology allows vast drug-like small molecule libraries to be efficiently synthesized in a combinatorial fashion and screened in a single tube method for binding, with an assay readout empowered by advances in next generation sequencing technology. This approach has increasingly been applied as a viable technology for the identification of small-molecule modulators to protein targets and as precursors to drugs in the past decade. Several strategies for producing and for screening DELs have been devised by both academic and industrial institutions. This review highlights some of the most significant and recent strategies along with important results. A special focus on the production of high fidelity DEL technologies with the ability to eliminate screening noise and false positives is included: using a DNA junction called the Yoctoreactor, building blocks (BBs) are spatially confined at the center of the junction facilitating both the chemical reaction between BBs and encoding of the synthetic route. A screening method, known as binder trap enrichment, permits DELs to be screened robustly in a homogeneous manner delivering clean data sets and potent hits for even the most challenging targets. Copyright © 2015 Elsevier Ltd. All rights reserved.
Hegde, Mahesh; Mantelingu, Kempegowda; Pandey, Monica; Pavankumar, Chottanahalli S; Rangappa, Kanchugarakoppal S; Raghavan, Sathees C
2016-10-01
Cancer is a multifactorial disease, which makes it difficult to cure. Since more than one defective cellular component is often involved during oncogenesis, combination therapy is gaining prominence in the field of cancer therapeutics. The purpose of this study was to investigate the combinatorial effects of a novel PARP inhibitor, P10, and HDAC inhibitor, SAHA, in leukemic cells. Combinatorial effects of P10 and SAHA were tested using propidium iodide staining in different leukemic cells. Further, flowcytometry-based assays such as calcein-AM/ethidium homodimer staining, annexin-FITC/PI staining, and JC-1 staining were carried out to elucidate the mechanism of cell death. In addition, cell-cycle analysis, immunocytochemistry studies, and western blotting analysis were conducted to check the combinatorial effect in Nalm6 cells. Propidium iodide staining showed that P10 in combination with SAHA induced cell death in Nalm6 cells, in which PARP expression and activity is high with a combination index of <0.2. Annexin-FITC/PI staining, JC-1 staining, and other biochemical assays revealed that P10 in combination with SAHA induced apoptosis by causing a change in mitochondrial membrane potential in >65 % cells. Importantly, combinatorial treatment induced S phase arrest in 40-45 % cells due to DNA damage and plausible replicative stress. Finally, we demonstrated that treatment with P10 led to DNA strand breaks, which were further potentiated by SAHA (p < 0.01), leading to activation of apoptosis and increased cell death in PARP-positive leukemic cells. Our study reveals that coadministration of PARP inhibitor with SAHA could be used as a combination therapy against leukemic cells that possess high levels of intrinsic PARP activity.
Fluorescence In situ Hybridization: Cell-Based Genetic Diagnostic and Research Applications.
Cui, Chenghua; Shu, Wei; Li, Peining
2016-01-01
Fluorescence in situ hybridization (FISH) is a macromolecule recognition technology based on the complementary nature of DNA or DNA/RNA double strands. Selected DNA strands incorporated with fluorophore-coupled nucleotides can be used as probes to hybridize onto the complementary sequences in tested cells and tissues and then visualized through a fluorescence microscope or an imaging system. This technology was initially developed as a physical mapping tool to delineate genes within chromosomes. Its high analytical resolution to a single gene level and high sensitivity and specificity enabled an immediate application for genetic diagnosis of constitutional common aneuploidies, microdeletion/microduplication syndromes, and subtelomeric rearrangements. FISH tests using panels of gene-specific probes for somatic recurrent losses, gains, and translocations have been routinely applied for hematologic and solid tumors and are one of the fastest-growing areas in cancer diagnosis. FISH has also been used to detect infectious microbias and parasites like malaria in human blood cells. Recent advances in FISH technology involve various methods for improving probe labeling efficiency and the use of super resolution imaging systems for direct visualization of intra-nuclear chromosomal organization and profiling of RNA transcription in single cells. Cas9-mediated FISH (CASFISH) allowed in situ labeling of repetitive sequences and single-copy sequences without the disruption of nuclear genomic organization in fixed or living cells. Using oligopaint-FISH and super-resolution imaging enabled in situ visualization of chromosome haplotypes from differentially specified single-nucleotide polymorphism loci. Single molecule RNA FISH (smRNA-FISH) using combinatorial labeling or sequential barcoding by multiple round of hybridization were applied to measure mRNA expression of multiple genes within single cells. Research applications of these single molecule single cells DNA and RNA FISH techniques have visualized intra-nuclear genomic structure and sub-cellular transcriptional dynamics of many genes and revealed their functions in various biological processes.
Selection dynamic of Escherichia coli host in M13 combinatorial peptide phage display libraries.
Zanconato, Stefano; Minervini, Giovanni; Poli, Irene; De Lucrezia, Davide
2011-01-01
Phage display relies on an iterative cycle of selection and amplification of random combinatorial libraries to enrich the initial population of those peptides that satisfy a priori chosen criteria. The effectiveness of any phage display protocol depends directly on library amino acid sequence diversity and the strength of the selection procedure. In this study we monitored the dynamics of the selective pressure exerted by the host organism on a random peptide library in the absence of any additional selection pressure. The results indicate that sequence censorship exerted by Escherichia coli dramatically reduces library diversity and can significantly impair phage display effectiveness.
Lee, Sangho; Privalsky, Martin L.
2009-01-01
Nuclear receptors are ligand-regulated transcription factors that regulate key aspects of metazoan development, differentiation, and homeostasis. Nuclear receptors recognize target genes by binding to specific DNA recognition sequences, denoted hormone response elements (HREs). Many nuclear receptors can recognize HREs as either homodimers or heterodimers. Retinoid X receptors (RXRs), in particular, serve as important heterodimer partners for many other nuclear receptors, including thyroid hormone receptors (TRs), and RXR/TR heterodimers have been proposed to be the primary mediators of target gene regulation by T3 hormone. Here, we report that the retinoic acid receptors (RARs), a distinct class of nuclear receptors, are also efficient heterodimer partners for TRs. These RAR/TR heterodimers form with similar affinities as RXR/TR heterodimers on an assortment of consensus and natural HREs, and preferentially assemble with the RAR partner 5′ of the TR moiety. The corepressor and coactivator recruitment properties of these RAR/TR heterodimers and their transcriptional activities in vivo are distinct from those observed with the corresponding RXR heterodimers. Our studies indicate that RXRs are not unique in their ability to partner with TRs, and that RARs can also serve as robust heterodimer partners and combinatorial regulators of T3-modulated gene expression. PMID:15650024
Lohmann, Ingrid
2012-01-01
In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209
Signatures of DNA target selectivity by ETS transcription factors
Kim, Hye Mi
2017-01-01
ABSTRACT The ETS family of transcription factors is a functionally heterogeneous group of gene regulators that share a structurally conserved, eponymous DNA-binding domain. DNA target specificity derives from combinatorial interactions with other proteins as well as intrinsic heterogeneity among ETS domains. Emerging evidence suggests molecular hydration as a fundamental feature that defines the intrinsic heterogeneity in DNA target selection and susceptibility to epigenetic DNA modification. This perspective invokes novel hypotheses in the regulation of ETS proteins in physiologic osmotic stress, their pioneering potential in heterochromatin, and the effects of passive and pharmacologic DNA demethylation on ETS regulation. PMID:28301293
Signatures of DNA target selectivity by ETS transcription factors.
Poon, Gregory M K; Kim, Hye Mi
2017-05-27
The ETS family of transcription factors is a functionally heterogeneous group of gene regulators that share a structurally conserved, eponymous DNA-binding domain. DNA target specificity derives from combinatorial interactions with other proteins as well as intrinsic heterogeneity among ETS domains. Emerging evidence suggests molecular hydration as a fundamental feature that defines the intrinsic heterogeneity in DNA target selection and susceptibility to epigenetic DNA modification. This perspective invokes novel hypotheses in the regulation of ETS proteins in physiologic osmotic stress, their pioneering potential in heterochromatin, and the effects of passive and pharmacologic DNA demethylation on ETS regulation.
Guilfoyle, Richard A.; Guo, Zhen
2001-01-01
A restriction site indexing method for selectively amplifying any fragment generated by a Class II restriction enzyme includes adaptors specific to fragment ends containing adaptor indexing sequences complementary to fragment indexing sequences near the termini of fragments generated by Class II enzyme cleavage. A method for combinatorial indexing facilitates amplification of restriction fragments whose sequence is not known.
Guilfoyle, Richard A.; Guo, Zhen
1999-01-01
A restriction site indexing method for selectively amplifying any fragment generated by a Class II restriction enzyme includes adaptors specific to fragment ends containing adaptor indexing sequences complementary to fragment indexing sequences near the termini of fragments generated by Class II enzyme cleavage. A method for combinatorial indexing facilitates amplification of restriction fragments whose sequence is not known.
Generation of Synthetic Copolymer Libraries by Combinatorial Assembly on Nucleic Acid Templates.
Kong, Dehui; Yeung, Wayland; Hili, Ryan
2016-07-11
Recent advances in nucleic acid-templated copolymerization have expanded the scope of sequence-controlled synthetic copolymers beyond the molecular architectures witnessed in nature. This has enabled the power of molecular evolution to be applied to synthetic copolymer libraries to evolve molecular function ranging from molecular recognition to catalysis. This Review seeks to summarize different approaches available to generate sequence-defined monodispersed synthetic copolymer libraries using nucleic acid-templated polymerization. Key concepts and principles governing nucleic acid-templated polymerization, as well as the fidelity of various copolymerization technologies, will be described. The Review will focus on methods that enable the combinatorial generation of copolymer libraries and their molecular evolution for desired function.
Cave, John W; Xia, Li; Caudy, Michael
2011-01-01
In Drosophila melanogaster, achaete (ac) and m8 are model basic helix-loop-helix activator (bHLH A) and repressor genes, respectively, that have the opposite cell expression pattern in proneural clusters during Notch signaling. Previous studies have shown that activation of m8 transcription in specific cells within proneural clusters by Notch signaling is programmed by a "combinatorial" and "architectural" DNA transcription code containing binding sites for the Su(H) and proneural bHLH A proteins. Here we show the novel result that the ac promoter contains a similar combinatorial code of Su(H) and bHLH A binding sites but contains a different Su(H) site architectural code that does not mediate activation during Notch signaling, thus programming a cell expression pattern opposite that of m8 in proneural clusters.
An artificial intelligence approach fit for tRNA gene studies in the era of big sequence data.
Iwasaki, Yuki; Abe, Takashi; Wada, Kennosuke; Wada, Yoshiko; Ikemura, Toshimichi
2017-09-12
Unsupervised data mining capable of extracting a wide range of knowledge from big data without prior knowledge or particular models is a timely application in the era of big sequence data accumulation in genome research. By handling oligonucleotide compositions as high-dimensional data, we have previously modified the conventional self-organizing map (SOM) for genome informatics and established BLSOM, which can analyze more than ten million sequences simultaneously. Here, we develop BLSOM specialized for tRNA genes (tDNAs) that can cluster (self-organize) more than one million microbial tDNAs according to their cognate amino acid solely depending on tetra- and pentanucleotide compositions. This unsupervised clustering can reveal combinatorial oligonucleotide motifs that are responsible for the amino acid-dependent clustering, as well as other functionally and structurally important consensus motifs, which have been evolutionarily conserved. BLSOM is also useful for identifying tDNAs as phylogenetic markers for special phylotypes. When we constructed BLSOM with 'species-unknown' tDNAs from metagenomic sequences plus 'species-known' microbial tDNAs, a large portion of metagenomic tDNAs self-organized with species-known tDNAs, yielding information on microbial communities in environmental samples. BLSOM can also enhance accuracy in the tDNA database obtained from big sequence data. This unsupervised data mining should become important for studying numerous functionally unclear RNAs obtained from a wide range of organisms.
Molecular biomimetics: nanotechnology through biology.
Sarikaya, Mehmet; Tamerler, Candan; Jen, Alex K-Y; Schulten, Klaus; Baneyx, François
2003-09-01
Proteins, through their unique and specific interactions with other macromolecules and inorganics, control structures and functions of all biological hard and soft tissues in organisms. Molecular biomimetics is an emerging field in which hybrid technologies are developed by using the tools of molecular biology and nanotechnology. Taking lessons from biology, polypeptides can now be genetically engineered to specifically bind to selected inorganic compounds for applications in nano- and biotechnology. This review discusses combinatorial biological protocols, that is, bacterial cell surface and phage-display technologies, in the selection of short sequences that have affinity to (noble) metals, semiconducting oxides and other technological compounds. These genetically engineered proteins for inorganics (GEPIs) can be used in the assembly of functional nanostructures. Based on the three fundamental principles of molecular recognition, self-assembly and DNA manipulation, we highlight successful uses of GEPI in nanotechnology.
DNA as Sensors and Imaging Agents for Metal Ions
Xiang, Yu
2014-01-01
Increasing interests in detecting metal ions in many chemical and biomedical fields have created demands for developing sensors and imaging agents for metal ions with high sensitivity and selectivity. This review covers recent progress in DNA-based sensors and imaging agents for metal ions. Through both combinatorial selection and rational design, a number of metal ion-dependent DNAzymes and metal ion-binding DNA structures that can selectively recognize specific metal ions have been obtained. By attaching these DNA molecules with signal reporters such as fluorophores, chromophores, electrochemical tags, and Raman tags, a number of DNA-based sensors for both diamagnetic and paramagnetic metal ions have been developed for fluorescent, colorimetric, electrochemical, and surface Raman detections. These sensors are highly sensitive (with detection limit down to 11 ppt) and selective (with selectivity up to millions-fold) toward specific metal ions. In addition, through further development to simplify the operation, such as the use of “dipstick tests”, portable fluorometers, computer-readable discs, and widely available glucose meters, these sensors have been applied for on-site and real-time environmental monitoring and point-of-care medical diagnostics. The use of these sensors for in situ cellular imaging has also been reported. The generality of the combinatorial selection to obtain DNAzymes for almost any metal ion in any oxidation state, and the ease of modification of the DNA with different signal reporters make DNA an emerging and promising class of molecules for metal ion sensing and imaging in many fields of applications. PMID:24359450
One step DNA assembly for combinatorial metabolic engineering.
Coussement, Pieter; Maertens, Jo; Beauprez, Joeri; Van Bellegem, Wouter; De Mey, Marjan
2014-05-01
The rapid and efficient assembly of multi-step metabolic pathways for generating microbial strains with desirable phenotypes is a critical procedure for metabolic engineering, and remains a significant challenge in synthetic biology. Although several DNA assembly methods have been developed and applied for metabolic pathway engineering, many of them are limited by their suitability for combinatorial pathway assembly. The introduction of transcriptional (promoters), translational (ribosome binding site (RBS)) and enzyme (mutant genes) variability to modulate pathway expression levels is essential for generating balanced metabolic pathways and maximizing the productivity of a strain. We report a novel, highly reliable and rapid single strand assembly (SSA) method for pathway engineering. The method was successfully optimized and applied to create constructs containing promoter, RBS and/or mutant enzyme libraries. To demonstrate its efficiency and reliability, the method was applied to fine-tune multi-gene pathways. Two promoter libraries were simultaneously introduced in front of two target genes, enabling orthogonal expression as demonstrated by principal component analysis. This shows that SSA will increase our ability to tune multi-gene pathways at all control levels for the biotechnological production of complex metabolites, achievable through the combinatorial modulation of transcription, translation and enzyme activity. Copyright © 2014 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Properties of Cordonnier, Perrin and Van der Laan Numbers
ERIC Educational Resources Information Center
Shannon, A. G.; Anderson, P. G.; Horadam, A. F.
2006-01-01
This paper aims to explore some properties of certain third-order linear sequences which have some properties analogous to the better known second-order sequences of Fibonacci and Lucas. Historical background issues are outlined. These, together with the number and combinatorial theoretical results, provide plenty of pedagogical opportunities for…
NASA Astrophysics Data System (ADS)
Xue, Wei; Wang, Qi; Wang, Tianyu
2018-04-01
This paper presents an improved parallel combinatory spread spectrum (PC/SS) communication system with the method of double information matching (DIM). Compared with conventional PC/SS system, the new model inherits the advantage of high transmission speed, large information capacity and high security. Besides, the problem traditional system will face is the high bit error rate (BER) and since its data-sequence mapping algorithm. Hence the new model presented shows lower BER and higher efficiency by its optimization of mapping algorithm.
Mapping protein-protein interactions with phage-displayed combinatorial peptide libraries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kay, B. K.; Castagnoli, L.; Biosciences Division
This unit describes the process and analysis of affinity selecting bacteriophage M13 from libraries displaying combinatorial peptides fused to either a minor or major capsid protein. Direct affinity selection uses target protein bound to a microtiter plate followed by purification of selected phage by ELISA. Alternatively, there is a bead-based affinity selection method. These methods allow one to readily isolate peptide ligands that bind to a protein target of interest and use the consensus sequence to search proteomic databases for putative interacting proteins.
Kono, H; Saven, J G
2001-02-23
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiment. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that it is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.
Electro-Microfluidic Packaging
NASA Astrophysics Data System (ADS)
Benavides, G. L.; Galambos, P. C.
2002-06-01
There are many examples of electro-microfluidic products that require cost effective packaging solutions. Industry has responded to a demand for products such as drop ejectors, chemical sensors, and biological sensors. Drop ejectors have consumer applications such as ink jet printing and scientific applications such as patterning self-assembled monolayers or ejecting picoliters of expensive analytes/reagents for chemical analysis. Drop ejectors can be used to perform chemical analysis, combinatorial chemistry, drug manufacture, drug discovery, drug delivery, and DNA sequencing. Chemical and biological micro-sensors can sniff the ambient environment for traces of dangerous materials such as explosives, toxins, or pathogens. Other biological sensors can be used to improve world health by providing timely diagnostics and applying corrective measures to the human body. Electro-microfluidic packaging can easily represent over fifty percent of the product cost and, as with Integrated Circuits (IC), the industry should evolve to standard packaging solutions. Standard packaging schemes will minimize cost and bring products to market sooner.
High Throughput Biological Analysis Using Multi-bit Magnetic Digital Planar Tags
NASA Astrophysics Data System (ADS)
Hong, B.; Jeong, J.-R.; Llandro, J.; Hayward, T. J.; Ionescu, A.; Trypiniotis, T.; Mitrelias, T.; Kopper, K. P.; Steinmuller, S. J.; Bland, J. A. C.
2008-06-01
We report a new magnetic labelling technology for high-throughput biomolecular identification and DNA sequencing. Planar multi-bit magnetic tags have been designed and fabricated, which comprise a magnetic barcode formed by an ensemble of micron-sized thin film Ni80Fe20 bars encapsulated in SU8. We show that by using a globally applied magnetic field and magneto-optical Kerr microscopy the magnetic elements in the multi-bit magnetic tags can be addressed individually and encoded/decoded remotely. The critical steps needed to show the feasibility of this technology are demonstrated, including fabrication, flow transport, remote writing and reading, and successful functionalization of the tags as verified by fluorescence detection. This approach is ideal for encoding information on tags in microfluidic flow or suspension, for such applications as labelling of chemical precursors during drug synthesis and combinatorial library-based high-throughput multiplexed bioassays.
Heisig, Julia; Weber, David; Englberger, Eva; Winkler, Anja; Kneitz, Susanne; Sung, Wing-Kin; Wolf, Elmar; Eilers, Martin; Wei, Chia-Lin; Gessler, Manfred
2012-01-01
HEY bHLH transcription factors have been shown to regulate multiple key steps in cardiovascular development. They can be induced by activated NOTCH receptors, but other upstream stimuli mediated by TGFß and BMP receptors may elicit a similar response. While the basic and helix-loop-helix domains exhibit strong similarity, large parts of the proteins are still unique and may serve divergent functions. The striking overlap of cardiac defects in HEY2 and combined HEY1/HEYL knockout mice suggested that all three HEY genes fulfill overlapping function in target cells. We therefore sought to identify target genes for HEY proteins by microarray expression and ChIPseq analyses in HEK293 cells, cardiomyocytes, and murine hearts. HEY proteins were found to modulate expression of their target gene to a rather limited extent, but with striking functional interchangeability between HEY factors. Chromatin immunoprecipitation revealed a much greater number of potential binding sites that again largely overlap between HEY factors. Binding sites are clustered in the proximal promoter region especially of transcriptional regulators or developmental control genes. Multiple lines of evidence suggest that HEY proteins primarily act as direct transcriptional repressors, while gene activation seems to be due to secondary or indirect effects. Mutagenesis of putative DNA binding residues supports the notion of direct DNA binding. While class B E-box sequences (CACGYG) clearly represent preferred target sequences, there must be additional and more loosely defined modes of DNA binding since many of the target promoters that are efficiently bound by HEY proteins do not contain an E-box motif. These data clearly establish the three HEY bHLH factors as highly redundant transcriptional repressors in vitro and in vivo, which explains the combinatorial action observed in different tissues with overlapping expression.
Englberger, Eva; Winkler, Anja; Kneitz, Susanne; Sung, Wing-Kin; Wolf, Elmar; Eilers, Martin; Wei, Chia-Lin; Gessler, Manfred
2012-01-01
HEY bHLH transcription factors have been shown to regulate multiple key steps in cardiovascular development. They can be induced by activated NOTCH receptors, but other upstream stimuli mediated by TGFß and BMP receptors may elicit a similar response. While the basic and helix-loop-helix domains exhibit strong similarity, large parts of the proteins are still unique and may serve divergent functions. The striking overlap of cardiac defects in HEY2 and combined HEY1/HEYL knockout mice suggested that all three HEY genes fulfill overlapping function in target cells. We therefore sought to identify target genes for HEY proteins by microarray expression and ChIPseq analyses in HEK293 cells, cardiomyocytes, and murine hearts. HEY proteins were found to modulate expression of their target gene to a rather limited extent, but with striking functional interchangeability between HEY factors. Chromatin immunoprecipitation revealed a much greater number of potential binding sites that again largely overlap between HEY factors. Binding sites are clustered in the proximal promoter region especially of transcriptional regulators or developmental control genes. Multiple lines of evidence suggest that HEY proteins primarily act as direct transcriptional repressors, while gene activation seems to be due to secondary or indirect effects. Mutagenesis of putative DNA binding residues supports the notion of direct DNA binding. While class B E-box sequences (CACGYG) clearly represent preferred target sequences, there must be additional and more loosely defined modes of DNA binding since many of the target promoters that are efficiently bound by HEY proteins do not contain an E-box motif. These data clearly establish the three HEY bHLH factors as highly redundant transcriptional repressors in vitro and in vivo, which explains the combinatorial action observed in different tissues with overlapping expression. PMID:22615585
Two is better than one; toward a rational design of combinatorial therapy.
Chen, Sheng-Hong; Lahav, Galit
2016-12-01
Drug combination is an appealing strategy for combating the heterogeneity of tumors and evolution of drug resistance. However, the rationale underlying combinatorial therapy is often not well established due to lack of understandings of the specific pathways responding to the drugs, and their temporal dynamics following each treatment. Here we present several emerging trends in harnessing properties of biological systems for the optimal design of drug combinations, including the type of drugs, specific concentration, sequence of addition and the temporal schedule of treatments. We highlight recent studies showing different approaches for efficient design of drug combinations including single-cell signaling dynamics, adaption and pathway crosstalk. Finally, we discuss novel and feasible approaches that can facilitate the optimal design of combinatorial therapy. Copyright © 2016 Elsevier Ltd. All rights reserved.
Merrick, C A; Wardrope, C; Paget, J E; Colloms, S D; Rosser, S J
2016-01-01
Metabolic pathway engineering in microbial hosts for heterologous biosynthesis of commodity compounds and fine chemicals offers a cheaper, greener, and more reliable method of production than does chemical synthesis. However, engineering metabolic pathways within a microbe is a complicated process: levels of gene expression, protein stability, enzyme activity, and metabolic flux must be balanced for high productivity without compromising host cell viability. A major rate-limiting step in engineering microbes for optimum biosynthesis of a target compound is DNA assembly, as current methods can be cumbersome and costly. Serine integrase recombinational assembly (SIRA) is a rapid DNA assembly method that utilizes serine integrases, and is particularly applicable to rapid optimization of engineered metabolic pathways. Using six pairs of orthogonal attP and attB sites with different central dinucleotide sequences that follow SIRA design principles, we have demonstrated that ΦC31 integrase can be used to (1) insert a single piece of DNA into a substrate plasmid; (2) assemble three, four, and five DNA parts encoding the enzymes for functional metabolic pathways in a one-pot reaction; (3) generate combinatorial libraries of metabolic pathway constructs with varied ribosome binding site strengths or gene orders in a one-pot reaction; and (4) replace and add DNA parts within a construct through targeted postassembly modification. We explain the mechanism of SIRA and the principles behind designing a SIRA reaction. We also provide protocols for making SIRA reaction components and practical methods for applying SIRA to rapid optimization of metabolic pathways. © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Doerr, Timothy; Alves, Gelio; Yu, Yi-Kuo
2006-03-01
Typical combinatorial optimizations are NP-hard; however, for a particular class of cost functions the corresponding combinatorial optimizations can be solved in polynomial time. This suggests a way to efficiently find approximate solutions - - find a transformation that makes the cost function as similar as possible to that of the solvable class. After keeping many high-ranking solutions using the approximate cost function, one may then re-assess these solutions with the full cost function to find the best approximate solution. Under this approach, it is important to be able to assess the quality of the solutions obtained, e.g., by finding the true ranking of kth best approximate solution when all possible solutions are considered exhaustively. To tackle this statistical issue, we provide a systematic method starting with a scaling function generated from the fininte number of high- ranking solutions followed by a convergent iterative mapping. This method, useful in a variant of the directed paths in random media problem proposed here, can also provide a statistical significance assessment for one of the most important proteomic tasks - - peptide sequencing using tandem mass spectrometry data.
Kim, Pora; Jia, Peilin; Zhao, Zhongming
2018-01-01
Abstract Assessing the impact of kinase in gene fusion is essential for both identifying driver fusion genes (FGs) and developing molecular targeted therapies. Kinase domain retention is a crucial factor in kinase fusion genes (KFGs), but such a systematic investigation has not been done yet. To this end, we analyzed kinase domain retention (KDR) status in chimeric protein sequences of 914 KFGs covering 312 kinases across 13 major cancer types. Based on 171 kinase domain-retained KFGs including 101 kinases, we studied their recurrence, kinase groups, fusion partners, exon-based expression depth, short DNA motifs around the break points and networks. Our results, such as more KDR than 5′-kinase fusion genes, combinatorial effects between 3′-KDR kinases and their 5′-partners and a signal transduction-specific DNA sequence motif in the break point intronic sequences, supported positive selection on 3′-kinase fusion genes in cancer. We introduced a degree-of-frequency (DoF) score to measure the possible number of KFGs of a kinase. Interestingly, kinases with high DoF scores tended to undergo strong gene expression alteration at the break points. Furthermore, our KDR gene fusion network analysis revealed six of the seven kinases with the highest DoF scores (ALK, BRAF, MET, NTRK1, NTRK3 and RET) were all observed in thyroid carcinoma. Finally, we summarized common features of ‘effective’ (highly recurrent) kinases in gene fusions such as expression alteration at break point, redundant usage in multiple cancer types and 3′-location tendency. Collectively, our findings are useful for prioritizing driver kinases and FGs and provided insights into KFGs’ clinical implications. PMID:28013235
Yin, R H; Bai, W L; Wang, J M; Wu, C D; Dou, Q L; Yin, R L; He, J B; Luo, G B
2009-09-01
Yak meat is of good quality with fine texture, high protein and low fat content, and rich in amino acids compared with that of cattle, and it lacks anabolic steroids or other drugs. In general terms, however, the meat yield of yak is relatively low compared with that of the cattle. In order to prevent possible adulteration of yak meat with cattle meat, based on the sequence of mitochondrial 12S rRNA gene, a multiplex PCR-based approach was proposed for rapid identification of the meat from yak and cattle using three primers designed in this work. Through the combinatorial usage of three primers with a single reaction set, two fragments of 290 and 159bp were amplified from the cattle meat DNA, whereas only a fragment of 290bp was obtained from the yak meat DNA. Using the assay described, satisfactory amplification was accomplished in the analysis of raw and heat-treated binary meat mixtures of yak/cattle with a detection limit of 0.1% for cattle meat. The technique is fast and straightforward. It might be a useful tool in the quality control of yak meat and meat products.
Diversity of immunoglobulin lambda light chain gene usage over developmental stages in the horse.
Tallmadge, Rebecca L; Tseng, Chia T; Felippe, M Julia B
2014-10-01
To further studies of neonatal immune responses to pathogens and vaccination, we investigated the dynamics of B lymphocyte development and immunoglobulin (Ig) gene diversity. Previously we demonstrated that equine fetal Ig VDJ sequences exhibit combinatorial and junctional diversity levels comparable to those of adult Ig VDJ sequences. Herein, RACE clones from fetal, neonatal, foal, and adult lymphoid tissue were assessed for Ig lambda light chain combinatorial, junctional, and sequence diversity. Remarkably, more lambda variable genes (IGLV) were used during fetal life than later stages and IGLV gene usage differed significantly with time, in contrast to the Ig heavy chain. Junctional diversity measured by CDR3L length was constant over time. Comparison of Ig lambda transcripts to germline revealed significant increases in nucleotide diversity over time, even during fetal life. These results suggest that the Ig lambda light chain provides an additional dimension of diversity to the equine Ig repertoire. Copyright © 2014 Elsevier Ltd. All rights reserved.
Cheng, Yi-Qiang; Tang, Gong-Li; Shen, Ben
2002-01-01
Leinamycin (LNM), produced by Streptomyces atroolivaceus, is a thiazole-containing hybrid peptide-polyketide natural product structurally characterized with an unprecedented 1,3-dioxo-1,2-dithiolane moiety that is spiro-fused to a 18-member macrolactam ring. LNM exhibits a broad spectrum of antimicrobial and antitumor activities, most significantly against tumors that are resistant to clinically important anticancer drugs, resulting from its DNA cleavage activity in the presence of a reducing agent. Using a PCR approach to clone a thiazole-forming nonribosomal peptide synthetase (NRPS) as a probe, we localized a 172-kb DNA region from S. atroolivaceus S-140 that harbors the lnm biosynthetic gene cluster. Sequence analysis of 11-kb DNA revealed three genes, lnmG, lnmH, and lnmI, and the deduced product of lnmI is characterized by domains characteristic to both NRPS and polyketide synthase (PKS). The involvement of the cloned gene cluster in LNM biosynthesis was confirmed by disrupting the lnmI gene to generate non-LNM-producing mutants and by characterizing LnmI as a hybrid NRPS-PKS megasynthetase, the NRPS module of which specifies for l-Cys and catalyzes thiazole formation. These results have now set the stage for full investigations of LNM biosynthesis and for generation of novel LNM analogs by combinatorial biosynthesis. PMID:12446651
High-Throughput Identification of Combinatorial Ligands for DNA Delivery in Cell Culture
NASA Astrophysics Data System (ADS)
Svahn, Mathias G.; Rabe, Kersten S.; Barger, Geoffrey; EL-Andaloussi, Samir; Simonson, Oscar E.; Didier, Boturyn; Olivier, Renaudet; Dumy, Pascal; Brandén, Lars J.; Niemeyer, Christof M.; Smith, C. I. Edvard
2008-10-01
Finding the optimal combinations of ligands for tissue-specific delivery is tedious even if only a few well-established compounds are tested. The cargo affects the receptor-ligand interaction, especially when it is charged like DNA. The ligand should therefore be evaluated together with its cargo. Several viruses have been shown to interact with more than one receptor, for efficient internalization. We here present a DNA oligonucleotide-based method for inexpensive and rapid screening of biotin labeled ligands for combinatorial effects on cellular binding and uptake. The oligonucleotide complex was designed as a 44 bp double-stranded DNA oligonucleotide with one central streptavidin molecule and a second streptavidin at the terminus. The use of a highly advanced robotic platform ensured stringent processing and execution of the experiments. The oligonucleotides were fluorescently labeled and used for detection and analysis of cell-bound, internalized and intra-cellular compartmentalized constructs by an automated line-scanning confocal microscope, IN Cell Analyzer 3000. All possible combinations of 22 ligands were explored in sets of 2 and tested on 6 different human cell lines in triplicates. In total, 10 000 transfections were performed on the automation platform. Cell-specific combinations of ligands were identified and their relative position on the scaffold oligonucleotide was found to be of importance. The ligands were found to be cargo dependent, carbohydrates were more potent for DNA delivery whereas cell penetrating peptides were more potent for delivery of less charged particles.
Double Dutch: A Tool for Designing Combinatorial Libraries of Biological Systems.
Roehner, Nicholas; Young, Eric M; Voigt, Christopher A; Gordon, D Benjamin; Densmore, Douglas
2016-06-17
Recently, semirational approaches that rely on combinatorial assembly of characterized DNA components have been used to engineer biosynthetic pathways. In practice, however, it is not practical to assemble and test millions of pathway variants in order to elucidate how different DNA components affect the behavior of a pathway. To address this challenge, we apply a rigorous mathematical approach known as design of experiments (DOE) that can be used to construct empirical models of system behavior without testing all variants. To support this approach, we have developed a tool named Double Dutch, which uses a formal grammar and heuristic algorithms to automate the process of DOE library design. Compared to designing by hand, Double Dutch enables users to more efficiently and scalably design libraries of pathway variants that can be used in a DOE framework and uniquely provides a means to flexibly balance design considerations of statistical analysis, construction cost, and risk of homologous recombination, thereby demonstrating the utility of automating decision making when faced with complex design trade-offs.
Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried
2015-01-01
A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis. PMID:26355961
Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried
2015-01-01
A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis.
Single cell systems biology by super-resolution imaging and combinatorial labeling
Lubeck, Eric; Cai, Long
2012-01-01
Fluorescence microscopy is a powerful quantitative tool for exploring regulatory networks in single cells. However, the number of molecular species that can be measured simultaneously is limited by the spectral separability of fluorophores. Here we demonstrate a simple but general strategy to drastically increase the capacity for multiplex detection of molecules in single cells by using optical super-resolution microscopy (SRM) and combinatorial labeling. As a proof of principle, we labeled mRNAs with unique combinations of fluorophores using Fluorescence in situ Hybridization (FISH), and resolved the sequences and combinations of fluorophores with SRM. We measured the mRNA levels of 32 genes simultaneously in single S. cerevisiae cells. These experiments demonstrate that combinatorial labeling and super-resolution imaging of single cells provides a natural approach to bring systems biology into single cells. PMID:22660740
Particle-Based Microarrays of Oligonucleotides and Oligopeptides.
Nesterov-Mueller, Alexander; Maerkle, Frieder; Hahn, Lothar; Foertsch, Tobias; Schillo, Sebastian; Bykovskaya, Valentina; Sedlmayr, Martyna; Weber, Laura K; Ridder, Barbara; Soehindrijo, Miriam; Muenster, Bastian; Striffler, Jakob; Bischoff, F Ralf; Breitling, Frank; Loeffler, Felix F
2014-10-28
In this review, we describe different methods of microarray fabrication based on the use of micro-particles/-beads and point out future tendencies in the development of particle-based arrays. First, we consider oligonucleotide bead arrays, where each bead is a carrier of one specific sequence of oligonucleotides. This bead-based array approach, appearing in the late 1990s, enabled high-throughput oligonucleotide analysis and had a large impact on genome research. Furthermore, we consider particle-based peptide array fabrication using combinatorial chemistry. In this approach, particles can directly participate in both the synthesis and the transfer of synthesized combinatorial molecules to a substrate. Subsequently, we describe in more detail the synthesis of peptide arrays with amino acid polymer particles, which imbed the amino acids inside their polymer matrix. By heating these particles, the polymer matrix is transformed into a highly viscous gel, and thereby, imbedded monomers are allowed to participate in the coupling reaction. Finally, we focus on combinatorial laser fusing of particles for the synthesis of high-density peptide arrays. This method combines the advantages of particles and combinatorial lithographic approaches.
Particle-Based Microarrays of Oligonucleotides and Oligopeptides
Nesterov-Mueller, Alexander; Maerkle, Frieder; Hahn, Lothar; Foertsch, Tobias; Schillo, Sebastian; Bykovskaya, Valentina; Sedlmayr, Martyna; Weber, Laura K.; Ridder, Barbara; Soehindrijo, Miriam; Muenster, Bastian; Striffler, Jakob; Bischoff, F. Ralf; Breitling, Frank; Loeffler, Felix F.
2014-01-01
In this review, we describe different methods of microarray fabrication based on the use of micro-particles/-beads and point out future tendencies in the development of particle-based arrays. First, we consider oligonucleotide bead arrays, where each bead is a carrier of one specific sequence of oligonucleotides. This bead-based array approach, appearing in the late 1990s, enabled high-throughput oligonucleotide analysis and had a large impact on genome research. Furthermore, we consider particle-based peptide array fabrication using combinatorial chemistry. In this approach, particles can directly participate in both the synthesis and the transfer of synthesized combinatorial molecules to a substrate. Subsequently, we describe in more detail the synthesis of peptide arrays with amino acid polymer particles, which imbed the amino acids inside their polymer matrix. By heating these particles, the polymer matrix is transformed into a highly viscous gel, and thereby, imbedded monomers are allowed to participate in the coupling reaction. Finally, we focus on combinatorial laser fusing of particles for the synthesis of high-density peptide arrays. This method combines the advantages of particles and combinatorial lithographic approaches. PMID:27600347
Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria
Farasat, Iman; Kushwaha, Manish; Collens, Jason; Easterbrook, Michael; Guido, Matthew; Salis, Howard M
2014-01-01
Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs. PMID:24952589
A ripple-spreading genetic algorithm for the aircraft sequencing problem.
Hu, Xiao-Bing; Di Paolo, Ezequiel A
2011-01-01
When genetic algorithms (GAs) are applied to combinatorial problems, permutation representations are usually adopted. As a result, such GAs are often confronted with feasibility and memory-efficiency problems. With the aircraft sequencing problem (ASP) as a study case, this paper reports on a novel binary-representation-based GA scheme for combinatorial problems. Unlike existing GAs for the ASP, which typically use permutation representations based on aircraft landing order, the new GA introduces a novel ripple-spreading model which transforms the original landing-order-based ASP solutions into value-based ones. In the new scheme, arriving aircraft are projected as points into an artificial space. A deterministic method inspired by the natural phenomenon of ripple-spreading on liquid surfaces is developed, which uses a few parameters as input to connect points on this space to form a landing sequence. A traditional GA, free of feasibility and memory-efficiency problems, can then be used to evolve the ripple-spreading related parameters in order to find an optimal sequence. Since the ripple-spreading model is the centerpiece of the new algorithm, it is called the ripple-spreading GA (RSGA). The advantages of the proposed RSGA are illustrated by extensive comparative studies for the case of the ASP.
Application of combinatorial biocatalysis for a unique ring expansion of dihydroxymethylzearalenone.
Rich, Joseph O; Budde, Cheryl L; McConeghey, Luke D; Cotterill, Ian C; Mozhaev, Vadim V; Singh, Sheo B; Goetz, Michael A; Zhao, Annie; Michels, Peter C; Khmelnitsky, Yuri L
2009-06-01
Combinatorial biocatalysis was applied to generate a diverse set of dihydroxymethylzearalenone analogs with modified ring structure. In one representative chemoenzymatic reaction sequence, dihydroxymethylzearalenone was first subjected to a unique enzyme-catalyzed oxidative ring opening reaction that creates two new carboxylic groups on the molecule. These groups served as reaction sites for further derivatization involving biocatalytic ring closure reactions with structurally diverse bifunctional reagents, including different diols and diamines. As a result, a library of cyclic bislactones and bislactams was created, with modified ring structures covering chemical space and structure activity relationships unattainable by conventional synthetic means.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhong, Wenwan
2003-01-01
Capillary electrophoresis (CE) offers many advantages over conventional analytical methods, such as speed, simplicity, high resolution, low cost, and small sample consumption, especially for the separation of enantiomers. However, chiral method developments still can be time consuming and tedious. They designed a comprehensive enantioseparation protocol employing neutral and sulfated cyclodextrins as chiral selectors for common basic, neutral, and acidic compounds with a 96-capillary array system. By using only four judiciously chosen separation buffers, successful enantioseparations were achieved for 49 out of 54 test compounds spanning a large variety of pKs and structures. Therefore, unknown compounds can be screened in thismore » manner to identify optimal enantioselective conditions in just one rn. In addition to superior separation efficiency for small molecules, CE is also the most powerful technique for DNA separations. Using the same multiplexed capillary system with UV absorption detection, the sequence of a short DNA template can be acquired without any dye-labels. Two internal standards were utilized to adjust the migration time variations among capillaries, so that the four electropherograms for the A, T, C, G Sanger reactions can be aligned and base calling can be completed with a high level of confidence. the CE separation of DNA can be applied to study differential gene expression as well. Combined with pattern recognition techniques, small variations among electropherograms obtained by the separation of cDNA fragments produced from the total RNA samples of different human tissues can be revealed. These variations reflect the differences in total RNA expression among tissues. Thus, this Ce-based approach can serve as an alternative to the DNA array techniques in gene expression analysis.« less
Thangavel, Jayakumar; Malik, Asrar B.; Elias, Harold K.; Rajasingh, Sheeja; Simpson, Andrew D.; Sundivakkam, Premanand K.; Vogel, Stephen M.; Xuan, Yu-Ting; Dawn, Buddhadeb; Rajasingh, Johnson
2015-01-01
Impairment of tissue fluid homeostasis and migration of inflammatory cells across the vascular endothelial barrier are crucial factors in the pathogenesis of acute lung injury (ALI). The goal for treatment of ALI is to target pathways that lead to profound dysregulation of the lung endothelial barrier. Although studies have shown that chemical epigenetic modifiers can limit lung inflammation in experimental ALI models, studies to date have not examined efficacy of a combination of DNA methyl transferase inhibitor 5-Aza 2-deoxycytidine and histone deacetylase inhibitor trichostatin A (herein referred to as Aza+TSA) after endotoxemia-induced mouse lung injury. We tested the hypothesis that treatment with Aza+TSA after lipopolysaccharide induction of ALI through epigenetic modification of lung endothelial cells prevents inflammatory lung injury. Combinatorial treatment with Aza+TSA mitigated the increased endothelial permeability response after lipopolysaccharide challenge. In addition, we observed reduced lung inflammation and lung injury. Aza+TSA also significantly reduced mortality in the ALI model. The protection was ascribed to inhibition of the eNOS-Cav1-MLC2 signaling pathway and enhanced acetylation of histone markers on the vascular endothelial-cadherin promoter. In summary, these data show for the first time the efficacy of combinatorial Aza+TSA therapy in preventing ALI in lipopolysaccharide-induced endotoxemia and raise the possibility of an essential role of DNA methyl transferase and histone deacetylase in the mechanism of ALI. PMID:24929240
Klug, Aaron
2010-02-01
A long-standing goal of molecular biologists has been to construct DNA-binding proteins for the control of gene expression. The classical Cys2His2 (C2H2) zinc finger design is ideally suited for such purposes. Discriminating between closely related DNA sequences both in vitro and in vivo, this naturally occurring design was adopted for engineering zinc finger proteins (ZFPs) to target genes specifically. Zinc fingers were discovered in 1985, arising from the interpretation of our biochemical studies on the interaction of the Xenopus protein transcription factor IIIA (TFIIIA) with 5S RNA. Subsequent structural studies revealed its three-dimensional structure and its interaction with DNA. Each finger constitutes a self-contained domain stabilized by a zinc (Zn) ion ligated to a pair of cysteines and a pair of histidines and also by an inner structural hydrophobic core. This discovery showed not only a new protein fold but also a novel principle of DNA recognition. Whereas other DNA-binding proteins generally make use of the 2-fold symmetry of the double helix, functioning as homo- or heterodimers, zinc fingers can be linked linearly in tandem to recognize nucleic acid sequences of varying lengths. This modular design offers a large number of combinatorial possibilities for the specific recognition of DNA (or RNA). It is therefore not surprising that the zinc finger is found widespread in nature, including 3% of the genes of the human genome. The zinc finger design can be used to construct DNA-binding proteins for specific intervention in gene expression. By fusing selected zinc finger peptides to repression or activation domains, genes can be selectively switched off or on by targeting the peptide to the desired gene target. It was also suggested that by combining an appropriate zinc finger peptide with other effector or functional domains, e.g. from nucleases or integrases to form chimaeric proteins, genomes could be modified or manipulated. The first example of the power of the method was published in 1994 when a three-finger protein was constructed to block the expression of a human oncogene transformed into a mouse cell line. The same paper also described how a reporter gene was activated by targeting an inserted 9-base pair (bp) sequence, which acts as the promoter. Thus, by fusing zinc finger peptides to repression or activation domains, genes can be selectively switched off or on. It was also suggested that, by combining zinc fingers with other effector or functional domains, e.g. from nucleases or integrases, to form chimaeric proteins, genomes could be manipulated or modified. Several applications of such engineered ZFPs are described here, including some of therapeutic importance, and also their adaptation for breeding improved crop plants.
Functional analysis of the bacteriophage T4 DNA-packaging ATPase motor.
Mitchell, Michael S; Rao, Venigalla B
2006-01-06
Packaging of double-stranded DNA into bacteriophage capsids is driven by one of the most powerful force-generating motors reported to date. The phage T4 motor is constituted by gene product 16 (gp16) (18 kDa; small terminase), gp17 (70 kDa; large terminase), and gp20 (61 kDa; dodecameric portal). Extensive sequence alignments revealed that numerous phage and viral large terminases encode a common Walker-B motif in the N-terminal ATPase domain. The gp17 motif consists of a highly conserved aspartate (Asp255) preceded by four hydrophobic residues (251MIYI254), which are predicted to form a beta-strand. Combinatorial mutagenesis demonstrated that mutations that compromised hydrophobicity, or integrity of the beta-strand, resulted in a null phenotype, whereas certain changes in hydrophobicity resulted in cs/ts phenotypes. No substitutions, including a highly conservative glutamate, are tolerated at the conserved aspartate. Biochemical analyses revealed that the Asp255 mutants showed no detectable in vitro DNA packaging activity. The purified D255E, D255N, D255T, D255V, and D255E/E256D mutant proteins exhibited defective ATP binding and very low or no gp16-stimulated ATPase activity. The nuclease activity of gp17 is, however, retained, albeit at a greatly reduced level. These data define the N-terminal ATPase center in terminases and show for the first time that subtle defects in the ATP-Mg complex formation at this center lead to a profound loss of phage DNA packaging.
Structure-based design of combinatorial mutagenesis libraries
Verma, Deeptak; Grigoryan, Gevorg; Bailey-Kellogg, Chris
2015-01-01
The development of protein variants with improved properties (thermostability, binding affinity, catalytic activity, etc.) has greatly benefited from the application of high-throughput screens evaluating large, diverse combinatorial libraries. At the same time, since only a very limited portion of sequence space can be experimentally constructed and tested, an attractive possibility is to use computational protein design to focus libraries on a productive portion of the space. We present a general-purpose method, called “Structure-based Optimization of Combinatorial Mutagenesis” (SOCoM), which can optimize arbitrarily large combinatorial mutagenesis libraries directly based on structural energies of their constituents. SOCoM chooses both positions and substitutions, employing a combinatorial optimization framework based on library-averaged energy potentials in order to avoid explicitly modeling every variant in every possible library. In case study applications to green fluorescent protein, β-lactamase, and lipase A, SOCoM optimizes relatively small, focused libraries whose variants achieve energies comparable to or better than previous library design efforts, as well as larger libraries (previously not designable by structure-based methods) whose variants cover greater diversity while still maintaining substantially better energies than would be achieved by representative random library approaches. By allowing the creation of large-scale combinatorial libraries based on structural calculations, SOCoM promises to increase the scope of applicability of computational protein design and improve the hit rate of discovering beneficial variants. While designs presented here focus on variant stability (predicted by total energy), SOCoM can readily incorporate other structure-based assessments, such as the energy gap between alternative conformational or bound states. PMID:25611189
Structure-based design of combinatorial mutagenesis libraries.
Verma, Deeptak; Grigoryan, Gevorg; Bailey-Kellogg, Chris
2015-05-01
The development of protein variants with improved properties (thermostability, binding affinity, catalytic activity, etc.) has greatly benefited from the application of high-throughput screens evaluating large, diverse combinatorial libraries. At the same time, since only a very limited portion of sequence space can be experimentally constructed and tested, an attractive possibility is to use computational protein design to focus libraries on a productive portion of the space. We present a general-purpose method, called "Structure-based Optimization of Combinatorial Mutagenesis" (SOCoM), which can optimize arbitrarily large combinatorial mutagenesis libraries directly based on structural energies of their constituents. SOCoM chooses both positions and substitutions, employing a combinatorial optimization framework based on library-averaged energy potentials in order to avoid explicitly modeling every variant in every possible library. In case study applications to green fluorescent protein, β-lactamase, and lipase A, SOCoM optimizes relatively small, focused libraries whose variants achieve energies comparable to or better than previous library design efforts, as well as larger libraries (previously not designable by structure-based methods) whose variants cover greater diversity while still maintaining substantially better energies than would be achieved by representative random library approaches. By allowing the creation of large-scale combinatorial libraries based on structural calculations, SOCoM promises to increase the scope of applicability of computational protein design and improve the hit rate of discovering beneficial variants. While designs presented here focus on variant stability (predicted by total energy), SOCoM can readily incorporate other structure-based assessments, such as the energy gap between alternative conformational or bound states. © 2015 The Protein Society.
SCRaMbLE generates designed combinatorial stochastic diversity in synthetic chromosomes
Shen, Yue; Stracquadanio, Giovanni; Wang, Yun; Yang, Kun; Mitchell, Leslie A.; Xue, Yaxin; Cai, Yizhi; Chen, Tai; Dymond, Jessica S.; Kang, Kang; Gong, Jianhui; Zeng, Xiaofan; Zhang, Yongfen; Li, Yingrui; Feng, Qiang; Xu, Xun; Wang, Jun; Wang, Jian; Yang, Huanming; Boeke, Jef D.; Bader, Joel S.
2016-01-01
Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) generates combinatorial genomic diversity through rearrangements at designed recombinase sites. We applied SCRaMbLE to yeast synthetic chromosome arm synIXR (43 recombinase sites) and then used a computational pipeline to infer or unscramble the sequence of recombinations that created the observed genomes. Deep sequencing of 64 synIXR SCRaMbLE strains revealed 156 deletions, 89 inversions, 94 duplications, and 55 additional complex rearrangements; several duplications are consistent with a double rolling circle mechanism. Every SCRaMbLE strain was unique, validating the capability of SCRaMbLE to explore a diverse space of genomes. Rearrangements occurred exclusively at designed loxPsym sites, with no significant evidence for ectopic rearrangements or mutations involving synthetic regions, the 99% nonsynthetic nuclear genome, or the mitochondrial genome. Deletion frequencies identified genes required for viability or fast growth. Replacement of 3′ UTR by non-UTR sequence had surprisingly little effect on fitness. SCRaMbLE generates genome diversity in designated regions, reveals fitness constraints, and should scale to simultaneous evolution of multiple synthetic chromosomes. PMID:26566658
Monotonicity of fitness landscapes and mutation rate control.
Belavkin, Roman V; Channon, Alastair; Aston, Elizabeth; Aston, John; Krašovec, Rok; Knight, Christopher G
2016-12-01
A common view in evolutionary biology is that mutation rates are minimised. However, studies in combinatorial optimisation and search have shown a clear advantage of using variable mutation rates as a control parameter to optimise the performance of evolutionary algorithms. Much biological theory in this area is based on Ronald Fisher's work, who used Euclidean geometry to study the relation between mutation size and expected fitness of the offspring in infinite phenotypic spaces. Here we reconsider this theory based on the alternative geometry of discrete and finite spaces of DNA sequences. First, we consider the geometric case of fitness being isomorphic to distance from an optimum, and show how problems of optimal mutation rate control can be solved exactly or approximately depending on additional constraints of the problem. Then we consider the general case of fitness communicating only partial information about the distance. We define weak monotonicity of fitness landscapes and prove that this property holds in all landscapes that are continuous and open at the optimum. This theoretical result motivates our hypothesis that optimal mutation rate functions in such landscapes will increase when fitness decreases in some neighbourhood of an optimum, resembling the control functions derived in the geometric case. We test this hypothesis experimentally by analysing approximately optimal mutation rate control functions in 115 complete landscapes of binding scores between DNA sequences and transcription factors. Our findings support the hypothesis and find that the increase of mutation rate is more rapid in landscapes that are less monotonic (more rugged). We discuss the relevance of these findings to living organisms.
Qin, Song; Zhang, Hongyu; Li, Fuchao; Zhu, Benwei; Zheng, Huajun
2012-03-01
A series of angucyclinone antibiotics have been isolated from marine Streptomyces sp. strain W007 and identified. Here, a draft genome sequence of Streptomyces sp. W007 is presented. The genome contains an intact biosynthetic gene cluster for angucyclinone antibiotics, which provides insight into the combinatorial biosynthesis of angucyclinone antibiotics produced by marine streptomycetes.
Lin, En-Chiang; Cole, Jesse J; Jacobs, Heiko O
2010-11-10
This article reports and applies a recently discovered programmable multimaterial deposition process to the formation and combinatorial improvement of 3D nanostructured devices. The gas-phase deposition process produces charged <5 nm particles of silver, tungsten, and platinum and uses externally biased electrodes to control the material flux and to turn deposition ON/OFF in selected domains. Domains host nanostructured dielectrics to define arrays of electrodynamic 10 × nanolenses to further control the flux to form <100 nm resolution deposits. The unique feature of the process is that material type, amount, and sequence can be altered from one domain to the next leading to different types of nanostructures including multimaterial bridges, interconnects, or nanowire arrays with 20 nm positional accuracy. These features enable combinatorial nanostructured materials and device discovery. As a first demonstration, we produce and identify in a combinatorial way 3D nanostructured electrode designs that improve light scattering, absorption, and minority carrier extraction of bulk heterojunction photovoltaic cells. Photovoltaic cells from domains with long and dense nanowire arrays improve the relative power conversion efficiency by 47% when compared to flat domains on the same substrate.
Molecular Determinants of Estrogen Receptor Alpha Stability
2008-07-01
presence of E2. This question can be addressed by a T7 phage display screen using a breast cancer cell library and DNA-bound ERα in the presence of...conformation of ERα induced by 27HC versus E2. To accomplish this, we performed combinatorial phage display using a modified M13 phage display screen
Solforosi, Laura; Mancini, Nicasio; Canducci, Filippo; Clementi, Nicola; Sautto, Giuseppe Andrea; Diotti, Roberta Antonia; Clementi, Massimo; Burioni, Roberto
2012-07-01
A novel phagemid vector, named pCM, was optimized for the cloning and display of antibody fragment (Fab) libraries on the surface of filamentous phage. This vector contains two long DNA "stuffer" fragments for easier differentiation of the correctly cut forms of the vector. Moreover, in pCM the fragment at the heavy-chain cloning site contains an acid phosphatase-encoding gene allowing an easy distinction of the Escherichia coli cells containing the unmodified form of the phagemid versus the heavy-chain fragment coding cDNA. In pCM transcription of heavy-chain Fd/gene III and light chain is driven by a single lacZ promoter. The light chain is directed to the periplasm by the ompA signal peptide, whereas the heavy-chain Fd/coat protein III is trafficked by the pelB signal peptide. The phagemid pCM was used to generate a human combinatorial phage display antibody library that allowed the selection of a monoclonal Fab fragment antibody directed against the nucleoprotein (NP) of Influenza A virus.
mtDNA, Metastasis, and the Mitochondrial Unfolded Protein Response (UPRmt).
Kenny, Timothy C; Germain, Doris
2017-01-01
While several studies have confirmed a link between mitochondrial DNA (mtDNA) mutations and cancer cell metastasis, much debate remains regarding the nature of the alternations in mtDNA leading to this effect. Meanwhile, the mitochondrial unfolded protein response (UPR mt ) has gained much attention in recent years, with most studies of this pathway focusing on its role in aging. However, the UPR mt has also been studied in the context of cancer. More recent work suggests that rather than a single mutation or alternation, specific combinatorial mtDNA landscapes able to activate the UPR mt may be those that are selected by metastatic cells, while mtDNA landscapes unable to activate the UPR mt do not. This review aims at offering an overview of the confusing literature on mtDNA mutations and metastasis and the more recent work on the UPR mt in this setting.
Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay
2017-10-17
Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA , replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.
Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.
2017-01-01
Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148
Neri, Dario; Lerner, Richard A
2018-06-20
The discovery of organic ligands that bind specifically to proteins is a central problem in chemistry, biology, and the biomedical sciences. The encoding of individual organic molecules with distinctive DNA tags, serving as amplifiable identification bar codes, allows the construction and screening of combinatorial libraries of unprecedented size, thus facilitating the discovery of ligands to many different protein targets. Fundamentally, one links powers of genetics and chemical synthesis. After the initial description of DNA-encoded chemical libraries in 1992, several experimental embodiments of the technology have been reduced to practice. This review provides a historical account of important milestones in the development of DNA-encoded chemical libraries, a survey of relevant ongoing research activities, and a glimpse into the future.
Lonardi, Stefano; Mirebrahim, Hamid; Wanamaker, Steve; Alpert, Matthew; Ciardo, Gianfranco; Duma, Denisa; Close, Timothy J
2015-09-15
As the invention of DNA sequencing in the 70s, computational biologists have had to deal with the problem of de novo genome assembly with limited (or insufficient) depth of sequencing. In this work, we investigate the opposite problem, that is, the challenge of dealing with excessive depth of sequencing. We explore the effect of ultra-deep sequencing data in two domains: (i) the problem of decoding reads to bacterial artificial chromosome (BAC) clones (in the context of the combinatorial pooling design we have recently proposed), and (ii) the problem of de novo assembly of BAC clones. Using real ultra-deep sequencing data, we show that when the depth of sequencing increases over a certain threshold, sequencing errors make these two problems harder and harder (instead of easier, as one would expect with error-free data), and as a consequence the quality of the solution degrades with more and more data. For the first problem, we propose an effective solution based on 'divide and conquer': we 'slice' a large dataset into smaller samples of optimal size, decode each slice independently, and then merge the results. Experimental results on over 15 000 barley BACs and over 4000 cowpea BACs demonstrate a significant improvement in the quality of the decoding and the final assembly. For the second problem, we show for the first time that modern de novo assemblers cannot take advantage of ultra-deep sequencing data. Python scripts to process slices and resolve decoding conflicts are available from http://goo.gl/YXgdHT; software Hashfilter can be downloaded from http://goo.gl/MIyZHs stelo@cs.ucr.edu or timothy.close@ucr.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Chimeric Rhinoviruses Displaying MPER Epitopes Elicit Anti-HIV Neutralizing Responses
Yi, Guohua; Lapelosa, Mauro; Bradley, Rachel; Mariano, Thomas M.; Dietz, Denise Elsasser; Hughes, Scott; Wrin, Terri; Petropoulos, Chris; Gallicchio, Emilio; Levy, Ronald M.; Arnold, Eddy; Arnold, Gail Ferstandig
2013-01-01
Background The development of an effective AIDS vaccine has been a formidable task, but remains a critical necessity. The well conserved membrane-proximal external region (MPER) of the HIV-1 gp41 glycoprotein is one of the crucial targets for AIDS vaccine development, as it has the necessary attribute of being able to elicit antibodies capable of neutralizing diverse isolates of HIV. Methodology/Principle Findings Guided by X-ray crystallography, molecular modeling, combinatorial chemistry, and powerful selection techniques, we designed and produced six combinatorial libraries of chimeric human rhinoviruses (HRV) displaying the MPER epitopes corresponding to mAbs 2F5, 4E10, and/or Z13e1, connected to an immunogenic surface loop of HRV via linkers of varying lengths and sequences. Not all libraries led to viable chimeric viruses with the desired sequences, but the combinatorial approach allowed us to examine large numbers of MPER-displaying chimeras. Among the chimeras were five that elicited antibodies capable of significantly neutralizing HIV-1 pseudoviruses from at least three subtypes, in one case leading to neutralization of 10 pseudoviruses from all six subtypes tested. Conclusions Optimization of these chimeras or closely related chimeras could conceivably lead to useful components of an effective AIDS vaccine. While the MPER of HIV may not be immunodominant in natural infection by HIV-1, its presence in a vaccine cocktail could provide critical breadth of protection. PMID:24039745
Godet, Angélique N; Guergnon, Julien; Maire, Virginie; Croset, Amélie; Garcia, Alphonse
2010-04-01
Previous studies established that PP1 is a target for Bcl-2 proteins and an important regulator of apoptosis. The two distinct functional PP1 consensus docking motifs, R/Kx((0,1))V/IxF and FxxR/KxR/K, involved in PP1 binding and cell death were previously characterized in the BH1 and BH3 domains of some Bcl-2 proteins. In this study, we demonstrate that DPT-AIF(1), a peptide containing the AIF(562-571) sequence located in a c-terminal domain of AIF, is a new PP1 interacting and cell penetrating molecule. We also showed that DPT-AIF(1) provoked apoptosis in several human cell lines. Furthermore, DPT-APAF(1) a bi-partite cell penetrating peptide containing APAF-1(122-131), a non penetrating sequence from APAF-1 protein, linked to our previously described DPT-sh1 peptide shuttle, is also a PP1-interacting death molecule. Both AIF(562-571) and APAF-1(122-131) sequences contain a common R/Kx((0,1))V/IxFxxR/KxR/K motif, shared by several proteins involved in control of cell survival pathways. This motif combines the two distinct PP1c consensus docking motifs initially identified in some Bcl-2 proteins. Interestingly DPT-AIF(2) and DPT-APAF(2) that carry a F to A mutation within this combinatorial motif, no longer exhibited any PP1c binding or apoptotic effects. Moreover the F to A mutation in DPT-AIF(2) also suppressed cell penetration. These results indicate that the combinatorial PP1c docking motif R/Kx((0,1))V/IxFxxR/KxR/K, deduced from AIF(562-571) and APAF-1(122-131) sequences, is a new PP1c-dependent Apoptotic Signature. This motif is also a new tool for drug design that could be used to characterize potential anti-tumour molecules.
DNA-encoded chemical libraries: advancing beyond conventional small-molecule libraries.
Franzini, Raphael M; Neri, Dario; Scheuermann, Jörg
2014-04-15
DNA-encoded chemical libraries (DECLs) represent a promising tool in drug discovery. DECL technology allows the synthesis and screening of chemical libraries of unprecedented size at moderate costs. In analogy to phage-display technology, where large antibody libraries are displayed on the surface of filamentous phage and are genetically encoded in the phage genome, DECLs feature the display of individual small organic chemical moieties on DNA fragments serving as amplifiable identification barcodes. The DNA-tag facilitates the synthesis and allows the simultaneous screening of very large sets of compounds (up to billions of molecules), because the hit compounds can easily be identified and quantified by PCR-amplification of the DNA-barcode followed by high-throughput DNA sequencing. Several approaches have been used to generate DECLs, differing both in the methods used for library encoding and for the combinatorial assembly of chemical moieties. For example, DECLs can be used for fragment-based drug discovery, displaying a single molecule on DNA or two chemical moieties at the extremities of complementary DNA strands. DECLs can vary substantially in the chemical structures and the library size. While ultralarge libraries containing billions of compounds have been reported containing four or more sets of building blocks, also smaller libraries have been shown to be efficient for ligand discovery. In general, it has been found that the overall library size is a poor predictor for library performance and that the number and diversity of the building blocks are rather important indicators. Smaller libraries consisting of two to three sets of building blocks better fulfill the criteria of drug-likeness and often have higher quality. In this Account, we present advances in the DECL field from proof-of-principle studies to practical applications for drug discovery, both in industry and in academia. DECL technology can yield specific binders to a variety of target proteins and is likely to become a standard tool for pharmaceutical hit discovery, lead expansion, and Chemical Biology research. The introduction of new methodologies for library encoding and for compound synthesis in the presence of DNA is an exciting research field and will crucially contribute to the performance and the propagation of the technology.
SCRaMbLE generates designed combinatorial stochastic diversity in synthetic chromosomes.
Shen, Yue; Stracquadanio, Giovanni; Wang, Yun; Yang, Kun; Mitchell, Leslie A; Xue, Yaxin; Cai, Yizhi; Chen, Tai; Dymond, Jessica S; Kang, Kang; Gong, Jianhui; Zeng, Xiaofan; Zhang, Yongfen; Li, Yingrui; Feng, Qiang; Xu, Xun; Wang, Jun; Wang, Jian; Yang, Huanming; Boeke, Jef D; Bader, Joel S
2016-01-01
Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) generates combinatorial genomic diversity through rearrangements at designed recombinase sites. We applied SCRaMbLE to yeast synthetic chromosome arm synIXR (43 recombinase sites) and then used a computational pipeline to infer or unscramble the sequence of recombinations that created the observed genomes. Deep sequencing of 64 synIXR SCRaMbLE strains revealed 156 deletions, 89 inversions, 94 duplications, and 55 additional complex rearrangements; several duplications are consistent with a double rolling circle mechanism. Every SCRaMbLE strain was unique, validating the capability of SCRaMbLE to explore a diverse space of genomes. Rearrangements occurred exclusively at designed loxPsym sites, with no significant evidence for ectopic rearrangements or mutations involving synthetic regions, the 99% nonsynthetic nuclear genome, or the mitochondrial genome. Deletion frequencies identified genes required for viability or fast growth. Replacement of 3' UTR by non-UTR sequence had surprisingly little effect on fitness. SCRaMbLE generates genome diversity in designated regions, reveals fitness constraints, and should scale to simultaneous evolution of multiple synthetic chromosomes. © 2016 Shen et al.; Published by Cold Spring Harbor Laboratory Press.
Zhang, Hongkai; Torkamani, Ali; Jones, Teresa M; Ruiz, Diana I; Pons, Jaume; Lerner, Richard A
2011-08-16
Use of large combinatorial antibody libraries and next-generation sequencing of nucleic acids are two of the most powerful methods in modern molecular biology. The libraries are screened using the principles of evolutionary selection, albeit in real time, to enrich for members with a particular phenotype. This selective process necessarily results in the loss of information about less-fit molecules. On the other hand, sequencing of the library, by itself, gives information that is mostly unrelated to phenotype. If the two methods could be combined, the full potential of very large molecular libraries could be realized. Here we report the implementation of a phenotype-information-phenotype cycle that integrates information and gene recovery. After selection for phage-encoded antibodies that bind to targets expressed on the surface of Escherichia coli, the information content of the selected pool is obtained by pyrosequencing. Sequences that encode specific antibodies are identified by a bioinformatic analysis and recovered by a stringent affinity method that is uniquely suited for gene isolation from a highly degenerate collection of nucleic acids. This approach can be generalized for selection of antibodies against targets that are present as minor components of complex systems.
Kelm, R J; Cogan, J G; Elder, P K; Strauch, A R; Getz, M J
1999-05-14
Transcriptional activity of the mouse vascular smooth muscle alpha-actin gene in fibroblasts is regulated, in part, by a 30-base pair asymmetric polypurine-polypyrimidine tract containing an essential MCAT enhancer motif. The double-stranded form of this sequence serves as a binding site for a transcription enhancer factor 1-related protein while the separated single strands interact with two distinct DNA binding activities termed VACssBF1 and 2 (Cogan, J. G., Sun, S., Stoflet, E. S., Schmidt, L. J., Getz, M. J., and Strauch, A. R. (1995) J. Biol. Chem. 270, 11310-11321; Sun, S., Stoflet, E. S., Cogan, J. G., Strauch, A. R., and Getz, M. J. (1995) Mol. Cell. Biol. 15, 2429-2936). VACssBF2 has been recently cloned and shown to consist of two closely related proteins, Puralpha and Purbeta (Kelm, R. J., Elder, P. K., Strauch, A. R., and Getz, M. J. (1997) J. Biol. Chem. 272, 26727-26733). In this study, we demonstrate that Puralpha and Purbeta interact with each other via highly specific protein-protein interactions and bind to the purine-rich strand of the MCAT enhancer in the form of both homo- and heteromeric complexes. Moreover, both Pur proteins interact with MSY1, a VACssBF1-like protein cloned by virtue of its affinity for the pyrimidine-rich strand of the enhancer. Interactions between Puralpha, Purbeta, and MSY1 do not require the participation of DNA. Combinatorial interactions between these three single-stranded DNA-binding proteins may be important in regulating activity of the smooth muscle alpha-actin MCAT enhancer in fibroblasts.
Construction and engineering of large biochemical pathways via DNA assembler
Shao, Zengyi; Zhao, Huimin
2015-01-01
Summary DNA assembler enables rapid construction and engineering of biochemical pathways in a one-step fashion by exploitation of the in vivo homologous recombination mechanism in Saccharomyces cerevisiae. It has many applications in pathway engineering, metabolic engineering, combinatorial biology, and synthetic biology. Here we use two examples including the zeaxanthin biosynthetic pathway and the aureothin biosynthetic gene cluster to describe the key steps in the construction of pathways containing multiple genes using the DNA assembler approach. Methods for construct design, pathway assembly, pathway confirmation, and functional analysis are shown. The protocol for fine genetic modifications such as site-directed mutagenesis for engineering the aureothin gene cluster is also illustrated. PMID:23996442
Kyriazi, Maria-Eleni; Giust, Davide; El-Sagheer, Afaf H; Lackie, Peter M; Muskens, Otto L; Brown, Tom; Kanaras, Antonios G
2018-04-24
The design of nanoparticulate systems which can perform multiple synergistic functions in cells with high specificity and selectivity is of great importance in applications. Here we combine recent advances in DNA-gold nanoparticle self-assembly and sensing to develop gold nanoparticle dimers that are able to perform multiplexed synergistic functions within a cellular environment. These dimers can sense two mRNA targets and simultaneously or independently deliver one or two DNA-intercalating anticancer drugs (doxorubicin and mitoxantrone) in live cells. Our study focuses on the design of sophisticated nanoparticle assemblies with multiple and synergistic functions that have the potential to advance sensing and drug delivery in cells.
Generalized theory on the mechanism of site-specific DNA-protein interactions
NASA Astrophysics Data System (ADS)
Niranjani, G.; Murugan, R.
2016-05-01
We develop a generalized theoretical framework on the binding of transcription factor proteins (TFs) with specific sites on DNA that takes into account the interplay of various factors regarding overall electrostatic potential at the DNA-protein interface, occurrence of kinetic traps along the DNA sequence, presence of other roadblock protein molecules along DNA and crowded environment, conformational fluctuations in the DNA binding domains (DBDs) of TFs, and the conformational state of the DNA. Starting from a Smolochowski type theoretical framework on site-specific binding of TFs we logically build our model by adding the effects of these factors one by one. Our generalized two-step model suggests that the electrostatic attractive forces present inbetween the positively charged DBDs of TFs and the negatively charged phosphate backbone of DNA, along with the counteracting shielding effects of solvent ions, is the core factor that creates a fluidic type environment at the DNA-protein interface. This in turn facilitates various one-dimensional diffusion (1Dd) processes such as sliding, hopping and intersegmental transfers. These facilitating processes as well as flipping dynamics of conformational states of DBDs of TFs between stationary and mobile states can enhance the 1Dd coefficient on a par with three-dimensional diffusion (3Dd). The random coil conformation of DNA also plays critical roles in enhancing the site-specific association rate. The extent of enhancement over the 3Dd controlled rate seems to be directly proportional to the maximum possible 1Dd length. We show that the overall site-specific binding rate scales with the length of DNA in an asymptotic way. For relaxed DNA, the specific binding rate will be independent of the length of DNA as length increases towards infinity. For condensed DNA as in in vivo conditions, the specific binding rate depends on the length of DNA in a turnover way with a maximum. This maximum rate seems to scale with the maximum possible 1Dd length of TFs in a square root manner. Results suggest that 1Dd processes contribute much less to the enhancement of specific binding rate under in vivo conditions for condensed DNA. There exists a critical length of binding stretch of TFs beyond which the probability associated with the random occurrence of similar specific binding sites will be close to zero. TFs in natural systems from prokaryotes to eukaryotes seem to handle sequence-mediated kinetic traps via increasing the length of their recognition stretch or combinatorial binding. TFs overcome the hurdles of roadblocks via switching efficiently between sliding, hopping and intersegmental transfer modes. The site-specific binding rate as well as the maximum possible 1Dd length seem to be directly proportional to the square root of the probability (p R) of finding a nonspecific binding site to be free from dynamic roadblocks. Here p R seems to be a function of the number of nsbs available per DNA binding protein (ϕ) inside the living cell. It seems that p R > 0.8 when ϕ > 10 which is true for the Escherichia coli cell system.
Verzotto, Davide; M Teo, Audrey S; Hillmer, Axel M; Nagarajan, Niranjan
2016-01-01
Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.
Trial watch: Naked and vectored DNA-based anticancer vaccines
Bloy, Norma; Buqué, Aitziber; Aranda, Fernando; Castoldi, Francesca; Eggermont, Alexander; Cremer, Isabelle; Sautès-Fridman, Catherine; Fucikova, Jitka; Galon, Jérôme; Spisek, Radek; Tartour, Eric; Zitvogel, Laurence; Kroemer, Guido; Galluzzi, Lorenzo
2015-01-01
One type of anticancer vaccine relies on the administration of DNA constructs encoding one or multiple tumor-associated antigens (TAAs). The ultimate objective of these preparations, which can be naked or vectored by non-pathogenic viruses, bacteria or yeast cells, is to drive the synthesis of TAAs in the context of an immunostimulatory milieu, resulting in the (re-)elicitation of a tumor-targeting immune response. In spite of encouraging preclinical results, the clinical efficacy of DNA-based vaccines employed as standalone immunotherapeutic interventions in cancer patients appears to be limited. Thus, efforts are currently being devoted to the development of combinatorial regimens that allow DNA-based anticancer vaccines to elicit clinically relevant immune responses. Here, we discuss recent advances in the preclinical and clinical development of this therapeutic paradigm. PMID:26155408
Trial watch: Naked and vectored DNA-based anticancer vaccines.
Bloy, Norma; Buqué, Aitziber; Aranda, Fernando; Castoldi, Francesca; Eggermont, Alexander; Cremer, Isabelle; Sautès-Fridman, Catherine; Fucikova, Jitka; Galon, Jérôme; Spisek, Radek; Tartour, Eric; Zitvogel, Laurence; Kroemer, Guido; Galluzzi, Lorenzo
2015-05-01
One type of anticancer vaccine relies on the administration of DNA constructs encoding one or multiple tumor-associated antigens (TAAs). The ultimate objective of these preparations, which can be naked or vectored by non-pathogenic viruses, bacteria or yeast cells, is to drive the synthesis of TAAs in the context of an immunostimulatory milieu, resulting in the (re-)elicitation of a tumor-targeting immune response. In spite of encouraging preclinical results, the clinical efficacy of DNA-based vaccines employed as standalone immunotherapeutic interventions in cancer patients appears to be limited. Thus, efforts are currently being devoted to the development of combinatorial regimens that allow DNA-based anticancer vaccines to elicit clinically relevant immune responses. Here, we discuss recent advances in the preclinical and clinical development of this therapeutic paradigm.
Li, Yang Eric; Xiao, Mu; Shi, Binbin; Yang, Yu-Cheng T; Wang, Dong; Wang, Fei; Marcia, Marco; Lu, Zhi John
2017-09-08
Crosslinking immunoprecipitation sequencing (CLIP-seq) technologies have enabled researchers to characterize transcriptome-wide binding sites of RNA-binding protein (RBP) with high resolution. We apply a soft-clustering method, RBPgroup, to various CLIP-seq datasets to group together RBPs that specifically bind the same RNA sites. Such combinatorial clustering of RBPs helps interpret CLIP-seq data and suggests functional RNA regulatory elements. Furthermore, we validate two RBP-RBP interactions in cell lines. Our approach links proteins and RNA motifs known to possess similar biochemical and cellular properties and can, when used in conjunction with additional experimental data, identify high-confidence RBP groups and their associated RNA regulatory elements.
Scar-less multi-part DNA assembly design automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillson, Nathan J.
The present invention provides a method of a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which tomore » assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.« less
Chen, Xin; Wu, Qiong; Sun, Ruimin; Zhang, Louxin
2012-01-01
The discovery of single-nucleotide polymorphisms (SNPs) has important implications in a variety of genetic studies on human diseases and biological functions. One valuable approach proposed for SNP discovery is based on base-specific cleavage and mass spectrometry. However, it is still very challenging to achieve the full potential of this SNP discovery approach. In this study, we formulate two new combinatorial optimization problems. While both problems are aimed at reconstructing the sample sequence that would attain the minimum number of SNPs, they search over different candidate sequence spaces. The first problem, denoted as SNP - MSP, limits its search to sequences whose in silico predicted mass spectra have all their signals contained in the measured mass spectra. In contrast, the second problem, denoted as SNP - MSQ, limits its search to sequences whose in silico predicted mass spectra instead contain all the signals of the measured mass spectra. We present an exact dynamic programming algorithm for solving the SNP - MSP problem and also show that the SNP - MSQ problem is NP-hard by a reduction from a restricted variation of the 3-partition problem. We believe that an efficient solution to either problem above could offer a seamless integration of information in four complementary base-specific cleavage reactions, thereby improving the capability of the underlying biotechnology for sensitive and accurate SNP discovery.
NASA Astrophysics Data System (ADS)
Doerr, Timothy P.; Alves, Gelio; Yu, Yi-Kuo
2005-08-01
Typical combinatorial optimizations are NP-hard; however, for a particular class of cost functions the corresponding combinatorial optimizations can be solved in polynomial time using the transfer matrix technique or, equivalently, the dynamic programming approach. This suggests a way to efficiently find approximate solutions-find a transformation that makes the cost function as similar as possible to that of the solvable class. After keeping many high-ranking solutions using the approximate cost function, one may then re-assess these solutions with the full cost function to find the best approximate solution. Under this approach, it is important to be able to assess the quality of the solutions obtained, e.g., by finding the true ranking of the kth best approximate solution when all possible solutions are considered exhaustively. To tackle this statistical issue, we provide a systematic method starting with a scaling function generated from the finite number of high-ranking solutions followed by a convergent iterative mapping. This method, useful in a variant of the directed paths in random media problem proposed here, can also provide a statistical significance assessment for one of the most important proteomic tasks-peptide sequencing using tandem mass spectrometry data. For directed paths in random media, the scaling function depends on the particular realization of randomness; in the mass spectrometry case, the scaling function is spectrum-specific.
An Interactive Multiobjective Programming Approach to Combinatorial Data Analysis.
ERIC Educational Resources Information Center
Brusco, Michael J.; Stahl, Stephanie
2001-01-01
Describes an interactive procedure for multiobjective asymmetric unidimensional seriation problems that uses a dynamic-programming algorithm to generate partially the efficient set of sequences for small to medium-sized problems and a multioperational heuristic to estimate the efficient set for larger problems. Applies the procedure to an…
2007-12-12
REPORT DOCUMENTATION PAGE o:~r’Jo , , , , ’!’" ’ "~’’;;;, .-’ ’"",: I ~~--’ h.~ ng t I :;"O(’:,~s ) (From ~ To) . I "NO ’."" "elE I ~~A...1612 temperature (Rn with gentle shaki ng and were then scanned as described below prior to addition ofanalytes. All DNA oligonu- cleotides were added...scans. One parameter which we have recently found to be of great val ue in reduci ng baseline variations in the CARS array (Fig. 6) is purificalion
Ahsendorf, Tobias; Müller, Franz-Josef; Topkar, Ved; Gunawardena, Jeremy; Eils, Roland
2017-01-01
The DNA microstates that regulate transcription include sequence-specific transcription factors (TFs), coregulatory complexes, nucleosomes, histone modifications, DNA methylation, and parts of the three-dimensional architecture of genomes, which could create an enormous combinatorial complexity across the genome. However, many proteins and epigenetic marks are known to colocalize, suggesting that the information content encoded in these marks can be compressed. It has so far proved difficult to understand this compression in a systematic and quantitative manner. Here, we show that simple linear models can reliably predict the data generated by the ENCODE and Roadmap Epigenomics consortia. Further, we demonstrate that a small number of marks can predict all other marks with high average correlation across the genome, systematically revealing the substantial information compression that is present in different cell lines. We find that the linear models for activating marks are typically cell line-independent, while those for silencing marks are predominantly cell line-specific. Of particular note, a nuclear receptor corepressor, transducin beta-like 1 X-linked receptor 1 (TBLR1), was highly predictive of other marks in two hematopoietic cell lines. The methodology presented here shows how the potentially vast complexity of TFs, coregulators, and epigenetic marks at eukaryotic genes is highly redundant and that the information present can be compressed onto a much smaller subset of marks. These findings could be used to efficiently characterize cell lines and tissues based on a small number of diagnostic marks and suggest how the DNA microstates, which regulate the expression of individual genes, can be specified. PMID:29216191
Functional analysis of the plant disease resistance gene Pto using DNA shuffling.
Bernal, Adriana J; Pan, Qilin; Pollack, Jeff; Rose, Laura; Kozik, Alexander; Willits, Neil; Luo, Yao; Guittet, Muriel; Kochetkova, Elena; Michelmore, Richard W
2005-06-17
Pto is a serine/threonine kinase that mediates resistance in tomato to strains of Pseudomonas syringae pv. tomato expressing the (a)virulence proteins AvrPto or AvrPtoB. DNA shuffling was used as a combinatorial in vitro genetic approach to dissect the functional regions of Pto. The Pto gene was shuffled with four of its paralogs from a resistant haplotype to create a library of recombinant products that was screened for interaction with AvrPto in yeast. All interacting clones and a representative sample of noninteracting clones were sequenced, and their ability to signal downstream was tested by the elicitation of a hypersensitive response in an AvrPto-dependent or -independent manner in planta. Eight candidate regions important for binding to AvrPto or for downstream signaling were identified by statistical correlations between individual amino acid positions and phenotype. A subset of the regions had previously been identified as important for recognition, confirming the validity of the shuffling approach. Three novel regions important for Pto function were validated by site-directed mutagenesis. Several chimeras and point mutants exhibited a differential interaction with (a)virulence proteins in the AvrPto and VirPphA family, demonstrating distinct binding requirements for different ligands. Additionally, the identification of chimeras that are both constitutively active as well as capable of binding AvrPto indicates that elicitation of downstream signaling does not involve a conformational change that precludes binding of AvrPto, as previously hypothesized. The correlations between phenotypes and variation generated by DNA shuffling paralleled natural variation observed between orthologs of Pto from Lycopersicon spp.
1994-01-01
Elevation of cAMP can cause gene-specific inhibition of interleukin 2 (IL-2) expression. To investigate the mechanism of this effect, we have combined electrophoretic mobility shift assays and in vivo genomic footprinting to assess both the availability of putative IL-2 transcription factors in forskolin-treated cells and the functional capacity of these factors to engage their sites in vivo. All observed effects of forskolin depended upon protein kinase A, for they were blocked by introduction of a dominant negative mutant subunit of protein kinase A. In the EL4.E1 cell line, we report specific inhibitory effects of cAMP elevation both on NF-kappa B/Rel family factors binding at -200 bp, and on a novel, biochemically distinct "TGGGC" factor binding at -225 bp with respect to the IL-2 transcriptional start site. Neither NF-AT nor AP-1 binding activities are detectably inhibited in gel mobility shift assays. Elevation of cAMP inhibits NF-kappa B activity with delayed kinetics in association with a delayed inhibition of IL-2 RNA accumulation. Activation of cells in the presence of forskolin prevents the maintenance of stable protein- DNA interactions in vivo, not only at the NF-kappa B and TGGGC sites of the IL-2 enhancer, but also at the NF-AT, AP-1, and other sites. This result, and similar results in cyclosporin A-treated cells, imply that individual IL-2 transcription factors cannot stably bind their target sequences in vivo without coengagement of all other distinct factors at neighboring sites. It is proposed that nonhierarchical, cooperative enhancement of binding is a structural basis of combinatorial transcription factor action at the IL-2 locus. PMID:8113685
Giffard, Philip M; Andersson, Patiyan; Wilson, Judith; Buckley, Cameron; Lilliebridge, Rachael; Harris, Tegan M; Kleinecke, Mariana; O'Grady, Kerry-Ann F; Huston, Wilhelmina M; Lambert, Stephen B; Whiley, David M; Holt, Deborah C
2018-01-01
Chlamydia trachomatis infects the urogenital tract (UGT) and eyes. Anatomical tropism is correlated with variation in the major outer membrane protein encoded by ompA. Strains possessing the ocular ompA variants A, B, Ba and C are typically found within the phylogenetically coherent "classical ocular lineage". However, variants B, Ba and C have also been found within three distinct strains in Australia, all associated with ocular disease in children and outside the classical ocular lineage. CtGEM genotyping is a method for detecting and discriminating ocular strains and also the major phylogenetic lineages. The rationale was facilitation of surveillance to inform responses to C. trachomatis detection in UGT specimens from young children. CtGEM typing is based on high resolution melting analysis (HRMA) of two PCR amplified fragments with high combinatorial resolving power, as defined by computerised comparison of 65 whole genomes. One fragment is from the hypothetical gene defined by Jali-1891 in the C. trachomatis B_Jali20 genome, while the other is from ompA. Twenty combinatorial CtGEM types have been shown to exist, and these encompass unique genotypes for all known ocular strains, and also delineate the TI and T2 major phylogenetic lineages, identify LGV strains and provide additional resolution beyond this. CtGEM typing and Sanger sequencing were compared with 42 C. trachomatis positive clinical specimens, and there were no disjunctions. CtGEM typing is a highly efficient method designed and tested using large scale comparative genomics. It divides C. trachomatis into clinically and biologically meaningful groups, and may have broad application in surveillance.
Sabri, Suriana; Steen, Jennifer A; Bongers, Mareike; Nielsen, Lars K; Vickers, Claudia E
2013-06-24
Metabolic engineering projects often require integration of multiple genes in order to control the desired phenotype. However, this often requires iterative rounds of engineering because many current insertion approaches are limited by the size of the DNA that can be transferred onto the chromosome. Consequently, construction of highly engineered strains is very time-consuming. A lack of well-characterised insertion loci is also problematic. A series of knock-in/knock-out (KIKO) vectors was constructed for integration of large DNA sequences onto the E. coli chromosome at well-defined loci. The KIKO plasmids target three nonessential genes/operons as insertion sites: arsB (an arsenite transporter); lacZ (β-galactosidase); and rbsA-rbsR (a ribose metabolism operon). Two homologous 'arms' target each insertion locus; insertion is mediated by λ Red recombinase through these arms. Between the arms is a multiple cloning site for the introduction of exogenous sequences and an antibiotic resistance marker (either chloramphenicol or kanamycin) for selection of positive recombinants. The resistance marker can subsequently be removed by flippase-mediated recombination. The insertion cassette is flanked by hairpin loops to isolate it from the effects of external transcription at the integration locus. To characterize each target locus, a xylanase reporter gene (xynA) was integrated onto the chromosomes of E. coli strains W and K-12 using the KIKO vectors. Expression levels varied between loci, with the arsB locus consistently showing the highest level of expression. To demonstrate the simultaneous use of all three loci in one strain, xynA, green fluorescent protein (gfp) and a sucrose catabolic operon (cscAKB) were introduced into lacZ, arsB and rbsAR respectively, and shown to be functional. The KIKO plasmids are a useful tool for efficient integration of large DNA fragments (including multiple genes and pathways) into E. coli. Chromosomal insertion provides stable expression without the need for continuous antibiotic selection. Three non-essential loci have been characterised as insertion loci; combinatorial insertion at all three loci can be performed in one strain. The largest insertion at a single site described here was 5.4 kb; we have used this method in other studies to insert a total of 7.3 kb at one locus and 11.3 kb across two loci. These vectors are particularly useful for integration of multigene cassettes for metabolic engineering applications.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
Hydroxyapatite-binding peptides for bone growth and inhibition
Bertozzi, Carolyn R [Berkeley, CA; Song, Jie [Shrewsbury, MA; Lee, Seung-Wuk [Walnut Creek, CA
2011-09-20
Hydroxyapatite (HA)-binding peptides are selected using combinatorial phage library display. Pseudo-repetitive consensus amino acid sequences possessing periodic hydroxyl side chains in every two or three amino acid sequences are obtained. These sequences resemble the (Gly-Pro-Hyp).sub.x repeat of human type I collagen, a major component of extracellular matrices of natural bone. A consistent presence of basic amino acid residues is also observed. The peptides are synthesized by the solid-phase synthetic method and then used for template-driven HA-mineralization. Microscopy reveal that the peptides template the growth of polycrystalline HA crystals .about.40 nm in size.
Epigenetics of Peripheral B-Cell Differentiation and the Antibody Response
Zan, Hong; Casali, Paolo
2015-01-01
Epigenetic modifications, such as histone post-translational modifications, DNA methylation, and alteration of gene expression by non-coding RNAs, including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), are heritable changes that are independent from the genomic DNA sequence. These regulate gene activities and, therefore, cellular functions. Epigenetic modifications act in concert with transcription factors and play critical roles in B cell development and differentiation, thereby modulating antibody responses to foreign- and self-antigens. Upon antigen encounter by mature B cells in the periphery, alterations of these lymphocytes epigenetic landscape are induced by the same stimuli that drive the antibody response. Such alterations instruct B cells to undergo immunoglobulin (Ig) class switch DNA recombination (CSR) and somatic hypermutation (SHM), as well as differentiation to memory B cells or long-lived plasma cells for the immune memory. Inducible histone modifications, together with DNA methylation and miRNAs modulate the transcriptome, particularly the expression of activation-induced cytidine deaminase, which is essential for CSR and SHM, and factors central to plasma cell differentiation, such as B lymphocyte-induced maturation protein-1. These inducible B cell-intrinsic epigenetic marks guide the maturation of antibody responses. Combinatorial histone modifications also function as histone codes to target CSR and, possibly, SHM machinery to the Ig loci by recruiting specific adaptors that can stabilize CSR/SHM factors. In addition, lncRNAs, such as recently reported lncRNA-CSR and an lncRNA generated through transcription of the S region that form G-quadruplex structures, are also important for CSR targeting. Epigenetic dysregulation in B cells, including the aberrant expression of non-coding RNAs and alterations of histone modifications and DNA methylation, can result in aberrant antibody responses to foreign antigens, such as those on microbial pathogens, and generation of pathogenic autoantibodies, IgE in allergic reactions, as well as B cell neoplasia. Epigenetic marks would be attractive targets for new therapeutics for autoimmune and allergic diseases, and B cell malignancies. PMID:26697022
Software-supported USER cloning strategies for site-directed mutagenesis and DNA assembly.
Genee, Hans Jasper; Bonde, Mads Tvillinggaard; Bagger, Frederik Otzen; Jespersen, Jakob Berg; Sommer, Morten O A; Wernersson, Rasmus; Olsen, Lars Rønn
2015-03-20
USER cloning is a fast and versatile method for engineering of plasmid DNA. We have developed a user friendly Web server tool that automates the design of optimal PCR primers for several distinct USER cloning-based applications. Our Web server, named AMUSER (Automated DNA Modifications with USER cloning), facilitates DNA assembly and introduction of virtually any type of site-directed mutagenesis by designing optimal PCR primers for the desired genetic changes. To demonstrate the utility, we designed primers for a simultaneous two-position site-directed mutagenesis of green fluorescent protein (GFP) to yellow fluorescent protein (YFP), which in a single step reaction resulted in a 94% cloning efficiency. AMUSER also supports degenerate nucleotide primers, single insert combinatorial assembly, and flexible parameters for PCR amplification. AMUSER is freely available online at http://www.cbs.dtu.dk/services/AMUSER/.
Discovery and application of peptides that bind to proteins and solid state inorganic materials
NASA Astrophysics Data System (ADS)
Stearns, Linda A.
A series of three projects was undertaken on the theme of peptide-based molecular recognition. In the first project, a messenger RNA (mRNA) display selection was carried out against the II-VI semiconductors zinc sulfide (ZnS), zinc selenide (ZnSe), and cadmium sulfide (CdS). Sequence analysis of 18-mer semiconductor-binding peptides (SBPs) following four rounds of selection indicated that the amino acid sequences were enriched in polar residues compared to the naive library, suggesting that hydrogen-bonding interactions are a dominant mode of interaction between the SBPs and their cognate inorganic surfaces. Select peptides were expressed as fusions of the green fluorescent protein (GFP) to visualize their recognition of semiconductor crystals. Interpretation of the results was complicated by a high fluorescence background that was observed with certain control GFP fusions. Additional experiments, including cross-specificity binding assays, are needed to characterize the peptides that were isolated in this selection. A second project described the practical application of a known inorganic-binding and nucleating peptide. Peptide A3, which was previously isolated by phage display, was chemically conjugated to a short DNA strand using the heterobifunctional linker succinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate (SMCC). The resulting peptide-DNA conjugate was hybridized to ten complementary single-stranded capture probes extending outward from the surface of an origami DNA nanotube. A gold precursor solution was added to initiate nucleation and growth of gold nanoparticles at the site of the peptide. Transmission electron microscopy (TEM) was used to visualize the gold nanoparticle-decorated nanostructures. This approach holds immense promise for organizing compositionally-diverse materials at the nanoscale. In a third project, a novel non-iterative approach to mRNA display called covalent capture was demonstrated. Using human transferrin as a target protein, peptides with low-nanomolar affinity were isolated from a combinatorial library of one trillion distinct 12-mer peptide sequences by using UV light to covalently crosslink the peptides to a photoreactive arm that was displayed on the protein surface. The best peptide isolated from this screen exhibited a binding affinity constant (Kd) of 3 nM, which is equivalent to some of the best peptides isolated after many rounds of traditional bead-based selection. The approach itself is general and could be applied to many different types of problems in molecular biology.
2013-06-01
number of ways to generate either random mutations or specific alterations to the genome sequence . Unlike previous approaches however, both TALENs and...made to the donor construct will be incorporated into the endogenous genomic sequence (examples in Liu et al., 2012; Zu et al., 2013). One challenge... Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics. 2013. Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, et al. Efficient genome
Quantum Dilogarithms and Partition q-Series
NASA Astrophysics Data System (ADS)
Kato, Akishi; Terashima, Yuji
2015-08-01
In our previous work (Kato and Terashima, Commun Math Phys. arXiv:1403.6569, 2014), we introduced the partition q-series for mutation loop γ—a loop in exchange quiver. In this paper, we show that for a certain class of mutation sequences, called reddening sequences, the graded version of partition q-series essentially coincides with the ordered product of quantum dilogarithm associated with each mutation; the partition q-series provides a state-sum description of combinatorial Donaldson-Thomas invariants introduced by Keller.
Kajiwara, Shota; Yamada, Ryosuke; Ogino, Hiroyasu
2018-04-10
Simple and cost-effective lipase expression host microorganisms are highly desirable. A combinatorial library strategy is used to improve the secretory expression of lipase from Bacillus thermocatenulatus (BTL2) in the culture supernatant of Saccharomyces cerevisiae. A plasmid library including expression cassettes composed of sequences encoding one of each 15 promoters, 15 secretion signals, and 15 terminators derived from yeast species, S. cerevisiae, Pichia pastoris, and Hansenula polymorpha, is constructed. The S. cerevisiae transformant YPH499/D4, comprising H. polymorpha GAP promoter, S. cerevisiae SAG1 secretion signal, and P. pastoris AOX1 terminator, is selected by high-throughput screening. This transformant expresses BTL2 extra-cellularly with a 130-fold higher than the control strain, comprising S. cerevisiae PGK1 promoter, S. cerevisiae α-factor secretion signal, and S. cerevisiae PGK1 terminator, after cultivation for 72 h. This combinatorial library strategy holds promising potential for application in the optimization of the secretory expression of proteins in yeast. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Neural correlates of implicit and explicit combinatorial semantic processing
Graves, William W.; Binder, Jeffrey R.; Desai, Rutvik H.; Conant, Lisa L.; Seidenberg, Mark S.
2010-01-01
Language consists of sequences of words, but comprehending phrases involves more than concatenating meanings: A boat house is a shelter for boats, whereas a summer house is a house used during summer, and a ghost house is typically uninhabited. Little is known about the brain bases of combinatorial semantic processes. We performed two fMRI experiments using familiar, highly meaningful phrases (LAKE HOUSE) and unfamiliar phrases with minimal meaning created by reversing the word order of the familiar items (HOUSE LAKE). The first experiment used a 1-back matching task to assess implicit semantic processing, and the second used a classification task to engage explicit semantic processing. These conditions required processing of the same words, but with more effective combinatorial processing in the meaningful condition. The contrast of meaningful versus reversed phrases revealed activation primarily during the classification task, to a greater extent in the right hemisphere, including right angular gyrus, dorsomedial prefrontal cortex, and bilateral posterior cingulate/precuneus, areas previously implicated in semantic processing. Positive correlations of fMRI signal with lexical (word-level) frequency occurred exclusively with the 1-back task and to a greater spatial extent on the left, including left posterior middle temporal gyrus and bilateral parahippocampus. These results reveal strong effects of task demands on engagement of lexical versus combinatorial processing and suggest a hemispheric dissociation between these levels of semantic representation. PMID:20600969
Yan, Fang; Liu, Haihong; Liu, Zengrong
2014-01-01
P53 and E2F1 are critical transcription factors involved in the choices between different cell fates including cell differentiation, cell cycle arrest or apoptosis. Recent experiments have shown that two families of microRNAs (miRNAs), p53-responsive miR34 (miRNA-34 a, b and c) and E2F1-inducible miR449 (miRNA-449 a, b and c) are potent inducers of these different fates and might have an important role in sensitizing cancer cells to drug treatment and tumor suppression. Identifying the mechanisms responsible for the combinatorial regulatory roles of these two transcription factors and two miRNAs is an important and challenging problem. Here, based in part on the model proposed in Tongli Zhang et al. (2007), we developed a mathematical model of the decision process and explored the combinatorial regulation between these two transcription factors and two miRNAs in response to DNA damage. By analyzing nonlinear dynamic behaviors of the model, we found that p53 exhibits pulsatile behavior. Moreover, a comparison is given to reveal the subtle differences of the cell fate decision process between regulation and deregulation of miR34 on E2F1. It predicts that miR34 plays a critical role in promoting cell cycle arrest. In addition, a computer simulation result also predicts that the miR449 is necessary for apoptosis in response to sustained DNA damage. In agreement with experimental observations, our model can account for the intricate regulatory relationship between these two transcription factors and two miRNAs in the cell fate decision process after DNA damage. These theoretical results indicate that miR34 and miR449 are effective tumor suppressors and play critical roles in cell fate decisions. The work provides a dynamic mechanism that shows how cell fate decisions are coordinated by two transcription factors and two miRNAs. This article is part of a Special Issue entitled: Computational Proteomics, Systems Biology and Clinical Implications. Guest Editor: Yudong Cai. Crown Copyright © 2013. All rights reserved.
Kohmoto, Tomohiro; Masuda, Kiyoshi; Naruto, Takuya; Tange, Shoichiro; Shoda, Katsutoshi; Hamada, Junichi; Saito, Masako; Ichikawa, Daisuke; Tajima, Atsushi; Otsuji, Eigo; Imoto, Issei
2017-01-01
High-throughput next-generation sequencing is a powerful tool to identify the genotypic landscapes of somatic variants and therapeutic targets in various cancers including gastric cancer, forming the basis for personalized medicine in the clinical setting. Although the advent of many computational algorithms leads to higher accuracy in somatic variant calling, no standard method exists due to the limitations of each method. Here, we constructed a new pipeline. We combined two different somatic variant callers with different algorithms, Strelka and VarScan 2, and evaluated performance using whole exome sequencing data obtained from 19 Japanese cases with gastric cancer (GC); then, we characterized these tumors based on identified driver molecular alterations. More single nucleotide variants (SNVs) and small insertions/deletions were detected by Strelka and VarScan 2, respectively. SNVs detected by both tools showed higher accuracy for estimating somatic variants compared with those detected by only one of the two tools and accurately showed the mutation signature and mutations of driver genes reported for GC. Our combinatorial pipeline may have an advantage in detection of somatic mutations in GC and may be useful for further genomic characterization of Japanese patients with GC to improve the efficacy of GC treatments. J. Med. Invest. 64: 233-240, August, 2017.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mullinax, R.L.; Gross, E.A.; Amberg, J.R.
1990-10-01
The authors have applied a molecular biology approach to the identification of human monoclonal antibodies. Human peripheral blood lymphocyte mRNA was converted to cDNA and a select subset was amplified by the polymerase chain reaction. These products, containing coding sequences for numerous immunoglobulin heavy- and {kappa} light-chain variable and constant region domains, were inserted into modified bacteriophase {lambda} expression vectors and introduced into Escherichia coli by infection to yield a combinatorial immunoexpression library. Clones with binding activity to tetanus toxoid were identified by filter hybridization with radiolabeled antigen and appeared at a frequency of 0.2{percent} in the library. These humanmore » antigen binding fragments, consisting of a heavy-chain fragment covalently linked to a light chain, displayed high affinity of binding to tetanus toxoid with equilibrium constants in the nanomolar range but did not cross-react with other proteins tested. They estimate that this human immunoexpression library contains 20,000 clones with high affinity and specificity to our chosen antigen.« less
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Mariella, Jr., Raymond P.
2008-11-18
A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.
Gupta, P D
2016-10-01
In health care, importance of DNA sequencing has been fully established. Sanger's Capillary Electrophoresis DNA sequencing methodology is time consuming, cumbersome, hence become more expensive. Lately, because of its versatility DNA sequencing became house hold name, and therefore, there is an urgent need of simple, fast, inexpensive, DNA sequencing technology. In the beginning of this century efforts were made, and Nanopore DNA sequencing technology was developed; still it is infancy, nevertheless, it is the futuristic technology.
Insights into Hox protein function from a large scale combinatorial analysis of protein domains.
Merabet, Samir; Litim-Mecheri, Isma; Karlsson, Daniel; Dixit, Richa; Saadaoui, Mehdi; Monier, Bruno; Brun, Christine; Thor, Stefan; Vijayraghavan, K; Perrin, Laurent; Pradel, Jacques; Graba, Yacine
2011-10-01
Protein function is encoded within protein sequence and protein domains. However, how protein domains cooperate within a protein to modulate overall activity and how this impacts functional diversification at the molecular and organism levels remains largely unaddressed. Focusing on three domains of the central class Drosophila Hox transcription factor AbdominalA (AbdA), we used combinatorial domain mutations and most known AbdA developmental functions as biological readouts to investigate how protein domains collectively shape protein activity. The results uncover redundancy, interactivity, and multifunctionality of protein domains as salient features underlying overall AbdA protein activity, providing means to apprehend functional diversity and accounting for the robustness of Hox-controlled developmental programs. Importantly, the results highlight context-dependency in protein domain usage and interaction, allowing major modifications in domains to be tolerated without general functional loss. The non-pleoitropic effect of domain mutation suggests that protein modification may contribute more broadly to molecular changes underlying morphological diversification during evolution, so far thought to rely largely on modification in gene cis-regulatory sequences.
Insights into Hox Protein Function from a Large Scale Combinatorial Analysis of Protein Domains
Karlsson, Daniel; Dixit, Richa; Saadaoui, Mehdi; Monier, Bruno; Brun, Christine; Thor, Stefan; Vijayraghavan, K.; Perrin, Laurent; Pradel, Jacques; Graba, Yacine
2011-01-01
Protein function is encoded within protein sequence and protein domains. However, how protein domains cooperate within a protein to modulate overall activity and how this impacts functional diversification at the molecular and organism levels remains largely unaddressed. Focusing on three domains of the central class Drosophila Hox transcription factor AbdominalA (AbdA), we used combinatorial domain mutations and most known AbdA developmental functions as biological readouts to investigate how protein domains collectively shape protein activity. The results uncover redundancy, interactivity, and multifunctionality of protein domains as salient features underlying overall AbdA protein activity, providing means to apprehend functional diversity and accounting for the robustness of Hox-controlled developmental programs. Importantly, the results highlight context-dependency in protein domain usage and interaction, allowing major modifications in domains to be tolerated without general functional loss. The non-pleoitropic effect of domain mutation suggests that protein modification may contribute more broadly to molecular changes underlying morphological diversification during evolution, so far thought to rely largely on modification in gene cis-regulatory sequences. PMID:22046139
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
Anti-digoxin Fab variants generated by phage display.
Murata, Viviane Midori; Schmidt, Mariana Costa Braga; Kalil, Jorge; Tsuruta, Lilian Rumi; Moro, Ana Maria
2013-06-01
Digoxin is a pharmaceutical used in the control of cardiac dysfunction. Its therapeutic window is narrow, with effect dosage very close to the toxic dosage. To counteract the toxic effect, polyclonal Fab fragments are commercially available. Our study is based on a monoclonal anti-digoxin antibody, which would provide a product with a specific potency and more precise dosage for the detoxification of patients under digoxin treatment. Phage display technology was used to select variants with high affinity. From an anti-digoxin hybridoma, RNA was extracted for subsequent cDNA synthesis. Specific primers were used for the LC and Fd amplifications, then cloned sequentially in a phagemid vector (pComb3X) for the combinatorial Fab library construction. Clones were selected for their ability to bind to digoxin-BSA. The presence of light and heavy chains was checked, randomly selected clones then sequenced and induced to produce soluble Fabs, and subsequently analyzed for anti-digoxin expression. Out of ten clones randomly chosen, six resulted positive expression of the product. The sequencing of these revealed two identical clones and one presenting a pseudogene in the LC. Four clones presenting variations in the framework1 showed binding to digoxin-BSA by ELISA and western blotting. The specific binding was further confirmed by Biacore(®), which allowed ranking of the clones. The development of these clones allowed the selection of variants with higher affinity than the original version.
Chandramohan, Bathrachalam; Renieri, Carlo; La Manna, Vincenzo; La Terza, Antonietta
2015-01-01
The objectives of the present study were to characterize the MC1R gene, its transcripts and the single nucleotide polymorphisms (SNPs) associated with coat color in alpaca. Full length cDNA amplification revealed the presence of two transcripts, named as F1 and F2, differing only in the length of their 5'-terminal untranslated region (UTR) sequences and presenting a color specific expression. Whereas the F1 transcript was common to white and colored (black and brown) alpaca phenotypes, the shorter F2 transcript was specific to white alpaca. Further sequencing of the MC1R gene in white and colored alpaca identified a total of twelve SNPs; among those nine (four silent mutations (c.126C>A, c.354T>C, c.618G>A, and c.933G>A); five missense mutations (c.82A>G, c.92C>T, c.259A>G, c.376A>G, and c.901C>T)) were observed in coding region and three in the 3'UTR. A 4 bp deletion (c.224 227del) was also identified in the coding region. Molecular segregation analysis uncovered that the combinatory mutations in the MC1R locus could cause eumelanin and pheomelanin synthesis in alpaca. Overall, our data refine what is known about the MC1R gene and provides additional information on its role in alpaca pigmentation.
Titeux, Matthias; Izmiryan, Araksya; Hovnanian, Alain
2017-05-01
Stunning technological advances in genomics have led to spectacular breakthroughs in the understanding of the underlying defects, biological pathways and therapeutic targets of skin diseases leading to new therapeutic interventions. Next-generation sequencing has revolutionized the identification of disease-causing genes and has a profound impact in deciphering gene and protein signatures in rare and frequent skin diseases. Gene addition strategies have shown efficacy in junctional EB and in recessive dystrophic EB (RDEB). TALENs and Cripsr/Cas9 have emerged as highly efficient new tools to edit genomic sequences to creat new models and to correct or disrupt mutated genes to treat human diseases. Therapeutic approaches have not been limited to DNA modification and strategies at the mRNA, protein and cellular levels have also emerged, some of which have already proven clinical efficacy in RDEB. Improved understanding of the pathogenesis of skin disorders has led to the development of specific drugs or repurposing of existing medicines as in basal cell nevus syndrome, alopecia areata, melanoma and EB simplex. These discoveries pave the way for improved targeted personalized medicine for rare and frequent diseases. It is likely that a growing number of orphan skin diseases will benefit from combinatory new therapies in a near future. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Sequence and Structure Dependent DNA-DNA Interactions
NASA Astrophysics Data System (ADS)
Kopchick, Benjamin; Qiu, Xiangyun
Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and ``random'' DNA sequences are typically used in biophysical studies. However, ~50% of human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate the causalities in terms of the differences in hydration shell and DNA surface structures.
DeFranco, D; Yamamoto, K R
1986-01-01
The expression of genes fused downstream of the Moloney murine sarcoma virus (MoMSV) long terminal repeat is stimulated by glucocorticoids. We mapped the glucocorticoid response element that conferred this hormonal regulation and found that it is a hormone-dependent transcriptional enhancer, designated Sg; it resides within DNA fragments that also carry a previously described enhancer element (B. Levinson, G. Khoury, G. Vande Woude, and P. Gruss, Nature [London] 295:568-572, 1982), here termed Sa, whose activity is independent of the hormone. Nuclease footprinting revealed that purified glucocorticoid receptor bound at multiple discrete sites within and at the borders of the tandemly repeated sequence motif that defines Sa. The Sa and Sg activities stimulated the apparent efficiency of cognate or heterologous promoter utilization, individually providing modest enhancement and in concert yielding higher levels of activity. A deletion mutant lacking most of the tandem repeat but retaining a single receptor footprint sequence lost Sa activity but still conferred Sg activity. The two enhancer components could also be distinguished physiologically: both were operative within cultured rat fibroblasts, but only Sg activity was detectable in rat exocrine pancreas cells. Therefore, the sequence determinants of Sa and Sg activity may be interdigitated, and when both components are active, the receptor and a putative Sa factor can apparently bind and act simultaneously. We concluded that MoMSV enhancer activity is effected by at least two distinct binding factors, suggesting that combinatorial regulation of promoter function can be mediated even from a single genetic element. Images PMID:3023887
APADB: a database for alternative polyadenylation and microRNA regulation events
Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn
2014-01-01
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703
Design, synthesis and selection of DNA-encoded small-molecule libraries.
Clark, Matthew A; Acharya, Raksha A; Arico-Muendel, Christopher C; Belyanskaya, Svetlana L; Benjamin, Dennis R; Carlson, Neil R; Centrella, Paolo A; Chiu, Cynthia H; Creaser, Steffen P; Cuozzo, John W; Davie, Christopher P; Ding, Yun; Franklin, G Joseph; Franzen, Kurt D; Gefter, Malcolm L; Hale, Steven P; Hansen, Nils J V; Israel, David I; Jiang, Jinwei; Kavarana, Malcolm J; Kelley, Michael S; Kollmann, Christopher S; Li, Fan; Lind, Kenneth; Mataruse, Sibongile; Medeiros, Patricia F; Messer, Jeffrey A; Myers, Paul; O'Keefe, Heather; Oliff, Matthew C; Rise, Cecil E; Satz, Alexander L; Skinner, Steven R; Svendsen, Jennifer L; Tang, Lujia; van Vloten, Kurt; Wagner, Richard W; Yao, Gang; Zhao, Baoguang; Morgan, Barry A
2009-09-01
Biochemical combinatorial techniques such as phage display, RNA display and oligonucleotide aptamers have proven to be reliable methods for generation of ligands to protein targets. Adapting these techniques to small synthetic molecules has been a long-sought goal. We report the synthesis and interrogation of an 800-million-member DNA-encoded library in which small molecules are covalently attached to an encoding oligonucleotide. The library was assembled by a combination of chemical and enzymatic synthesis, and interrogated by affinity selection. We describe methods for the selection and deconvolution of the chemical display library, and the discovery of inhibitors for two enzymes: Aurora A kinase and p38 MAP kinase.
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences
Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.
2017-01-01
An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
ChIP-less analysis of chromatin states.
Su, Zhangli; Boersma, Melissa D; Lee, Jin-Hee; Oliver, Samuel S; Liu, Shichong; Garcia, Benjamin A; Denu, John M
2014-01-01
Histone post-translational modifications (PTMs) are key epigenetic regulators in chromatin-based processes. Increasing evidence suggests that vast combinations of PTMs exist within chromatin histones. These complex patterns, rather than individual PTMs, are thought to define functional chromatin states. However, the ability to interrogate combinatorial histone PTM patterns at the nucleosome level has been limited by the lack of direct molecular tools. Here we demonstrate an efficient, quantitative, antibody-free, chromatin immunoprecipitation-less (ChIP-less) method for interrogating diverse epigenetic states. At the heart of the workflow are recombinant chromatin reader domains, which target distinct chromatin states with combinatorial PTM patterns. Utilizing a newly designed combinatorial histone peptide microarray, we showed that three reader domains (ATRX-ADD, ING2-PHD and AIRE-PHD) displayed greater specificity towards combinatorial PTM patterns than corresponding commercial histone antibodies. Such specific recognitions were employed to develop a chromatin reader-based affinity enrichment platform (matrix-assisted reader chromatin capture, or MARCC). We successfully applied the reader-based platform to capture unique chromatin states, which were quantitatively profiled by mass spectrometry to reveal interconnections between nucleosomal histone PTMs. Specifically, a highly enriched signature that harbored H3K4me0, H3K9me2/3, H3K79me0 and H4K20me2/3 within the same nucleosome was identified from chromatin enriched by ATRX-ADD. This newly reported PTM combination was enriched in heterochromatin, as revealed by the associated DNA. Our results suggest the broad utility of recombinant reader domains as an enrichment tool specific to combinatorial PTM patterns, which are difficult to probe directly by antibody-based approaches. The reader affinity platform is compatible with several downstream analyses to investigate the physical coexistence of nucleosomal PTM states associated with specific genomic loci. Collectively, the reader-based workflow will greatly facilitate our understanding of how distinct chromatin states and reader domains function in gene regulatory mechanisms.
A Combinatorial Platform for the Optimization of Peptidomimetic Methyl-Lysine Reader Antagonists
NASA Astrophysics Data System (ADS)
Barnash, Kimberly D.
Post-translational modification of histone N-terminal tails mediates chromatin compaction and, consequently, DNA replication, transcription, and repair. While numerous post-translational modifications decorate histone tails, lysine methylation is an abundant mark important for both gene activation and repression. Methyl-lysine (Kme) readers function through binding mono-, di-, or trimethyl-lysine. Chemical intervention of Kme readers faces numerous challenges due to the broad surface-groove interactions between readers and their cognate histone peptides; yet, the increasing interest in understanding chromatin-modifying complexes suggests tractable lead compounds for Kme readers are critical for elucidating the mechanisms of chromatin dysregulation in disease states and validating the druggability of these domains and complexes. The successful discovery of a peptide-derived chemical probe, UNC3866, for the Polycomb repressive complex 1 (PRC1) chromodomain Kme readers has proven the potential for selective peptidomimetic inhibition of reader function. Unfortunately, the systematic modification of peptides-to-peptidomimetics is a costly and inefficient strategy for target-class hit discovery against Kme readers. Through the exploration of biased chemical space via combinatorial on-bead libraries, we have developed two concurrent methodologies for Kme reader chemical probe discovery. We employ biased peptide combinatorial libraries as a hit discovery strategy with subsequent optimization via iterative targeted libraries. Peptide-to-peptidomimetic optimization through targeted library design was applied based on structure-guided library design around the interaction of the endogenous peptide ligand with three target Kme readers. Efforts targeting the WD40 reader EED led to the discovery of the 3-mer peptidomimetic ligand UNC5115 while combinatorial repurposing of UNC3866 for off-target chromodomains resulted in the discovery of UNC4991, a CDYL/2-selective ligand, and UNC4848, a MPP8 and CDYL/2 ligand. Ultimately, our efforts demonstrate the generalizability of a peptidomimetic combinatorial platform for the optimization of Kme reader ligands in a target class manner.
Méndez, Carmen; Salas, José A
2003-09-01
Chemotherapeutic drugs for cancer treatment have been traditionally originated by the isolation of natural products from different environmental niches, by chemical synthesis or by a combination of both approaches thus generating semisynthetic drugs. In the last years, a number of gene clusters from several antitumor biosynthetic pathways, mainly produced by actinomycetes and belonging to the polyketides family, are being characterized. Genetic manipulation of these antitumor biosynthetic pathways will offer in the near future an alternative for the generation of novel antitumor derivatives and thus complementing current methods for obtaining novel anticancer drugs. Novel antitumor derivatives have been produced by targetted gene disruption and heterologous expression of single (or a few) gene(s) in another hosts or by combining genes from different, but structurally related, biosynthetic pathways ("combinatorial biosynthesis"). These strategies take advantage from the "relaxed substrate specificity" that characterize secondary metabolism enzymes.
Multifunctional RNA Nanoparticles
2015-01-01
Our recent advancements in RNA nanotechnology introduced novel nanoscaffolds (nanorings); however, the potential of their use for biomedical applications was never fully revealed. As presented here, besides functionalization with multiple different short interfering RNAs for combinatorial RNA interference (e.g., against multiple HIV-1 genes), nanorings also allow simultaneous embedment of assorted RNA aptamers, fluorescent dyes, proteins, as well as recently developed RNA–DNA hybrids aimed to conditionally activate multiple split functionalities inside cells. PMID:25267559
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences undergone rearrangements as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, and thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
EcoFlex: A Multifunctional MoClo Kit for E. coli Synthetic Biology.
Moore, Simon J; Lai, Hung-En; Kelwick, Richard J R; Chee, Soo Mei; Bell, David J; Polizzi, Karen Marie; Freemont, Paul S
2016-10-21
Golden Gate cloning is a prominent DNA assembly tool in synthetic biology for the assembly of plasmid constructs often used in combinatorial pathway optimization, with a number of assembly kits developed specifically for yeast and plant-based expression. However, its use for synthetic biology in commonly used bacterial systems such as Escherichia coli has surprisingly been overlooked. Here, we introduce EcoFlex a simplified modular package of DNA parts for a variety of applications in E. coli, cell-free protein synthesis, protein purification and hierarchical assembly of transcription units based on the MoClo assembly standard. The kit features a library of constitutive promoters, T7 expression, RBS strength variants, synthetic terminators, protein purification tags and fluorescence proteins. We validate EcoFlex by assembling a 68-part containing (20 genes) plasmid (31 kb), characterize in vivo and in vitro library parts, and perform combinatorial pathway assembly, using pooled libraries of either fluorescent proteins or the biosynthetic genes for the antimicrobial pigment violacein as a proof-of-concept. To minimize pathway screening, we also introduce a secondary module design site to simplify MoClo pathway optimization. In summary, EcoFlex provides a standardized and multifunctional kit for a variety of applications in E. coli synthetic biology.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yin, Changchuan
2015-04-01
To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulation annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
ERIC Educational Resources Information Center
Ellington, Roni; Wachira, James; Nkwanta, Asamoah
2010-01-01
The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses…
Wu, Zining; Graybill, Todd L; Zeng, Xin; Platchek, Michael; Zhang, Jean; Bodmer, Vera Q; Wisnoski, David D; Deng, Jianghe; Coppo, Frank T; Yao, Gang; Tamburino, Alex; Scavello, Genaro; Franklin, G Joseph; Mataruse, Sibongile; Bedard, Katie L; Ding, Yun; Chai, Jing; Summerfield, Jennifer; Centrella, Paolo A; Messer, Jeffrey A; Pope, Andrew J; Israel, David I
2015-12-14
DNA-encoded small-molecule library technology has recently emerged as a new paradigm for identifying ligands against drug targets. To date, this technology has been used with soluble protein targets that are produced and used in a purified state. Here, we describe a cell-based method for identifying small-molecule ligands from DNA-encoded libraries against integral membrane protein targets. We use this method to identify novel, potent, and specific inhibitors of NK3, a member of the tachykinin family of G-protein coupled receptors (GPCRs). The method is simple and broadly applicable to other GPCRs and integral membrane proteins. We have extended the application of DNA-encoded library technology to membrane-associated targets and demonstrate the feasibility of selecting DNA-tagged, small-molecule ligands from complex combinatorial libraries against targets in a heterogeneous milieu, such as the surface of a cell.
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus
Shoyab, M.; Baluda, M. A.; Evans, R.
1974-01-01
DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC
2006-01-01
Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. When cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at the six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
McCutchen-Maloney, Sandra L.
2002-01-01
DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dips sticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such at TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.
Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N
1984-03-26
The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
Process of labeling specific chromosomes using recombinant repetitive DNA
Moyzis, R.K.; Meyne, J.
1988-02-12
Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion
Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko
2011-01-01
Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea
Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.
2015-01-01
We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23–47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea. PMID:26058368
Influence of DNA sequence on the structure of minicircles under torsional stress
Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn
2017-01-01
Abstract The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Gaytán, Paul; Yáñez, Jorge; Sánchez, Filiberto; Soberón, Xavier
2001-01-01
We describe here a method to generate combinatorial libraries of oligonucleotides mutated at the codon-level, with control of the mutagenesis rate so as to create predictable binomial distributions of mutants. The method allows enrichment of the libraries with single, double or larger multiplicity of amino acid replacements by appropriate choice of the mutagenesis rate, depending on the concentration of synthetic precursors. The method makes use of two sets of deoxynucleoside-phosphoramidites bearing orthogonal protecting groups [4,4′-dimethoxytrityl (DMT) and 9-fluorenylmethoxycarbonyl (Fmoc)] in the 5′ hydroxyl. These phosphoramidites are divergently combined during automated synthesis in such a way that wild-type codons are assembled with commercial DMT-deoxynucleoside-methyl-phosphoramidites while mutant codons are assembled with Fmoc-deoxynucleoside-methyl-phosphoramidites in an NNG/C fashion in a single synthesis column. This method is easily automated and suitable for low mutagenesis rates and large windows, such as those required for directed evolution and alanine scanning. Through the assembly of three oligonucleotide libraries at different mutagenesis rates, followed by cloning at the polylinker region of plasmid pUC18 and sequencing of 129 clones, we concluded that the method performs essentially as intended. PMID:11160911
Beyond Aztec Castles: Toric Cascades in the dP 3 Quiver
NASA Astrophysics Data System (ADS)
Lai, Tri; Musiker, Gregg
2017-12-01
Given one of an infinite class of supersymmetric quiver gauge theories, string theorists can associate a corresponding toric variety (which is a Calabi-Yau 3-fold) as well as an associated combinatorial model known as a brane tiling. In combinatorial language, a brane tiling is a bipartite graph on a torus and its perfect matchings are of interest to both combinatorialists and physicists alike. A cluster algebra may also be associated to such quivers and in this paper we study the generators of this algebra, known as cluster variables, for the quiver associated to the cone over the del Pezzo surface d P 3. In particular, mutation sequences involving mutations exclusively at vertices with two in-coming arrows and two out-going arrows are referred to as toric cascades in the string theory literature. Such toric cascades give rise to interesting discrete integrable systems on the level of cluster variable dynamics. We provide an explicit algebraic formula for all cluster variables that are reachable by toric cascades as well as a combinatorial interpretation involving perfect matchings of subgraphs of the d P 3 brane tiling for these formulas in most cases.
Drosophila CTCF tandemly aligns with other insulator proteins at the borders of H3K27me3 domains.
Van Bortle, Kevin; Ramos, Edward; Takenaka, Naomi; Yang, Jingping; Wahi, Jessica E; Corces, Victor G
2012-11-01
Several multiprotein DNA complexes capable of insulator activity have been identified in Drosophila melanogaster, yet only CTCF, a highly conserved zinc finger protein, and the transcription factor TFIIIC have been shown to function in mammals. CTCF is involved in diverse nuclear activities, and recent studies suggest that the proteins with which it associates and the DNA sequences that it targets may underlie these various roles. Here we show that the Drosophila homolog of CTCF (dCTCF) aligns in the genome with other Drosophila insulator proteins such as Suppressor of Hairy wing [SU(HW)] and Boundary Element Associated Factor of 32 kDa (BEAF-32) at the borders of H3K27me3 domains, which are also enriched for associated insulator proteins and additional cofactors. RNAi depletion of dCTCF and combinatorial knockdown of gene expression for other Drosophila insulator proteins leads to a reduction in H3K27me3 levels within repressed domains, suggesting that insulators are important for the maintenance of appropriate repressive chromatin structure in Polycomb (Pc) domains. These results shed new insights into the roles of insulators in chromatin domain organization and support recent models suggesting that insulators underlie interactions important for Pc-mediated repression. We reveal an important relationship between dCTCF and other Drosophila insulator proteins and speculate that vertebrate CTCF may also align with other nuclear proteins to accomplish similar functions.
Drosophila CTCF tandemly aligns with other insulator proteins at the borders of H3K27me3 domains
Van Bortle, Kevin; Ramos, Edward; Takenaka, Naomi; Yang, Jingping; Wahi, Jessica E.; Corces, Victor G.
2012-01-01
Several multiprotein DNA complexes capable of insulator activity have been identified in Drosophila melanogaster, yet only CTCF, a highly conserved zinc finger protein, and the transcription factor TFIIIC have been shown to function in mammals. CTCF is involved in diverse nuclear activities, and recent studies suggest that the proteins with which it associates and the DNA sequences that it targets may underlie these various roles. Here we show that the Drosophila homolog of CTCF (dCTCF) aligns in the genome with other Drosophila insulator proteins such as Suppressor of Hairy wing [SU(HW)] and Boundary Element Associated Factor of 32 kDa (BEAF-32) at the borders of H3K27me3 domains, which are also enriched for associated insulator proteins and additional cofactors. RNAi depletion of dCTCF and combinatorial knockdown of gene expression for other Drosophila insulator proteins leads to a reduction in H3K27me3 levels within repressed domains, suggesting that insulators are important for the maintenance of appropriate repressive chromatin structure in Polycomb (Pc) domains. These results shed new insights into the roles of insulators in chromatin domain organization and support recent models suggesting that insulators underlie interactions important for Pc-mediated repression. We reveal an important relationship between dCTCF and other Drosophila insulator proteins and speculate that vertebrate CTCF may also align with other nuclear proteins to accomplish similar functions. PMID:22722341
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possible be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic disease can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Colombo, M M; Swanton, M T; Donini, P; Prescott, D M
1984-01-01
Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. Images PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation
Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob
2014-01-01
As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Label-free detection of gliadin food allergen mediated by real-time apta-PCR.
Pinto, Alessandro; Polo, Pedro Nadal; Henry, Olivier; Redondo, M Carmen Bermudo; Svobodova, Marketa; O'Sullivan, Ciara K
2014-01-01
Celiac disease is an immune-mediated enteropathy triggered by the ingestion of gluten. The only effective treatment consists in a lifelong gluten-free diet, requiring the food industry to tightly control the gluten contents of their products. To date, several gluten quantification approaches using antibodies are available and recommended by the legal authorities, such as Codex Alimentarius. However, whilst these antibody-based tests exhibit high sensitivity and specificity, the production of antibodies inherently requires the killing of host animals and is time-consuming and relatively expensive. Aptamers are structured single-stranded nucleic acid ligands that bind with high affinity and specificity to their cognate target, and aiming for a cost-effective viable alternative to the use of antibodies. Herein, we report the systematic evolution of ligands by exponential enrichment (SELEX)-based selection of a DNA aptamer against gliadin from a combinatorial DNA library and its application in a novel detection assay. Taking into account the hydrophobic nature of the gliadin target, a microtitre plate format was exploited for SELEX, where the target was immobilised via hydrophobic interactions, thus exposing aptatopes accessible for interaction with the DNA library. Evolution was followed using surface plasmon resonance, and following eight rounds of SELEX, the enriched DNA pool was cloned, sequenced and a clear consensus motif was identified. An apta-PCR assay was developed where competition for the aptamer takes place between the surface-immobilised gliadin and gliadin in the target sample, akin to an ELISA competitive format where the more target present in the sample, the less aptamer will bind to the immobilised gliadin. Following competition, any aptamer bound to the immobilised gliadin was heat-eluted and quantitatively amplified using real-time PCR, achieving a detection limit of approx. 2 nM (100 ng mL(-1)). The specificity of the selected aptamer was demonstrated and no cross-reactivity was observed with streptavidin, bovine serum albumin or anti-gliadin IgG.
El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R
2013-07-01
Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs178229931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas
2009-06-01
The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
2013-01-01
Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
On avoided words, absent words, and their application to biological sequence analysis.
Almirantis, Yannis; Charalampopoulos, Panagiotis; Gao, Jia; Iliopoulos, Costas S; Mohamed, Manal; Pissis, Solon P; Polychronopoulos, Dimitris
2017-01-01
The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided . This concept is particularly useful in DNA linguistic analysis. The value of the deviation of w , denoted by [Formula: see text], effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length [Formula: see text] is a [Formula: see text]-avoided word in x if [Formula: see text], for a given threshold [Formula: see text]. Notice that such a word may be completely absent from x . Hence, computing all such words naïvely can be a very time-consuming procedure, in particular for large k . In this article, we propose an [Formula: see text]-time and [Formula: see text]-space algorithm to compute all [Formula: see text]-avoided words of length k in a given sequence of length n over a fixed-sized alphabet. We also present a time-optimal [Formula: see text]-time algorithm to compute all [Formula: see text]-avoided words (of any length) in a sequence of length n over an integer alphabet of size [Formula: see text]. In addition, we provide a tight asymptotic upper bound for the number of [Formula: see text]-avoided words over an integer alphabet and the expected length of the longest one. We make available an implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency and applicability of our implementation in biological sequence analysis. The systematic search for avoided words is particularly useful for biological sequence analysis. We present a linear-time and linear-space algorithm for the computation of avoided words of length k in a given sequence x . We suggest a modification to this algorithm so that it computes all avoided words of x , irrespective of their length, within the same time complexity. We also present combinatorial results with regards to avoided words and absent words.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
Direct Detection and Sequencing of Damaged DNA Bases
2011-01-01
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.
Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas
2011-12-20
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
Combinatorial algorithms for design of DNA arrays.
Hannenhalli, Sridhar; Hubell, Earl; Lipshutz, Robert; Pevzner, Pavel A
2002-01-01
Optimal design of DNA arrays requires the development of algorithms with two-fold goals: reducing the effects caused by unintended illumination (border length minimization problem) and reducing the complexity of masks (mask decomposition problem). We describe algorithms that reduce the number of rectangles in mask decomposition by 20-30% as compared to a standard array design under the assumption that the arrangement of oligonucleotides on the array is fixed. This algorithm produces provably optimal solution for all studied real instances of array design. We also address the difficult problem of finding an arrangement which minimizes the border length and come up with a new idea of threading that significantly reduces the border length as compared to standard designs.
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1987-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1990-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1988-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1989-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.
Barnes, W M; Bevan, M
1983-01-01
A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. Images PMID:6298723
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saldanha, Sabita N., E-mail: sabivan@uab.edu; Department of Biological Sciences, Alabama State University, Montgomery, AL 36104; Kala, Rishabh
Bioactive compounds are considered safe and have been shown to alter genetic and epigenetic profiles of tumor cells. However, many of these changes have been reported at molecular concentrations higher than physiologically achievable levels. We investigated the role of the combinatorial effects of epigallocatechin gallate (EGCG), a predominant polyphenol in green tea, and sodium butyrate (NaB), a dietary microbial fermentation product of fiber, in the regulation of survivin, which is an overexpressed anti-apoptotic protein in colon cancer cells. For the first time, our study showed that the combination treatment induced apoptosis and cell cycle arrest in RKO, HCT-116 and HT-29more » colorectal cancer cells. This was found to be regulated by the decrease in HDAC1, DNMT1, survivin and HDAC activity in all three cell lines. A G2/M arrest was observed for RKO and HCT-116 cells, and G1 arrest for HT-29 colorectal cancer cells for combinatorial treatment. Further experimentation of the molecular mechanisms in RKO colorectal cancer (CRC) cells revealed a p53-dependent induction of p21 and an increase in nuclear factor kappa B (NF-κB)-p65. An increase in double strand breaks as determined by gamma-H2A histone family member X (γ-H2AX) protein levels and induction of histone H3 hyperacetylation was also observed with the combination treatment. Further, we observed a decrease in global CpG methylation. Taken together, these findings suggest that at low and physiologically achievable concentrations, combinatorial EGCG and NaB are effective in promoting apoptosis, inducing cell cycle arrest and DNA-damage in CRC cells. - Highlights: • EGCG and NaB as a combination inhibits colorectal cancer cell proliferation. • The combination treatment induces DNA damage, G2/M and G1 arrest and apoptosis. • Survivin is effectively down-regulated by the combination treatment. • p21 and p53 expressions are induced by the combination treatment. • Epigenetic proteins DNMT1 and HDAC1 are effectively down-regulated by the treatment.« less
Silicene nanoribbon as a new DNA sequencing device
NASA Astrophysics Data System (ADS)
Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh
2018-02-01
The importance of applying DNA sequencing in different fields, results in looking for fast and cheap methods. Nanotechnology helps this development by introducing nanostructures used for DNA sequencing. In this work we study the interaction between zigzag silicene nanoribbon and DNA nucleobases using DFT and non equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as a biosensor for DNA sequencing.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated, were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. Images PMID:7909943
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes
Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske
2006-01-01
To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication of a high-throughput DNA sequencing technology based on PCR reaction was carried out in oil emulsions in 2005, high-throughput DNA sequencing platforms have been evolved to a robust technology in sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation of discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach of discovery and development of antibody drugs.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.
Sucher, Nikolaus J; Hennell, James R; Carles, Maria C
2012-01-01
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.
Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G
1984-11-15
Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
Combinatorial pulse position modulation for power-efficient free-space laser communications
NASA Technical Reports Server (NTRS)
Budinger, James M.; Vanderaar, M.; Wagner, P.; Bibyk, Steven
1993-01-01
A new modulation technique called combinatorial pulse position modulation (CPPM) is presented as a power-efficient alternative to quaternary pulse position modulation (QPPM) for direct-detection, free-space laser communications. The special case of 16C4PPM is compared to QPPM in terms of data throughput and bit error rate (BER) performance for similar laser power and pulse duty cycle requirements. The increased throughput from CPPM enables the use of forward error corrective (FEC) encoding for a net decrease in the amount of laser power required for a given data throughput compared to uncoded QPPM. A specific, practical case of coded CPPM is shown to reduce the amount of power required to transmit and receive a given data sequence by at least 4.7 dB. Hardware techniques for maximum likelihood detection and symbol timing recovery are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...
2016-03-09
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
Specific minor groove solvation is a crucial determinant of DNA binding site recognition
Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.
2014-01-01
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate
Yang, Yu; Hebron, Haroun R.; Hang, Jun
2009-01-01
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA
2005-09-01
tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxy- nucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide
2011-09-01
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Biological sequence compression algorithms.
Matsumoto, T; Sadakane, K; Imai, H
2000-01-01
Today, more and more DNA sequences are becoming available. The information about DNA sequences are stored in molecular biology databases. The size and importance of these databases will be bigger and bigger in the future, therefore this information must be stored or communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. The standard compression algorithms such as gzip or compress cannot compress DNA sequences, but only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known. One is called palindromes or reverse complements and the other structure is approximate repeats. Several specific algorithms for DNA sequences that use these structures can compress them less than two bits per symbol. In this paper, we improve the CTW so that characteristic structures of DNA sequences are available. Before encoding the next symbol, the algorithm searches an approximate repeat and palindrome using hash and dynamic programming. If there is a palindrome or an approximate repeat with enough length then our algorithm represents it with length and distance. By using this preprocessing, a new program achieves a little higher compression ratio than that of existing DNA-oriented compression algorithms. We also describe new compression algorithm for protein sequences.
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.
Li, Qing; Hermanson, Peter J; Springer, Nathan M
2018-01-01
DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we described a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
Single-Molecule Electrical Random Resequencing of DNA and RNA
NASA Astrophysics Data System (ADS)
Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji
2012-07-01
Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.
Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani
2018-01-01
Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.
Schnitzler, P; Darai, G
1989-09-01
The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].
Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y
2017-08-01
To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
DNA Polyplexes as Combinatory Drug Carriers of Doxorubicin and Cisplatin: An In Vitro Study
Kang, Han Chang; Cho, Hana; Bae, You Han
2015-01-01
Double helix nucleic acids were used as a combination drug carrier for doxorubicin (DOX), which physically intercalates with DNA double helices, and cisplatin (CDDP), which binds to DNA without an alkylation reaction. DNA interacting with DOX, CDDP, or both was complexed with positively charged, endosomolytic polymers. Compared with the free drug, the polyplexes (100 ~ 170 nm in size) delivered more drug into the cytosol and the nucleus and demonstrated similar or superior (up to a 7-fold increase) in vitro cell-killing activity. Additionally, the gene expression activities of most of the chemical drug-loaded plasmid DNA (pDNA) polyplexes were not impaired by the physical interactions between the nucleic acid and DOX/CDDP. When a model reporter pDNA (luciferase) was employed, it expressed luciferase protein at 0.7- ~ 1.4-fold the amount expressed by the polyplex with no bound drugs (a control), which indicated the fast translocation of the intercalated or bound drugs from the “carrier DNA” to the “nuclear DNA” of target cells. The proposed concept may offer the possibility of versatile combination therapies of genetic materials and small molecule drugs that bind to nucleic acids to treat various diseases. PMID:26132975
El Zoeiby, Ahmed; Sanschagrin, François; Darveau, André; Brisson, Jean-Robert; Levesque, Roger C
2003-03-01
The machinery of peptidoglycan biosynthesis is an ideal site at which to look for novel antimicrobial targets. Phage display was used to develop novel peptide inhibitors for MurC, an essential enzyme involved in the early steps of biosynthesis of peptidoglycan monomer. We cloned and overexpressed the murA, -B and -C genes from Pseudomonas aeruginosa in the pET expression vector, adding a His-tag to their C termini. The three proteins were overproduced in Escherichia coli and purified to homogeneity in milligram quantities. MurA and -B were combinatorially used to synthesize the MurC substrate UDP-N-acetylmuramate, the identity of which was confirmed by mass spectrometry and nuclear magnetic resonance analysis. Two phage-display libraries were screened against MurC in order to identify peptide ligands to the enzyme. Three rounds of biopanning were carried out, successively increasing elution specificity from round 1 to 3. The third round was accomplished with both non-specific elution and competitive elution with each of the three MurC substrates, UDP-N-acetylmuramic acid (UNAM), ATP and L-alanine. The DNA of 10 phage, selected randomly from each group, was extracted and sequenced, and consensus peptide sequences were elucidated. Peptides were synthesized and tested for inhibition of the MurC-catalysed reaction, and two peptides were shown to be inhibitors of MurC activity with IC(50)s of 1.5 and 0.9 mM, respectively. The powerful selection technique of phage display allowed us to identify two peptide inhibitors of the essential bacterial enzyme MurC. The peptide sequences represent the basis for the synthesis of inhibitory peptidomimetic molecules.
Li, Wan-Ming; Hu, Ting-Ting; Zhou, Lin-Lin; Feng, Yi-Ming; Wang, Yun-Yi; Fang, Jin
2016-07-12
The PIK3CA (H1047R) mutation is considered to be a potential predictive biomarker for EGFR-targeted therapies. In this study, we developed a novel PCR-PFLP approach to detect the PIK3CA (H1047R) mutation in high effectiveness. A 126-bp fragment of PIK3CA exon-20 was amplified by PCR, digested with FspI restriction endonuclease and separated by 3 % agarose gel electrophoresis for the PCR-RFLP analysis. The mutant sequence of the PIK3CA (H1047R) was spiked into the corresponding wild-type sequence in decreasing ratios for sensitivity analysis. Eight-six cases of formalin-fixed paraffin-embedded colorectal cancer (CRC) specimens were subjected to PCR-RFLP to evaluate the applicability of the method. The PCR-RFLP method had a capability to detect as litter as 0.4 % of mutation, and revealed 16.3 % of the PIK3CA (H1047R) mutation in 86 CRC tissues, which was significantly higher than that discovered by DNA sequencing (9.3 %). A positive association between the PIK3CA (H1047R) mutation and the patients' age was first found, except for the negative relationship with the degree of tumor differentiation. In addition, the highly sensitive detection of a combinatorial mutation of PIK3CA, KRAS and BRAF was achieved using individual PCR-RFLP methods. We developed a sensitive, simple and rapid approach to detect the low-abundance PIK3CA (H1047R) mutation in real CRC specimens, providing an effective tool for guiding cancer targeted therapy.
Renieri, Carlo; La Terza, Antonietta
2015-01-01
The objectives of the present study were to characterize the MC1R gene, its transcripts and the single nucleotide polymorphisms (SNPs) associated with coat color in alpaca. Full length cDNA amplification revealed the presence of two transcripts, named as F1 and F2, differing only in the length of their 5′-terminal untranslated region (UTR) sequences and presenting a color specific expression. Whereas the F1 transcript was common to white and colored (black and brown) alpaca phenotypes, the shorter F2 transcript was specific to white alpaca. Further sequencing of the MC1R gene in white and colored alpaca identified a total of twelve SNPs; among those nine (four silent mutations (c.126C>A, c.354T>C, c.618G>A, and c.933G>A); five missense mutations (c.82A>G, c.92C>T, c.259A>G, c.376A>G, and c.901C>T)) were observed in coding region and three in the 3′UTR. A 4 bp deletion (c.224 227del) was also identified in the coding region. Molecular segregation analysis uncovered that the combinatory mutations in the MC1R locus could cause eumelanin and pheomelanin synthesis in alpaca. Overall, our data refine what is known about the MC1R gene and provides additional information on its role in alpaca pigmentation. PMID:25685836
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA. PMID:20487515
Murray, V
1999-01-01
This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing
Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi
2016-01-01
Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.
Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.
Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew
2017-11-06
Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsodikov, Oleg V.; Biswas, Tapan
An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 {angstrom} resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 {angstrom}). Thesemore » structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.« less
Programmable disorder in random DNA tilings
NASA Astrophysics Data System (ADS)
Tikhomirov, Grigory; Petersen, Philip; Qian, Lulu
2017-03-01
Scaling up the complexity and diversity of synthetic molecular structures will require strategies that exploit the inherent stochasticity of molecular systems in a controlled fashion. Here we demonstrate a framework for programming random DNA tilings and show how to control the properties of global patterns through simple, local rules. We constructed three general forms of planar network—random loops, mazes and trees—on the surface of self-assembled DNA origami arrays on the micrometre scale with nanometre resolution. Using simple molecular building blocks and robust experimental conditions, we demonstrate control of a wide range of properties of the random networks, including the branching rules, the growth directions, the proximity between adjacent networks and the size distribution. Much as combinatorial approaches for generating random one-dimensional chains of polymers have been used to revolutionize chemical synthesis and the selection of functional nucleic acids, our strategy extends these principles to random two-dimensional networks of molecules and creates new opportunities for fabricating more complex molecular devices that are organized by DNA nanostructures.
Taricani, Lorena; Shanahan, Frances; Malinao, Maria-Christina; Beaumont, Maribel; Parry, David
2014-01-01
Ribonucleotide reductase (RNR) enzyme is composed of the homodimeric RRM1 and RRM2 subunits, which together form a heterotetramic active enzyme that catalyzes the de novo reduction of ribonucleotides to generate deoxyribonucleotides (dNTPs), which are required for DNA replication and DNA repair processes. In this study, we show that ablation of RRM1 and RRM2 by siRNA induces G1/S phase arrest, phosphorylation of Chk1 on Ser345 and phosphorylation of γ-H2AX on S139. Combinatorial ablation of RRM1 or RRM2 and Chk1 causes a dramatic accumulation of γ-H2AX, a marker of double-strand DNA breaks, suggesting that activation of Chk1 in this context is essential for suppression of DNA damage. Significantly, we demonstrate for the first time that Chk1 and RNR subunits co-immunoprecipitate from native cell extracts. These functional genomic studies suggest that RNR is a critical mediator of replication checkpoint activation. PMID:25375241
DNA barcode goes two-dimensions: DNA QR code web server.
Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
Tabor, Stanley; Richardson, Charles C.
1995-04-25
A method for sequencing a strand of DNA, including the steps off: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions in favoring primer extension to form nucleic acid fragments complementory to the DNA to be sequenced; labelling the nucleic and fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations have been absolute quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise
2018-04-20
Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1"S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sproul, John S; Maddison, David R
2017-11-01
Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Biosensors for DNA sequence detection
NASA Technical Reports Server (NTRS)
Vercoutere, Wenonah; Akeson, Mark
2002-01-01
DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel
2014-01-01
Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exists among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl ( Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl
Highlights: {yields} We propose a simple stochastic model to construct primitive DNA sequences. {yields} The model provide an explanation for Chargaff's second parity rule in primitive DNA sequences. {yields} The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. {yields} We extend the results for bacterial DNA sequences and compare distributional properties intrinsic to the model to statistical estimates from 1049 bacterial genomes. {yields} We find out statistical evidences that the novel type of strand symmetry holds for bacterial DNA sequences. -- Abstract: Chargaff's second parity rule for short oligonucleotides states that themore » frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.« less
A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes
ERIC Educational Resources Information Center
Christensen, Doug
2009-01-01
An inexpensive and equipment free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor with a familiarity of DNA sequencing technology but provides a straight forward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…
Lee, James W.; Thundat, Thomas G.
2005-06-14
An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Global gene expression analysis by combinatorial optimization.
Ameur, Adam; Aurell, Erik; Carlsson, Mats; Westholm, Jakub Orzechowski
2004-01-01
Generally, there is a trade-off between methods of gene expression analysis that are precise but labor-intensive, e.g. RT-PCR, and methods that scale up to global coverage but are not quite as quantitative, e.g. microarrays. In the present paper, we show how how a known method of gene expression profiling (K. Kato, Nucleic Acids Res. 23, 3685-3690 (1995)), which relies on a fairly small number of steps, can be turned into a global gene expression measurement by advanced data post-processing, with potentially little loss of accuracy. Post-processing here entails solving an ancillary combinatorial optimization problem. Validation is performed on in silico experiments generated from the FANTOM data base of full-length mouse cDNA. We present two variants of the method. One uses state-of-the-art commercial software for solving problems of this kind, the other a code developed by us specifically for this purpose, released in the public domain under GPL license.
Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA
Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev
2012-01-01
B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350
NASA Astrophysics Data System (ADS)
Yang, Hong
Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny. conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
Enantiospecific recognition of DNA sequences by a proflavine Tröger base.
Bailly, C; Laine, W; Demeunynck, M; Lhomme, J
2000-07-05
The DNA interaction of a chiral Tröger base derived from proflavine was investigated by DNA melting temperature measurements and complementary biochemical assays. DNase I footprinting experiments demonstrate that the binding of the proflavine-based Tröger base is both enantio- and sequence-specific. The (+)-isomer poorly interacts with DNA in a non-sequence-selective fashion. In sharp contrast, the corresponding (-)-isomer recognizes preferentially certain DNA sequences containing both A. T and G. C base pairs, such as the motifs 5'-GTT. AAC and 5'-ATGA. TCAT. This is the first experimental demonstration that acridine-type Tröger bases can be used for enantiospecific recognition of DNA sequences. Copyright 2000 Academic Press.
NASA Astrophysics Data System (ADS)
Peng, Jun; Ling, Jian; Zhang, Xiu-Qing; Bai, Hui-Ping; Zheng, Liyan; Cao, Qiu-E.; Ding, Zhong-Tao
2015-02-01
In this work, we designed a new fluorescent oligonucleotides-stabilized silver nanoclusters (DNA/AgNCs) probe for sensitive detection of mercury and copper ions. This probe contains two tailored DNA sequence. One is a signal probe contains a cytosine-rich sequence template for AgNCs synthesis and link sequence at both ends. The other is a guanine-rich sequence for signal enhancement and link sequence complementary to the link sequence of the signal probe. After hybridization, the fluorescence of hybridized double-strand DNA/AgNCs is 200-fold enhanced based on the fluorescence enhancement effect of DNA/AgNCs in proximity of guanine-rich DNA sequence. The double-strand DNA/AgNCs probe is brighter and stable than that of single-strand DNA/AgNCs, and more importantly, can be used as novel fluorescent probes for detecting mercury and copper ions. Mercury and copper ions in the range of 6.0-160.0 and 6-240 nM, can be linearly detected with the detection limits of 2.1 and 3.4 nM, respectively. Our results indicated that the analytical parameters of the method for mercury and copper ions detection are much better than which using a single-strand DNA/AgNCs.
Antipova, Valeriya N; Zheleznaya, Lyudmila A; Zyrina, Nadezhda V
2014-08-01
In the absence of added DNA, thermophilic DNA polymerases synthesize double-stranded DNA from free dNTPs, which consist of numerous repetitive units (ab initio DNA synthesis). The addition of thermophilic restriction endonuclease (REase), or nicking endonuclease (NEase), effectively stimulates ab initio DNA synthesis and determines the nucleotide sequence of reaction products. We have found that NEases Nt.AlwI, Nb.BbvCI, and Nb.BsmI with non-palindromic recognition sites stimulate the synthesis of sequences organized mainly as palindromes. Moreover, the nucleotide sequence of the palindromes appeared to be dependent on NEase recognition/cleavage modes. Thus, the heterodimeric Nb.BbvCI stimulated the synthesis of palindromes composed of two recognition sites of this NEase, which were separated by AT-reach sequences or (A)n (T)m spacers. Palindromic DNA sequences obtained in the ab initio DNA synthesis with the monomeric NEases Nb.BsmI and Nt.AlwI contained, along with the sites of these NEases, randomly synthesized sequences consisted of blocks of short repeats. These findings could help investigation of the potential abilities of highly productive ab initio DNA synthesis for the creation of DNA molecules with desirable sequence. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V
2006-10-15
The 16,937-nuceotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scypozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques witnessed fast development in the last decades, primarily driven by the Human Genome Project. Among the proposed new techniques, Nanopore was considered as a suitable candidate for the single DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been done to apply nanopore to analyze the properties of DNA molecules. By comparing with traditional sequencing techniques, nanopore has demonstrated its distinctive superiorities in main practical issues, such as sample preparation, sequencing speed, cost-effective and read-length. Although challenges still remain, recent researches in improving the capabilities of nanopore have shed a light to achieve its ultimate goal: Sequence individual DNA strand at single nucleotide level. This patent review briefly highlights recent developments and technological achievements for DNA analysis and sequencing at single molecule level, focusing on nanopore based methods.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.
Benslimane, A A; Dron, M; Hartmann, C; Rode, A
1986-01-01
Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. Images PMID:3774553
PR-PR: Cross-Platform Laboratory Automation System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Linshiz, G; Stawski, N; Goyal, G
To enable protocol standardization, sharing, and efficient implementation across laboratory automation platforms, we have further developed the PR-PR open-source high-level biology-friendly robot programming language as a cross-platform laboratory automation system. Beyond liquid-handling robotics, PR-PR now supports microfluidic and microscopy platforms, as well as protocol translation into human languages, such as English. While the same set of basic PR-PR commands and features are available for each supported platform, the underlying optimization and translation modules vary from platform to platform. Here, we describe these further developments to PR-PR, and demonstrate the experimental implementation and validation of PR-PR protocols for combinatorial modified Goldenmore » Gate DNA assembly across liquid-handling robotic, microfluidic, and manual platforms. To further test PR-PR cross-platform performance, we then implement and assess PR-PR protocols for Kunkel DNA mutagenesis and hierarchical Gibson DNA assembly for microfluidic and manual platforms.« less
PR-PR: cross-platform laboratory automation system.
Linshiz, Gregory; Stawski, Nina; Goyal, Garima; Bi, Changhao; Poust, Sean; Sharma, Monica; Mutalik, Vivek; Keasling, Jay D; Hillson, Nathan J
2014-08-15
To enable protocol standardization, sharing, and efficient implementation across laboratory automation platforms, we have further developed the PR-PR open-source high-level biology-friendly robot programming language as a cross-platform laboratory automation system. Beyond liquid-handling robotics, PR-PR now supports microfluidic and microscopy platforms, as well as protocol translation into human languages, such as English. While the same set of basic PR-PR commands and features are available for each supported platform, the underlying optimization and translation modules vary from platform to platform. Here, we describe these further developments to PR-PR, and demonstrate the experimental implementation and validation of PR-PR protocols for combinatorial modified Golden Gate DNA assembly across liquid-handling robotic, microfluidic, and manual platforms. To further test PR-PR cross-platform performance, we then implement and assess PR-PR protocols for Kunkel DNA mutagenesis and hierarchical Gibson DNA assembly for microfluidic and manual platforms.
Next-Generation Sequencing Platforms
NASA Astrophysics Data System (ADS)
Mardis, Elaine R.
2013-06-01
Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.
Regulatory link between DNA methylation and active demethylation in Arabidopsis
Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang
2015-01-01
De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.
Ozsolak, Fatih
2016-01-01
With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. Images PMID:2181401
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Solving Assembly Sequence Planning using Angle Modulated Simulated Kalman Filter
NASA Astrophysics Data System (ADS)
Mustapa, Ainizar; Yusof, Zulkifli Md.; Adam, Asrul; Muhammad, Badaruddin; Ibrahim, Zuwairie
2018-03-01
This paper presents an implementation of Simulated Kalman Filter (SKF) algorithm for optimizing an Assembly Sequence Planning (ASP) problem. The SKF search strategy contains three simple steps; predict-measure-estimate. The main objective of the ASP is to determine the sequence of component installation to shorten assembly time or save assembly costs. Initially, permutation sequence is generated to represent each agent. Each agent is then subjected to a precedence matrix constraint to produce feasible assembly sequence. Next, the Angle Modulated SKF (AMSKF) is proposed for solving ASP problem. The main idea of the angle modulated approach in solving combinatorial optimization problem is to use a function, g(x), to create a continuous signal. The performance of the proposed AMSKF is compared against previous works in solving ASP by applying BGSA, BPSO, and MSPSO. Using a case study of ASP, the results show that AMSKF outperformed all the algorithms in obtaining the best solution.
Kille, Sabrina; Acevedo-Rocha, Carlos G; Parra, Loreto P; Zhang, Zhi-Gang; Opperman, Diederik J; Reetz, Manfred T; Acevedo, Juan Pablo
2013-02-15
Saturation mutagenesis probes define sections of the vast protein sequence space. However, even if randomization is limited this way, the combinatorial numbers problem is severe. Because diversity is created at the codon level, codon redundancy is a crucial factor determining the necessary effort for library screening. Additionally, due to the probabilistic nature of the sampling process, oversampling is required to ensure library completeness as well as a high probability to encounter all unique variants. Our trick employs a special mixture of three primers, creating a degeneracy of 22 unique codons coding for the 20 canonical amino acids. Therefore, codon redundancy and subsequent screening effort is significantly reduced, and a balanced distribution of codon per amino acid is achieved, as demonstrated exemplarily for a library of cyclohexanone monooxygenase. We show that this strategy is suitable for any saturation mutagenesis methodology to generate less-redundant libraries.
Martínez-Ceron, María C; Giudicessi, Silvana L; Marani, Mariela M; Albericio, Fernando; Cascone, Osvaldo; Erra-Balsells, Rosa; Camperi, Silvia A
2010-05-15
Optimization of bead analysis by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) after the screening of one-bead-one-peptide combinatorial libraries was achieved, involving the fine-tuning of the whole process. Guanidine was replaced by acetonitrile (MeCN)/acetic acid (AcOH)/water (H(2)O), improving matrix crystallization. Peptide-bead cleavage with NH(4)OH was cheaper and safer than, yet as efficient as, NH(3)/tetrahydrofuran (THF). Peptide elution in microtubes instead of placing the beads in the sample plate yielded more sample aliquots. Successive dry layers deposit sample preparation was better than the dried droplet method. Among the matrices analyzed, alpha-cyano-4-hydroxycinnamic acid resulted in the best peptide ion yield. Cluster formation was minimized by the addition of additives to the matrix. Copyright 2010 Elsevier Inc. All rights reserved.
Scaffold diversification enhances effectiveness of a superlibrary of hyperthermophilic proteins.
Hussain, Mahmud; Gera, Nimish; Hill, Andrew B; Rao, Balaji M
2013-01-18
The use of binding proteins from non-immunoglobulin scaffolds has become increasingly common in biotechnology and medicine. Typically, binders are isolated from a combinatorial library generated by mutating a single scaffold protein. In contrast, here we generated a "superlibrary" or "library-of-libraries" of 4 × 10(8) protein variants by mutagenesis of seven different hyperthermophilic proteins; six of the seven proteins have not been used as scaffolds prior to this study. Binding proteins for five different model targets were successfully isolated from this library. Binders obtained were derived from five out of the seven scaffolds. Strikingly, binders from this modestly sized superlibrary have affinities comparable or higher than those obtained from a library with 1000-fold higher sequence diversity but derived from a single stable scaffold. Thus scaffold diversification, i.e., randomization of multiple different scaffolds, is a powerful alternate strategy for combinatorial library construction.
Phage display for the discovery of hydroxyapatite-associated peptides.
Jin, Hyo-Eon; Chung, Woo-Jae; Lee, Seung-Wuk
2013-01-01
In nature, proteins play a critical role in the biomineralization process. Understanding how different peptide or protein sequences selectively interact with the target crystal is of great importance. Identifying such protein structures is one of the critical steps in verifying the molecular mechanisms of biomineralization. One of the promising ways to obtain such information for a particular crystal surface is to screen combinatorial peptide libraries in a high-throughput manner. Among the many combinatorial library screening procedures, phage display is a powerful method to isolate such proteins and peptides. In this chapter, we will describe our established methods to perform phage display with inorganic crystal surfaces. Specifically, we will use hydroxyapatite as a model system for discovery of apatite-associated proteins in bone or tooth biomineralization studies. This model approach can be generalized to other desired crystal surfaces using the same experimental design principles with a little modification of the procedures. © 2013 Elsevier Inc. All rights reserved.
Effect of the Implicit Combinatorial Model on Combinatorial Reasoning in Secondary School Pupils.
ERIC Educational Resources Information Center
Batanero, Carmen; And Others
1997-01-01
Elementary combinatorial problems may be classified into three different combinatorial models: (1) selection; (2) partition; and (3) distribution. The main goal of this research was to determine the effect of the implicit combinatorial model on pupils' combinatorial reasoning before and after instruction. Gives an analysis of variance of the…
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.
This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted.more » PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.« less
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...
2017-07-18
This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted.more » PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.« less
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio and Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benasutti, M.; Ejadi, S.; Whitlow, M.D.
The mutagenic and carcinogenic chemical aflatoxin B/sub 1/ (AFB/sub 1/) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB/sub 1/ oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB/sub 1/ oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB/sub 1/ oxide prefers to react with guanines inmore » some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB/sub 1/ oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determine by high-pressure liquid chromatography. These results reveal that for AFB/sub 1/ oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB/sub 1/ oxide in its reaction with DNA.« less
Causal gene identification using combinatorial V-structure search.
Cai, Ruichu; Zhang, Zhenjie; Hao, Zhifeng
2013-07-01
With the advances of biomedical techniques in the last decade, the costs of human genomic sequencing and genomic activity monitoring are coming down rapidly. To support the huge genome-based business in the near future, researchers are eager to find killer applications based on human genome information. Causal gene identification is one of the most promising applications, which may help the potential patients to estimate the risk of certain genetic diseases and locate the target gene for further genetic therapy. Unfortunately, existing pattern recognition techniques, such as Bayesian networks, cannot be directly applied to find the accurate causal relationship between genes and diseases. This is mainly due to the insufficient number of samples and the extremely high dimensionality of the gene space. In this paper, we present the first practical solution to causal gene identification, utilizing a new combinatorial formulation over V-Structures commonly used in conventional Bayesian networks, by exploring the combinations of significant V-Structures. We prove the NP-hardness of the combinatorial search problem under a general settings on the significance measure on the V-Structures, and present a greedy algorithm to find sub-optimal results. Extensive experiments show that our proposal is both scalable and effective, particularly with interesting findings on the causal genes over real human genome data. Copyright © 2013 Elsevier Ltd. All rights reserved.
Chromosome specific repetitive DNA sequences
Moyzis, Robert K.; Meyne, Julianne
1991-01-01
A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family me This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).
Chemical Space of DNA-Encoded Libraries.
Franzini, Raphael M; Randolph, Cassie
2016-07-28
In recent years, DNA-encoded chemical libraries (DECLs) have attracted considerable attention as a potential discovery tool in drug development. Screening encoded libraries may offer advantages over conventional hit discovery approaches and has the potential to complement such methods in pharmaceutical research. As a result of the increased application of encoded libraries in drug discovery, a growing number of hit compounds are emerging in scientific literature. In this review we evaluate reported encoded library-derived structures and identify general trends of these compounds in relation to library design parameters. We in particular emphasize the combinatorial nature of these libraries. Generally, the reported molecules demonstrate the ability of this technology to afford hits suitable for further lead development, and on the basis of them, we derive guidelines for DECL design.
ERIC Educational Resources Information Center
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.
Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie
2015-06-17
High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements display unique size signatures. We show how cell-free fragments occur in clusters along the genome, localizing to nucleosomal arrays and are preferentially cleaved at linker regions by correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site show a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.
1992-05-01
TABLES xv LIST OF ABBREVIATIONS xvii 1.0 INTRODUCTION 1 2.0 DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0...Zehnder architecture. 3 Figure 3: Short representations of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5... DNA bases where each base is represented by 7-bits long pseudorandom sequences. 4 Table 2: Long representations of the DNA bases with 255-bits maximum
Spatiotemporal clustering of the epigenome reveals rules of dynamic gene regulation
Yu, Pengfei; Xiao, Shu; Xin, Xiaoyun; Song, Chun-Xiao; Huang, Wei; McDee, Darina; Tanaka, Tetsuya; Wang, Ting; He, Chuan; Zhong, Sheng
2013-01-01
Spatial organization of different epigenomic marks was used to infer functions of the epigenome. It remains unclear what can be learned from the temporal changes of the epigenome. Here, we developed a probabilistic model to cluster genomic sequences based on the similarity of temporal changes of multiple epigenomic marks during a cellular differentiation process. We differentiated mouse embryonic stem (ES) cells into mesendoderm cells. At three time points during this differentiation process, we used high-throughput sequencing to measure seven histone modifications and variants—H3K4me1/2/3, H3K27ac, H3K27me3, H3K36me3, and H2A.Z; two DNA modifications—5-mC and 5-hmC; and transcribed mRNAs and noncoding RNAs (ncRNAs). Genomic sequences were clustered based on the spatiotemporal epigenomic information. These clusters not only clearly distinguished gene bodies, promoters, and enhancers, but also were predictive of bidirectional promoters, miRNA promoters, and piRNAs. This suggests specific epigenomic patterns exist on piRNA genes much earlier than germ cell development. Temporal changes of H3K4me2, unmethylated CpG, and H2A.Z were predictive of 5-hmC changes, suggesting unmethylated CpG and H3K4me2 as potential upstream signals guiding TETs to specific sequences. Several rules on combinatorial epigenomic changes and their effects on mRNA expression and ncRNA expression were derived, including a simple rule governing the relationship between 5-hmC and gene expression levels. A Sox17 enhancer containing a FOXA2 binding site and a Foxa2 enhancer containing a SOX17 binding site were identified, suggesting a positive feedback loop between the two mesendoderm transcription factors. These data illustrate the power of using epigenome dynamics to investigate regulatory functions. PMID:23033340
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers
USDA-ARS?s Scientific Manuscript database
The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestions of the circular DNA with the original enzyme and a second restriction enzyme cleavage near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of a same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At genomic level, the analysis of 1000s of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. In consistency with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Short, interspersed, and repetitive DNA sequences in Spiroplasma species.
Nur, I; LeBlanc, D J; Tully, J G
1987-03-01
Small fragments of DNA from an 8-kbp plasmid, pRA1, from a plant pathogenic strain of Spiroplasma citri were shown previously to be present in the chromosomal DNA of at least two species of Spiroplasma. We describe here the shot-gun cloning of chromosomal DNA from S. citri Maroc and the identification of two distinct sequences exhibiting homology to pRA1. Further subcloning experiments provided specific molecular probes for the identification of these two sequences in chromosomal DNA from three distinct plant pathogenic species of Spiroplasma. The results of Southern blot hybridization indicated that each of the pRA1-associated sequences is present as multiple copies in short, dispersed, and repetitive sequences in the chromosomes of these three strains. None of the sequences was detectable in chromosomal DNA from an additional nine Spiroplasma strains examined.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-extensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting. abstract.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
Biological nanopore MspA for DNA sequencing
NASA Astrophysics Data System (ADS)
Manrao, Elizabeth A.
Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
Effects of sequence on DNA wrapping around histones
NASA Astrophysics Data System (ADS)
Ortiz, Vanessa
2011-03-01
A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai
2013-01-01
Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
Sequencing of Oligourea Foldamers by Tandem Mass Spectrometry
NASA Astrophysics Data System (ADS)
Bathany, Katell; Owens, Neil W.; Guichard, Gilles; Schmitter, Jean-Marie
2013-03-01
This study is focused on sequence analysis of peptidomimetic helical oligoureas by means of tandem mass spectrometry, to build a basis for de novo sequencing for future high-throughput combinatorial library screening of oligourea foldamers. After the evaluation of MS/MS spectra obtained for model compounds with either MALDI or ESI sources, we found that the MALDI-TOF-TOF instrument gave more satisfactory results. MS/MS spectra of oligoureas generated by decay of singly charged precursor ions show major ion series corresponding to fragmentation across both CO-NH and N'H-CO urea bonds. Oligourea backbones fragment to produce a pattern of a, x, b, and y type fragment ions. De novo decoding of spectral information is facilitated by the occurrence of low mass reporter ions, representative of constitutive monomers, in an analogous manner to the use of immonium ions for peptide sequencing.
An extended sequence specificity for UV-induced DNA damage.
Chung, Long H; Murray, Vincent
2018-01-01
The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantages of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Church, George M.; Kieffer-Higgins, Stephen
1992-01-01
This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Genomic sequencing of Pleistocene cave bears
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noonan, James P.; Hofreiter, Michael; Smith, Doug
2005-04-01
Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of {approx}1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome,more » the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.« less
M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald
2006-01-01
Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...
Fractal landscape analysis of DNA walks
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.
1992-01-01
By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidences to support our findings.
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
A new approach recently developed for detecting cytosine DNA methylation (mC) and analyzing the genome-scale DNA methylation profiling, is called BS-Seq which is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide an insight into the difference of genome-scale DNA methylation among different organisms, but also reveal the conservation of DNA methylation in all contexts and nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful to under-stand the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and maintaining silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps of DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are transferred to thymine (T) and adenine (A), and cytosine in reads is transferred to thymine, respectively. We also comprehensively review the main content of the DNA methylation analysis on the genomic scale: (1) the cytosine methylation under the context of different sequences; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for the nucleotides; (4) DNA- protein interaction sites of DNA methylation; (5) degree of methylation of cytosine in the different structural elements of genes. DNA methylation analysis technique provides a powerful tool for the epigenome study in human and other species, and genes and environment interaction, and founds the theoretical basis for further development of disease diagnostics and therapeutics in human.
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.
Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo
2016-01-25
DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status to the order of a single nucleotide. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files as well as adding extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. As a result CpG PatternFinder was developed for this purpose. The main functions of the CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information. (ii) To tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment. (iii) To align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status. And, (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
DNA-based watermarks using the DNA-Crypt algorithm.
Heider, Dominik; Barnekow, Angelika
2007-05-29
The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms.
DNA-based watermarks using the DNA-Crypt algorithm
Heider, Dominik; Barnekow, Angelika
2007-01-01
Background The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms. PMID:17535434
Conserved Sequences at the Origin of Adenovirus DNA Replication
Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.
1982-01-01
The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. Images PMID:7143575
Hardware Acceleration Of Multi-Deme Genetic Algorithm for DNA Codeword Searching
2008-01-01
C and G are complementary to each other. A Watson - Crick complement of a DNA sequence is another DNA sequence which replaces all the A with T or vise...versa and replaces all the T with A or vise versa, and also switches the 5’ and 3’ ends. A DNA sequence binds most stably with its Watson - Crick ...bind with 5 Watson - Crick pairs. The length of the longest complementary sequence between two flexible DNA strands, A and B, is the same as the
Bjourson, A J; Stone, C E; Cooper, J E
1992-01-01
A novel subtraction hybridization procedure, incorporating a combination of four separation strategies, was developed to isolate unique DNA sequences from a strain of Rhizobium leguminosarum bv. trifolii. Sau3A-digested DNA from this strain, i.e., the probe strain, was ligated to a linker and hybridized in solution with an excess of pooled subtracter DNA from seven other strains of the same biovar which had been restricted, ligated to a different, biotinylated, subtracter-specific linker, and amplified by polymerase chain reaction to incorporate dUTP. Subtracter DNA and subtracter-probe hybrids were removed by phenol-chloroform extraction of a streptavidin-biotin-DNA complex. NENSORB chromatography of the sequences remaining in the aqueous layer captured biotinylated subtracter DNA which may have escaped removal by phenol-chloroform treatment. Any traces of contaminating subtracter DNA were removed by digestion with uracil DNA glycosylase. Finally, remaining sequences were amplified by polymerase chain reaction with a probe strain-specific primer, labelled with 32P, and tested for specificity in dot blot hybridizations against total genomic target DNA from each strain in the subtracter pool. Two rounds of subtraction-amplification were sufficient to remove cross-hybridizing sequences and to give a probe which hybridized only with homologous target DNA. The method is applicable to the isolation of DNA and RNA sequences from both procaryotic and eucaryotic cells. Images PMID:1637166
Combinatorial Labeling Method for Improving Peptide Fragmentation in Mass Spectrometry
NASA Astrophysics Data System (ADS)
Kuchibhotla, Bhanuramanand; Kola, Sankara Rao; Medicherla, Jagannadham V.; Cherukuvada, Swamy V.; Dhople, Vishnu M.; Nalam, Madhusudhana Rao
2017-06-01
Annotation of peptide sequence from tandem mass spectra constitutes the central step of mass spectrometry-based proteomics. Peptide mass spectra are obtained upon gas-phase fragmentation. Identification of the protein from a set of experimental peptide spectral matches is usually referred as protein inference. Occurrence and intensity of these fragment ions in the MS/MS spectra are dependent on many factors such as amino acid composition, peptide basicity, activation mode, protease, etc. Particularly, chemical derivatizations of peptides were known to alter their fragmentation. In this study, the influence of acetylation, guanidinylation, and their combination on peptide fragmentation was assessed initially on a lipase (LipA) from Bacillus subtilis followed by a bovine six protein mix digest. The dual modification resulted in improved fragment ion occurrence and intensity changes, and this resulted in the equivalent representation of b- and y-type fragment ions in an ion trap MS/MS spectrum. The improved representation has allowed us to accurately annotate the peptide sequences de novo. Dual labeling has significantly reduced the false positive protein identifications in standard bovine six peptide digest. Our study suggests that the combinatorial labeling of peptides is a useful method to validate protein identifications for high confidence protein inference. [Figure not available: see fulltext.
Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes
NASA Astrophysics Data System (ADS)
Roxbury, Daniel
It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation solution, which suggested an energy-dependent pathway. Additionally, by means of pharmacological inhibition and vector-induced gene knockout studies, the DNA-SWCNTs were shown to enter the cells via Rac1-mediated macropinocytosis.
Development of a Novel Technology for Label Free DNA Sequencing
2012-05-21
of the C-H bond stretch vibrations in the planes of the corresponding DNA bases , and in the higher-frequency side, sequence-identifier region is...composed of the N-H bond stretch vibrations in the planes of the corresponding DNA bases . In addition, the sequence-identifier dividing region almost...regions are localized at the corresponding DNA bases and exhibit a definable dependence on the sequence form of the codons under study. Final
Flow cytometry for enrichment and titration in massively parallel DNA sequencing
Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim
2009-01-01
Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.
Álvarez-Martos, Isabel; Ferapontova, Elena E
2017-08-05
A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57 nucleotides long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases for their DNA analogues is not able of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus, is not an aptamer and cannot be used neither for in vivo nor in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
Method for sequencing DNA base pairs
Sessler, Andrew M.; Dawson, John
1993-01-01
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.
van der Kuyl, A C; Kuiken, C L; Dekker, J T; Perizonius, W R; Goudsmit, J
1995-06-01
Monkey mummy bones and teeth originating from the North Saqqara Baboon Galleries (Egypt), soft tissue from a mummified baboon in a museum collection, and nineteenth/twentieth-century skin fragments from mangabeys were used for DNA extraction and PCR amplification of part of the mitochondrial 12S rRNA gene. Sequences aligning with the 12S rRNA gene were recovered but were only distantly related to contemporary monkey mitochondrial 12S rRNA sequences. However, many of these sequences were identical or closely related to human nuclear DNA sequences resembling mitochondrial 12S rRNA (isolated from a cell line depleted in mitochondria) and therefore have to be considered contamination. Subsequently in a separate study we were able to recover genuine mitochondrial 12S rRNA sequences from many extant species of nonhuman Old World primates and sequences closely resembling the human nuclear integrations. Analysis of all sequences by the neighbor-joining (NJ) method indicated that mitochondrial DNA sequences and their nuclear counterparts can be divided into two distinct clusters. One cluster contained all temporary cytoplasmic mitochondrial DNA sequences and approximately half of the monkey nuclear mitochondriallike sequences. A second cluster contained most human nuclear sequences and the other half of monkey nuclear sequences with a separate branch leading to human and gorilla mitochondrial and nuclear sequences. Sequences recovered from ancient materials were equally divided between the two clusters. These results constitute a warning for when working with ancient DNA or performing phylogenetic analysis using mitochondrial DNA as a target sequence: Nuclear counterparts of mitochondrial genes may lead to faulty interpretation of results.
Sequence independent amplification of DNA
Bohlander, S.K.
1998-03-24
The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.
Sequence independent amplification of DNA
Bohlander, Stefan K.
1998-01-01
The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.
Baldock, Brandi L; Hutchison, James E
2016-12-20
DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Ancient DNA sequence revealed by error-correcting codes
Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo
2015-01-01
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie
2013-05-01
Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min
2012-01-01
DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
Integrated sequencing of exome and mRNA of large-sized single cells.
Wang, Lily Yan; Guo, Jiajie; Cao, Wei; Zhang, Meng; He, Jiankui; Li, Zhoufang
2018-01-10
Current approaches of single cell DNA-RNA integrated sequencing are difficult to call SNPs, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell. We then conducted exome-sequencing on the isolated nuclei and mRNA-sequencing on the enucleated cytoplasm. For single oocytes, exome-seq can cover up to 92% of exome region with an average sequencing depth of 10+, while mRNA-sequencing reveals more than 10,000 expressed genes in enucleated cytoplasm, with similar performance for intact oocytes. This approach provides unprecedented opportunities to study DNA-RNA regulation, such as RNA editing at single nucleotide level in oocytes. In future, this method can also be applied to other large cells, including neurons, large dendritic cells and large tumour cells for integrated exome and transcriptome sequencing.
Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W
2015-04-11
Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
Genomics dataset of unidentified disclosed isolates.
Rekadwad, Bhagwan N
2016-09-01
Analysis of DNA sequences is necessary for higher hierarchical classification of the organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset is chosen to find complexities in the unidentified DNA in the disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. The quick response codes were generated. AT/GC content of the DNA sequences analysis was carried out. The QR is helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage code and enzyme code studied under the restriction digestion study, which helpful for performing studies using short DNA sequences was reported. The dataset disclosed here is the new revelatory data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment
Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Rackwitz, Jenny; Bald, Ilko
2018-03-26
During cancer radiation therapy high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy (<20 eV) electrons, which are able to damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG) 2 is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A) 2 confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT) 2 to 5'-(GGG ATT) 4 ), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K + ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Improved multiple displacement amplification (iMDA) and ultraclean reagents.
Motley, S Timothy; Picuri, John M; Crowder, Chris D; Minich, Jeremiah J; Hofstadler, Steven A; Eshoo, Mark W
2014-06-06
Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA). A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols were developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA was compared to that of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. The iMDA DNA and genomic DNA sequencing had comparable coverage 99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage while maintaining both balance and representation of the genome. The iMDA protocol in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
Schneider, T D
2001-12-01
The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation ( approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
Molecular design of sequence specific DNA alkylating agents.
Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi
2009-01-01
Sequence-specific DNA alkylating agents have great interest for novel approach to cancer chemotherapy. We designed the conjugates between pyrrole (Py)-imidazole (Im) polyamides and DNA alkylating chlorambucil moiety possessing at different positions. The sequence-specific DNA alkylation by conjugates was investigated by using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide chlorambucil conjugates alkylate DNA at flanking adenines in recognition sequences of Py-Im polyamides, however, the reactivities and alkylation sites were influenced by the positions of conjugation. In addition, we synthesized conjugate between Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). DNA alkylation reactivies by both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies would promote development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
Utility of 16S rDNA Sequencing for Identification of Rare Pathogenic Bacteria.
Loong, Shih Keng; Khor, Chee Sieng; Jafar, Faizatul Lela; AbuBakar, Sazaly
2016-11-01
Phenotypic identification systems are established methods for laboratory identification of bacteria causing human infections. Here, the utility of phenotypic identification systems was compared against 16S rDNA identification method on clinical isolates obtained during a 5-year study period, with special emphasis on isolates that gave unsatisfactory identification. One hundred and eighty-seven clinical bacteria isolates were tested with commercial phenotypic identification systems and 16S rDNA sequencing. Isolate identities determined using phenotypic identification systems and 16S rDNA sequencing were compared for similarity at genus and species level, with 16S rDNA sequencing as the reference method. Phenotypic identification systems identified ~46% (86/187) of the isolates with identity similar to that identified using 16S rDNA sequencing. Approximately 39% (73/187) and ~15% (28/187) of the isolates showed different genus identity and could not be identified using the phenotypic identification systems, respectively. Both methods succeeded in determining the species identities of 55 isolates; however, only ~69% (38/55) of the isolates matched at species level. 16S rDNA sequencing could not determine the species of ~20% (37/187) of the isolates. The 16S rDNA sequencing is a useful method over the phenotypic identification systems for the identification of rare and difficult to identify bacteria species. The 16S rDNA sequencing method, however, does have limitation for species-level identification of some bacteria highlighting the need for better bacterial pathogen identification tools. © 2016 Wiley Periodicals, Inc.
Mapping the Space of Genomic Signatures
Kari, Lila; Hill, Kathleen A.; Sayem, Abu S.; Karamichalis, Rallis; Bryans, Nathaniel; Davis, Katelyn; Dattani, Nikesh S.
2015-01-01
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber. PMID:26000734
The number of reduced alignments between two DNA sequences
2014-01-01
Background In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification Primary 92B05, 33C20, secondary 39A14, 65Q30 PMID:24684679
Novel numerical and graphical representation of DNA sequences and proteins.
Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D
2006-12-01
We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.
Montesino, Marta; Prieto, Lourdes
2012-01-01
Cycle sequencing reaction with Big-Dye terminators provides the methodology to analyze mtDNA Control Region amplicons by means of capillary electrophoresis. DNA sequencing with ddNTPs or terminators was developed by (1). The progressive automation of the method by combining the use of fluorescent-dye terminators with cycle sequencing has made it possible to increase the sensibility and efficiency of the method and hence has allowed its introduction into the forensic field. PCR-generated mitochondrial DNA products are the templates for sequencing reactions. Different set of primers can be used to generate amplicons with different sizes according to the quality and quantity of the DNA extract providing sequence data for different ranges inside the Control Region.
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic pattern is expected as a standard line of attack for recognizing DNA sequence in identification of gene and similar problems. But peculiarly very little significant work is done in this direction. This paper studies statistical properties of DNA sequences of complete genome using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings and standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
Sequencing of adenine in DNA by scanning tunneling microscopy
NASA Astrophysics Data System (ADS)
Tanaka, Hiroyuki; Taniguchi, Masateru
2017-08-01
The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-09-24
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-01-01
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490
Crystal structure of MboIIA methyltransferase.
Osipiuk, Jerzy; Walsh, Martin A; Joachimiak, Andrzej
2003-09-15
DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 A resolution the crystal structure of a beta-class DNA MTase MboIIA (M.MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M.MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules in the asymmetric unit which we propose to resemble the dimer when M.MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M.RsrI. However, the cofactor-binding pocket in M.MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.
Wang, Bo; Lee, Chang-Han; Johnson, Erik L; Kluwe, Christien A; Cunningham, Josephine C; Tanno, Hidetaka; Crooks, Richard M; Georgiou, George; Ellington, Andrew D
2016-01-01
Ricin is a toxin that could potentially be used as a bioweapon. We identified anti-ricin A chain antibodies by sequencing the antibody repertoire from immunized mice and by selecting high affinity antibodies using yeast surface display. These methods led to the isolation of multiple antibodies with high (sub-nanomolar) affinity. Interestingly, the antibodies identified by the 2 independent approaches are from the same clonal lineages, indicating for the first time that yeast surface display can identify native antibodies. The new antibodies represent well-characterized reagents for biodefense diagnostics and therapeutics development.
Genomics dataset on unclassified published organism (patent US 7547531).
Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier
2016-12-01
Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism. With the intention that, DNA sequence analysis is very crucial to learn about hierarchical classification of that particular organism. This dataset (patent US 7547531) is chosen to simplify all the complex raw data buried in undisclosed DNA sequences which help to open doors for new collaborations. In this data, a total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from NCBI BioSample database. Quick response (QR) code of those DNA sequences was constructed by DNA BarID tool. QR code is useful for the identification and comparison of isolates with other organisms. AT/GC content of the DNA sequences was determined using ENDMEMO GC Content Calculator, which indicates their stability at different temperature. The highest GC content was observed in GP445188 (62.5%) which was followed by GP445198 (61.8%) and GP445189 (59.44%), while lowest was in GP445178 (24.39%). In addition, New England BioLabs (NEB) database was used to identify cleavage code indicating the 5, 3 and blunt end and enzyme code indicating the methylation site of the DNA sequences was also shown. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
Fluorescent DNA-templated silver nanoclusters
NASA Astrophysics Data System (ADS)
Lin, Ruoqian
Because of the ultra-small size and biocompatibility of silver nanoclusters, they have attracted much research interest for their applications in biolabeling. Among the many ways of synthesizing silver nanoclusters, DNA templated method is particularly attractive---the high tunability of DNA sequences provides another degree of freedom for controlling the chemical and photophysical properties. However, systematic studies about how DNA sequences and concentrations are controlling the photophysical properties are still lacking. The aim of this thesis is to investigate the binding mechanisms of silver clusters binding and single stranded DNAs. Here in this thesis, we report synthesis and characterization of DNA-templated silver nanoclusters and provide a systematic interrogation of the effects of DNA concentrations and sequences, including lengths and secondary structures. We performed a series of syntheses utilizing five different sequences to explore the optimal synthesis condition. By characterizing samples with UV-vis and fluorescence spectroscopy, we achieved the most proper reactants ratio and synthesis conditions. Two of them were chosen for further concentration dependence studies and sequence dependence studies. We found that cytosine-rich sequences are more likely to produce silver nanoclusters with stronger fluorescence signals; however, sequences with hairpin secondary structures are more capable in stabilizing silver nanoclusters. In addition, the fluorescence peak emission intensities and wavelengths of the DNA templated silver clusters have sequence dependent fingerprints. This potentially can be applied to sequence sensing in the future. However all the current conclusions are not warranted; there is still difficulty in formulating general rules in DNA strand design and silver nanocluster production. Further investigation of more sequences could solve these questions in the future.
Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L
1986-01-01
Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. Images PMID:3458221
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and Protein sequence analysis program written with the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot- codon adaptation index according to Sharp and Li. The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can be therefore computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
Sequence-dependent DNA deformability studied using molecular dynamics simulations.
Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori
2007-01-01
Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
NASA Astrophysics Data System (ADS)
Ma, Song-Shan; Xu, Hui; Wang, Huan-You; Guo, Rui
2009-08-01
This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of øac(ω) ~ ω2 ln2(1/ω). Also AC conductivity of DNA sequences increases with the increase of temperature, this phenomenon presents characteristics of weak temperature-dependence. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit in low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of DNA sequence decreases with the increase of p, while for p >= 0.5, the conductivity increases with the increase of p.
Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe
2010-01-01
Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.
1999-01-01
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.
2000-01-01
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
DNABIT Compress - Genome compression algorithm.
Rajarajeswari, Pothuraju; Apparao, Allam
2011-01-22
Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress" for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non repetitive DNA sequence. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genome. Significantly better compression results show that "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (Genomes),our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve the best compression ratio as much as 1.58 bits/bases where the existing best methods could not achieve a ratio less than 1.72 bits/bases.
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, A.D.; Lysov, Y.P.; Dubley, S.A.
1999-05-18
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between the cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting the extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to the extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from the array. 14 figs.
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
USDA-ARS?s Scientific Manuscript database
We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results to prior phylogenetic results using plastid, nuclear, and mitochondrial DNA sequences. We obtained, using Illumina sequencing, full plastid sequences of 37 accessions of 20 Daucus taxa and outgrou...
USDA-ARS?s Scientific Manuscript database
A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
Clifford, Jacob; Adami, Christoph
2015-09-02
Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer
Johnson, Sarah S.; Zaikova, Elena; Goerlitz, David S.; Bai, Yu; Tighe, Scott W.
2017-01-01
The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions. PMID:28337073
Multiplexed Sequence Encoding: A Framework for DNA Communication.
Zakeri, Bijan; Carr, Peter A; Lu, Timothy K
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication-data encoding, data transfer & data extraction-and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system-Multiplexed Sequence Encoding (MuSE)-that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
Horn, T; Chang, C A; Urdea, M S
1997-12-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.
Horn, T; Chang, C A; Urdea, M S
1997-01-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido
2008-01-01
Background DNA sequences from ancient speciments may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960
High-Throughput Block Optical DNA Sequence Identification.
Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant
2018-01-01
Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.
Ares, Manuel
2014-10-01
This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
The disadvantage of combinatorial communication.
Lachmann, Michael; Bergstrom, Carl T.
2004-01-01
Combinatorial communication allows rapid and efficient transfer of detailed information, yet combinatorial communication is used by few, if any, non-human species. To complement recent studies illustrating the advantages of combinatorial communication, we highlight a critical disadvantage. We use the concept of information value to show that deception poses a greater and qualitatively different threat to combinatorial signalling than to non-combinatorial systems. This additional potential for deception may represent a strategic barrier that has prevented widespread evolution of combinatorial communication. Our approach has the additional benefit of drawing clear distinctions among several types of deception that can occur in communication systems. PMID:15556886
The disadvantage of combinatorial communication.
Lachmann, Michael; Bergstrom, Carl T
2004-11-22
Combinatorial communication allows rapid and efficient transfer of detailed information, yet combinatorial communication is used by few, if any, non-human species. To complement recent studies illustrating the advantages of combinatorial communication, we highlight a critical disadvantage. We use the concept of information value to show that deception poses a greater and qualitatively different threat to combinatorial signalling than to non-combinatorial systems. This additional potential for deception may represent a strategic barrier that has prevented widespread evolution of combinatorial communication. Our approach has the additional benefit of drawing clear distinctions among several types of deception that can occur in communication systems.
Spiroplasma species share common DNA sequences among their viruses, plasmids and genomes.
Ranhand, J M; Nur, I; Rose, D L; Tully, J G
1987-01-01
Alkaline-Southern-blot analyses showed that a spiroplasma plasmid, pRA1, obtained from Spiroplasma citri (Maroc-R8A2), contained DNA sequences that were homologous to spiroplasma type 3 viruses (SV3) obtained from S. citri (Maroc-R8A2), S. citri (608) and S. mirum (SMCA). In addition, pRA1 and SV3(608) DNA shared common, but not necessarily related, sequences with extrachromosomal DNA derived from 11 Spiroplasma species or strains. Furthermore, SV3(608) had DNA homology with the chromosome from 6 distinct spiroplasmas but not with chromosomal DNA from eight other Spiroplasma species or strains. The biological function of these common sequences is unknown.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX
2011-07-05
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM.TM. on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA.TM., on the 5' end.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA
2006-08-01
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
Method for sequencing DNA base pairs
Sessler, A.M.; Dawson, J.
1993-12-14
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.
Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang
2016-01-01
Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032
Continuous Influx of Genetic Material from Host to Virus Populations
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane
2016-01-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors. PMID:26829124
Continuous Influx of Genetic Material from Host to Virus Populations.
Gilbert, Clément; Peccoud, Jean; Chateigner, Aurélien; Moumen, Bouziane; Cordaux, Richard; Herniou, Elisabeth A
2016-02-01
Many genes of large double-stranded DNA viruses have a cellular origin, suggesting that host-to-virus horizontal transfer (HT) of DNA is recurrent. Yet, the frequency of these transfers has never been assessed in viral populations. Here we used ultra-deep DNA sequencing of 21 baculovirus populations extracted from two moth species to show that a large diversity of moth DNA sequences (n = 86) can integrate into viral genomes during the course of a viral infection. The majority of the 86 different moth DNA sequences are transposable elements (TEs, n = 69) belonging to 10 superfamilies of DNA transposons and three superfamilies of retrotransposons. The remaining 17 sequences are moth sequences of unknown nature. In addition to bona fide DNA transposition, we uncover microhomology-mediated recombination as a mechanism explaining integration of moth sequences into viral genomes. Many sequences integrated multiple times at multiple positions along the viral genome. We detected a total of 27,504 insertions of moth sequences in the 21 viral populations and we calculate that on average, 4.8% of viruses harbor at least one moth sequence in these populations. Despite this substantial proportion, no insertion of moth DNA was maintained in any viral population after 10 successive infection cycles. Hence, there is a constant turnover of host DNA inserted into viral genomes each time the virus infects a moth. Finally, we found that at least 21 of the moth TEs integrated into viral genomes underwent repeated horizontal transfers between various insect species, including some lepidopterans susceptible to baculoviruses. Our results identify host DNA influx as a potent source of genetic diversity in viral populations. They also support a role for baculoviruses as vectors of DNA HT between insects, and call for an evaluation of possible gene or TE spread when using viruses as biopesticides or gene delivery vectors.
NASA Technical Reports Server (NTRS)
Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.
1986-01-01
The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Royston, Kendra J.; Udayakumar, Neha; Lewis, Kayla; Tollefsbol, Trygve O.
2017-01-01
With cancer often classified as a disease that has an important epigenetic component, natural compounds that have the ability to regulate the epigenome become ideal candidates for study. Humans have a complex diet, which illustrates the need to elucidate the mechanisms of interaction between these bioactive compounds in combination. The natural compounds withaferin A (WA), from the Indian winter cherry, and sulforaphane (SFN), from cruciferous vegetables, have numerous anti-cancer effects and some report their ability to regulate epigenetic processes. Our study is the first to investigate the combinatorial effects of low physiologically achievable concentrations of WA and SFN on breast cancer cell proliferation, histone deacetylase1 (HDAC1) and DNA methyltransferases (DNMTs). No adverse effects were observed on control cells at optimal concentrations. There was synergistic inhibition of cellular viability in MCF-7 cells and a greater induction of apoptosis with the combinatorial approach than with either compound administered alone in both MDA-MB-231 and MCF-7 cells. HDAC expression was down-regulated at multiple levels. Lastly, we determined the combined effects of these bioactive compounds on the pro-apoptotic BAX and anti-apoptotic BCL-2 and found decreases in BCL-2 and increases in BAX. Taken together, our findings demonstrate the ability of low concentrations of combinatorial WA and SFN to promote cancer cell death and regulate key epigenetic modifiers in human breast cancer cells. PMID:28534825
Royston, Kendra J; Udayakumar, Neha; Lewis, Kayla; Tollefsbol, Trygve O
2017-05-19
With cancer often classified as a disease that has an important epigenetic component, natural compounds that have the ability to regulate the epigenome become ideal candidates for study. Humans have a complex diet, which illustrates the need to elucidate the mechanisms of interaction between these bioactive compounds in combination. The natural compounds withaferin A (WA), from the Indian winter cherry, and sulforaphane (SFN), from cruciferous vegetables, have numerous anti-cancer effects and some report their ability to regulate epigenetic processes. Our study is the first to investigate the combinatorial effects of low physiologically achievable concentrations of WA and SFN on breast cancer cell proliferation, histone deacetylase1 (HDAC1) and DNA methyltransferases (DNMTs). No adverse effects were observed on control cells at optimal concentrations. There was synergistic inhibition of cellular viability in MCF-7 cells and a greater induction of apoptosis with the combinatorial approach than with either compound administered alone in both MDA-MB-231 and MCF-7 cells. HDAC expression was down-regulated at multiple levels. Lastly, we determined the combined effects of these bioactive compounds on the pro-apoptotic BAX and anti-apoptotic BCL-2 and found decreases in BCL-2 and increases in BAX . Taken together, our findings demonstrate the ability of low concentrations of combinatorial WA and SFN to promote cancer cell death and regulate key epigenetic modifiers in human breast cancer cells.
Recognition of platinum-DNA adducts by HMGB1a.
Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V
2012-09-25
Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.
Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A
1982-01-01
Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. Images PMID:6281459
Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.
Sharma, S; Raina, S N
2005-01-01
A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes is reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.
Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed
2018-06-05
Anopheles sergentii known as the "oasis vector" or the "desert malaria vector" is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii of other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed closer evolutionary relationship between the Moroccan and Egyptian strains than the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Pastor, N; Pardo, L; Weinstein, H
1997-01-01
The binding of the TATA box-binding protein (TBP) to a TATA sequence in DNA is essential for eukaryotic basal transcription. TBP binds in the minor groove of DNA, causing a large distortion of the DNA helix. Given the apparent stereochemical equivalence of AT and TA basepairs in the minor groove, DNA deformability must play a significant role in binding site selection, because not all AT-rich sequences are bound effectively by TBP. To gain insight into the precise role that the properties of the TATA sequence have in determining the specificity of the DNA substrates of TBP, the solution structure and dynamics of seven DNA dodecamers have been studied by using molecular dynamics simulations. The analysis of the structural properties of basepair steps in these TATA sequences suggests a reason for the preference for alternating pyrimidine-purine (YR) sequences, but indicates that these properties cannot be the sole determinant of the sequence specificity of TBP. Rather, recognition depends on the interplay between the inherent deformability of the DNA and steric complementarity at the molecular interface. Images FIGURE 2 PMID:9251783
Competition between B-Z and B-L transitions in a single DNA molecule: Computational studies
NASA Astrophysics Data System (ADS)
Kwon, Ah-Young; Nam, Gi-Moon; Johner, Albert; Kim, Seyong; Hong, Seok-Cheol; Lee, Nam-Kyung
2016-02-01
Under negative torsion, DNA adopts left-handed helical forms, such as Z-DNA and L-DNA. Using the random copolymer model developed for a wormlike chain, we represent a single DNA molecule with structural heterogeneity as a helical chain consisting of monomers which can be characterized by different helical senses and pitches. By Monte Carlo simulation, where we take into account bending and twist fluctuations explicitly, we study sequence dependence of B-Z transitions under torsional stress and tension focusing on the interaction with B-L transitions. We consider core sequences, (GC) n repeats or (TG) n repeats, which can interconvert between the right-handed B form and the left-handed Z form, imbedded in a random sequence, which can convert to left-handed L form with different (tension dependent) helical pitch. We show that Z-DNA formation from the (GC) n sequence is always supported by unwinding torsional stress but Z-DNA formation from the (TG) n sequence, which are more costly to convert but numerous, can be strongly influenced by the quenched disorder in the surrounding random sequence.
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth
Glocke, Isabelle; Meyer, Matthias
2017-01-01
The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K
2004-01-01
The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequence of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
NASA Astrophysics Data System (ADS)
Leukhin, Anatolii N.
2005-08-01
The algebraic solution of a 'complex' problem of synthesis of phase-coded (PC) sequences with the zero level of side lobes of the cyclic autocorrelation function (ACF) is proposed. It is shown that the solution of the synthesis problem is connected with the existence of difference sets for a given code dimension. The problem of estimating the number of possible code combinations for a given code dimension is solved. It is pointed out that the problem of synthesis of PC sequences is related to the fundamental problems of discrete mathematics and, first of all, to a number of combinatorial problems, which can be solved, as the number factorisation problem, by algebraic methods by using the theory of Galois fields and groups.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Kai; Roberts, Gareth A.; Stephanou, Augoustinos S.
2010-07-23
Research highlights: {yields} Successful fusion of GFP to M.EcoKI DNA methyltransferase. {yields} GFP located at C-terminal of sequence specificity subunit does not later enzyme activity. {yields} FRET confirms structural model of M.EcoKI bound to DNA. -- Abstract: We describe the fusion of enhanced green fluorescent protein to the C-terminus of the HsdS DNA sequence-specificity subunit of the Type I DNA modification methyltransferase M.EcoKI. The fusion expresses well in vivo and assembles with the two HsdM modification subunits. The fusion protein functions as a sequence-specific DNA methyltransferase protecting DNA against digestion by the EcoKI restriction endonuclease. The purified enzyme shows Foerstermore » resonance energy transfer to fluorescently-labelled DNA duplexes containing the target sequence and to fluorescently-labelled ocr protein, a DNA mimic that binds to the M.EcoKI enzyme. Distances determined from the energy transfer experiments corroborate the structural model of M.EcoKI.« less
In silico evidence for sequence-dependent nucleosome sliding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lequieu, Joshua; Schwartz, David C.; de Pablo, Juan J.
Nucleosomes represent the basic building block of chromatin and provide an important mechanism by which cellular processes are controlled. The locations of nucleosomes across the genome are not random but instead depend on both the underlying DNA sequence and the dynamic action of other proteins within the nucleus. These processes are central to cellular function, and the molecular details of the interplay between DNA sequence and nudeosome dynamics remain poorly understood. In this work, we investigate this interplay in detail by relying on a molecular model, which permits development of a comprehensive picture of the underlying free energy surfaces andmore » the corresponding dynamics of nudeosome repositioning. The mechanism of nudeosome repositioning is shown to be strongly linked to DNA sequence and directly related to the binding energy of a given DNA sequence to the histone core. It is also demonstrated that chromatin remodelers can override DNA-sequence preferences by exerting torque, and the histone H4 tail is then identified as a key component by which DNA-sequence, histone modifications, and chromatin remodelers could in fact be coupled.« less
Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun
2011-11-15
We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA target. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in the use of each target DNA sequence for multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and the single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow level of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.
Relations between Shannon entropy and genome order index in segmenting DNA sequences.
Zhang, Yi
2009-04-01
Shannon entropy H and genome order index S are used in segmenting DNA sequences. Zhang [Phys. Rev. E 72, 041917 (2005)] found that the two schemes are equivalent when a DNA sequence is converted to a binary sequence of S (strong H bond) and W (weak H bond). They left the mathematical proof to mathematicians who are interested in this issue. In this paper, a possible mathematical explanation is given. Moreover, we find that Chargaff parity rule 2 is the necessary condition of the equivalence, and the equivalence disappears when a DNA sequence is regarded as a four-symbol sequence. At last, we propose that S-2(-H) may be related to species evolution.
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.; ...
2017-08-29
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen
2015-04-15
In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.
Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru
2015-01-01
The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
cgDNAweb: a web interface to the cgDNA sequence-dependent coarse-grain model of double-stranded DNA.
De Bruin, Lennart; Maddocks, John H
2018-06-14
The sequence-dependent statistical mechanical properties of fragments of double-stranded DNA is believed to be pertinent to its biological function at length scales from a few base pairs (or bp) to a few hundreds of bp, e.g. indirect read-out protein binding sites, nucleosome positioning sequences, phased A-tracts, etc. In turn, the equilibrium statistical mechanics behaviour of DNA depends upon its ground state configuration, or minimum free energy shape, as well as on its fluctuations as governed by its stiffness (in an appropriate sense). We here present cgDNAweb, which provides browser-based interactive visualization of the sequence-dependent ground states of double-stranded DNA molecules, as predicted by the underlying cgDNA coarse-grain rigid-base model of fragments with arbitrary sequence. The cgDNAweb interface is specifically designed to facilitate comparison between ground state shapes of different sequences. The server is freely available at cgDNAweb.epfl.ch with no login requirement.
First Complete Squash leaf curl China virus Genomic Segment DNA-A Sequence from East Timor
Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel
2017-01-01
ABSTRACT We present here the first complete Squash leaf curl China virus (SLCCV) genomic segment DNA-A sequence from East Timor. It was isolated from a pumpkin plant. When compared with 15 complete SLCCV DNA-A genome sequences from other world regions, it most resembled the Malaysian isolate MC1 sequence. PMID:28619789
Multiple tag labeling method for DNA sequencing
Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.
1995-01-01
A DNA sequencing method described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.
Molecular dynamics studies on the DNA-binding process of ERG.
Beuerle, Matthias G; Dufton, Neil P; Randi, Anna M; Gould, Ian R
2016-11-15
The ETS family of transcription factors regulate gene targets by binding to a core GGAA DNA-sequence. The ETS factor ERG is required for homeostasis and lineage-specific functions in endothelial cells, some subset of haemopoietic cells and chondrocytes; its ectopic expression is linked to oncogenesis in multiple tissues. To date details of the DNA-binding process of ERG including DNA-sequence recognition outside the core GGAA-sequence are largely unknown. We combined available structural and experimental data to perform molecular dynamics simulations to study the DNA-binding process of ERG. In particular we were able to reproduce the ERG DNA-complex with a DNA-binding simulation starting in an unbound configuration with a final root-mean-square-deviation (RMSD) of 2.1 Å to the core ETS domain DNA-complex crystal structure. This allowed us to elucidate the relevance of amino acids involved in the formation of the ERG DNA-complex and to identify Arg385 as a novel key residue in the DNA-binding process. Moreover we were able to show that water-mediated hydrogen bonds are present between ERG and DNA in our simulations and that those interactions have the potential to achieve sequence recognition outside the GGAA core DNA-sequence. The methodology employed in this study shows the promising capabilities of modern molecular dynamics simulations in the field of protein DNA-interactions.
King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach
2014-01-01
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Walther, Stefanie; Tietze, Manfred; Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S
2016-01-01
We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significant different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11-47 amino acid residues (aa)) a higher number of recombination was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism.
Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S.
2016-01-01
We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significant different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11–47 amino acid residues (aa)) a higher number of recombination was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism. PMID:27828971
2014-06-01
Specifically, we combined the CRISPR genome editing system with a novel approach allowing efficient single cell cloning of Drosophila cells with the aim of...and culture these to produce cultures completely lacking wildtype sequence at the target locus. No robust methods existed to clone single Drosophila ...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . 65 samples that displayed synthetic lethality (15 genes) or synthetic
Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.
2015-01-01
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens—particularly for use in phylogenetic analyses—has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622
Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A
2015-01-01
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis.
Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G
1987-12-01
The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., a 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that play a possible regulatory role for virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genome of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
Phylogenetic Network for European mtDNA
Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari
2001-01-01
The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229
Bandelt, Hans-Jürgen; Kloss-Brandstätter, Anita; Richards, Martin B; Yao, Yong-Gang; Logan, Ian
2014-02-01
Since the determination in 1981 of the sequence of the human mitochondrial DNA (mtDNA) genome, the Cambridge Reference Sequence (CRS), has been used as the reference sequence to annotate mtDNA in molecular anthropology, forensic science and medical genetics. The CRS was eventually upgraded to the revised version (rCRS) in 1999. This reference sequence is a convenient device for recording mtDNA variation, although it has often been misunderstood as a wild-type (WT) or consensus sequence by medical geneticists. Recently, there has been a proposal to replace the rCRS with the so-called Reconstructed Sapiens Reference Sequence (RSRS). Even if it had been estimated accurately, the RSRS would be a cumbersome substitute for the rCRS, as the new proposal fuses--and thus confuses--the two distinct concepts of ancestral lineage and reference point for human mtDNA. Instead, we prefer to maintain the rCRS and to report mtDNA profiles by employing the hitherto predominant circumfix style. Tree diagrams could display mutations by using either the profile notation (in conventional short forms where appropriate) or in a root-upwards way with two suffixes indicating ancestral and derived nucleotides. This would guard against misunderstandings about reporting mtDNA variation. It is therefore neither necessary nor sensible to change the present reference sequence, the rCRS, in any way. The proposed switch to RSRS would inevitably lead to notational chaos, mistakes and misinterpretations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy
Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three out group strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the averagemore » nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strains level within Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an idea alternative of conventional DNA-DNA hybridization methods and it is superior to the phylogenetics methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.« less
DNA Replication Profiling Using Deep Sequencing.
Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W
2018-01-01
Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.
Wang, Yongjie; Kleespies, Regina G; Ramle, Moslim B; Jehle, Johannes A
2008-09-01
The genomic sequence analysis of many large dsDNA viruses is hampered by the lack of enough sample materials. Here, we report a whole genome amplification of the Oryctes rhinoceros nudivirus (OrNV) isolate Ma07 starting from as few as about 10 ng of purified viral DNA by application of phi29 DNA polymerase- and exonuclease-resistant random hexamer-based multiple displacement amplification (MDA) method. About 60 microg of high molecular weight DNA with fragment sizes of up to 25 kbp was amplified. A genomic DNA clone library was generated using the product DNA. After 8-fold sequencing coverage, the 127,615 bp of OrNV whole genome was sequenced successfully. The results demonstrate that the MDA-based whole genome amplification enables rapid access to genomic information from exiguous virus samples.
Mak, Sarah Siu Tze; Gopalakrishnan, Shyam; Carøe, Christian; Geng, Chunyu; Liu, Shanlin; Sinding, Mikkel-Holger S; Kuderna, Lukas F K; Zhang, Wenwei; Fu, Shujin; Vieira, Filipe G; Germonpré, Mietje; Bocherens, Hervé; Fedorov, Sergey; Petersen, Bent; Sicheritz-Pontén, Thomas; Marques-Bonet, Tomas; Zhang, Guojie; Jiang, Hui; Gilbert, M Thomas P
2017-01-01
Abstract Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output are comparable with the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (v. 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for BGISEQ-500, P = 0.012). This may result from the differences in amplification cycles used to polymerase chain reaction–amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA. Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA. PMID:28854615
Yanagi, Tomohiro; Shirasawa, Kenta; Terachi, Mayuko; Isobe, Sachiko
2017-01-01
Cultivated strawberry ( Fragaria × ananassa Duch.) has homoeologous chromosomes because of allo-octoploidy. For example, two homoeologous chromosomes that belong to different sub-genome of allopolyploids have similar base sequences. Thus, when conducting de novo assembly of DNA sequences, it is difficult to determine whether these sequences are derived from the same chromosome. To avoid the difficulties associated with homoeologous chromosomes and demonstrate the possibility of sequencing allopolyploids using single chromosomes, we conducted sequence analysis using microdissected single somatic chromosomes of cultivated strawberry. Three hundred and ten somatic chromosomes of the Japanese octoploid strawberry 'Reiko' were individually selected under a light microscope using a microdissection system. DNA from 288 of the dissected chromosomes was successfully amplified using a DNA amplification kit. Using next-generation sequencing, we decoded the base sequences of the amplified DNA segments, and on the basis of mapping, we identified DNA sequences from 144 samples that were best matched to the reference genomes of the octoploid strawberry, F. × ananassa , and the diploid strawberry, F. vesca . The 144 samples were classified into seven pseudo-molecules of F. vesca . The coverage rates of the DNA sequences from the single chromosome onto all pseudo-molecular sequences varied from 3 to 29.9%. We demonstrated an efficient method for sequence analysis of allopolyploid plants using microdissected single chromosomes. On the basis of our results, we believe that whole-genome analysis of allopolyploid plants can be enhanced using methodology that employs microdissected single chromosomes.
Tsui, Nancy B. Y.; Jiang, Peiyong; Chow, Katherine C. K.; Su, Xiaoxi; Leung, Tak Y.; Sun, Hao; Chan, K. C. Allen; Chiu, Rossa W. K.; Lo, Y. M. Dennis
2012-01-01
Background Fetal DNA in maternal urine, if present, would be a valuable source of fetal genetic material for noninvasive prenatal diagnosis. However, the existence of fetal DNA in maternal urine has remained controversial. The issue is due to the lack of appropriate technology to robustly detect the potentially highly degraded fetal DNA in maternal urine. Methodology We have used massively parallel paired-end sequencing to investigate cell-free DNA molecules in maternal urine. Catheterized urine samples were collected from seven pregnant women during the third trimester of pregnancies. We detected fetal DNA by identifying sequenced reads that contained fetal-specific alleles of the single nucleotide polymorphisms. The sizes of individual urinary DNA fragments were deduced from the alignment positions of the paired reads. We measured the fractional fetal DNA concentration as well as the size distributions of fetal and maternal DNA in maternal urine. Principal Findings Cell-free fetal DNA was detected in five of the seven maternal urine samples, with the fractional fetal DNA concentrations ranged from 1.92% to 4.73%. Fetal DNA became undetectable in maternal urine after delivery. The total urinary cell-free DNA molecules were less intact when compared with plasma DNA. Urinary fetal DNA fragments were very short, and the most dominant fetal sequences were between 29 bp and 45 bp in length. Conclusions With the use of massively parallel sequencing, we have confirmed the existence of transrenal fetal DNA in maternal urine, and have shown that urinary fetal DNA was heavily degraded. PMID:23118982
Scott-Phillips, Thomas C; Blythe, Richard A
2013-11-06
In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here, we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication.
Effect of Base Sequence "Defects" on the Electrostatic Potential of Dissolved DNA
NASA Astrophysics Data System (ADS)
Adams, Scott V.; Wagner, Katrina; Kephart, Thomas S.; Edwards, Glenn
1997-11-01
An analytical model of the electrostatic potential surrounding dissolved DNA has been developed. The model consists of an all-atom, mathematically helical structure for DNA, in which the atoms are arranged in infinite lines of discrete point charges on concentric cylindrical surfaces. The surrounding solvent and counterions are treated with the Debye-Huckel approximation (Wagner et al., Biophysical Journal 73, 21-30, 1997). Variation in the electrostatic potential due to structural differences between A, B, and Z conformations and homopolymer base sequence is apparent. The most recent modification to the model exploits the principle of superposition to calculate the potential of DNA with a base sequence containing `defects.' That is, the base sequence is no longer uniform along the polymer. Differences between the potential of homopolymer DNA and the potential of DNA containing base `defects' are immediately obvious. These results may aid in understanding the role of electrostatics in base-sequence specificity exhibited by DNA-binding proteins.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Suhkmann; Zhang, Ziming; Upchurch, Sean
2004-04-16
2 ARID is a homologous family of DNA-binding domains that occur in DNA binding proteins from a wide variety of species, ranging from yeast to nematodes, insects, mammals and plants. SWI1, a member of the SWI/SNF protein complex that is involved in chromatin remodeling during transcription, contains the ARID motif. The ARID domain of human SWI1 (also known as p270) does not select for a specific DNA sequence from a random sequence pool. The lack of sequence specificity shown by the SWI1 ARID domain stands in contrast to the other characterized ARID domains, which recognize specific AT-rich sequences. We havemore » solved the three-dimensional structure of human SWI1 ARID using solution NMR methods. In addition, we have characterized non-specific DNA-binding by the SWI1 ARID domain. Results from this study indicate that a flexible long internal loop in ARID motif is likely to be important for sequence specific DNA-recognition. The structure of human SWI1 ARID domain also represents a distinct structural subfamily. Studies of ARID indicate that boundary of the DNA binding structural and functional domains can extend beyond the sequence homologous region in a homologous family of proteins. Structural studies of homologous domains such as ARID family of DNA-binding domains should provide information to better predict the boundary of structural and functional domains in structural genomic studies. Key Words: ARID, SWI1, NMR, structural genomics, protein-DNA interaction.« less
Gilley, D; Preer, J R; Aufderheide, K J; Polisky, B
1988-01-01
Paramecium tetraurelia can be transformed by microinjection of cloned serotype A gene sequences into the macronucleus. Transformants are detected by their ability to express serotype A surface antigen from the injected templates. After injection, the DNA is converted from a supercoiled form to a linear form by cleavage at nonrandom sites. The linear form appears to replicate autonomously as a unit-length molecule and is present in transformants at high copy number. The injected DNA is further processed by the addition of paramecium-type telomeric sequences to the termini of the linear DNA. To examine the fate of injected linear DNA molecules, plasmid pSA14SB DNA containing the A gene was cleaved into two linear pieces, a 14-kilobase (kb) piece containing the A gene and flanking sequences and a 2.2-kb piece consisting of the procaryotic vector. In transformants expressing the A gene, we observed that two linear DNA species were present which correspond to the two species injected. Both species had Paramecium telomerelike sequences added to their termini. For the 2.2-kb DNA, we show that the site of addition of the telomerelike sequences is directly at one terminus and within one nucleotide of the other terminus. These results indicate that injected procaryotic DNA is capable of autonomous replication in Paramecium macronuclei and that telomeric addition in the macronucleus does not require specific recognition sequences. Images PMID:3211128
Hiding message into DNA sequence through DNA coding and chaotic maps.
Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman
2014-09-01
The paper proposes an improved reversible substitution method to hide data into deoxyribonucleic acid (DNA) sequence, and four measures have been taken to enhance the robustness and enlarge the hiding capacity, such as encode the secret message by DNA coding, encrypt it by pseudo-random sequence, generate the relative hiding locations by piecewise linear chaotic map, and embed the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method has a better performance compared with the competing methods with respect to robustness and capacity.
A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing
Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante
2008-01-01
Summary A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small. PMID:18692465
NASA Technical Reports Server (NTRS)
Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.
2016-01-01
On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment the first time DNA has ever been sequenced in space and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.
Shih, Yao-Ming; Sun, Cheng-Pu; Chou, Hui-Hsien; Wu, Tzu-Hui; Chen, Chun-Chi; Wu, Ping-Yi; Enya Chen, Yu-Chen; Bissig, Karl-Dimiter; Tao, Mi-Hua
2015-01-01
Selection of escape mutants with mutations within the target sequence could abolish the antiviral RNA interference activity. Here, we investigated the impact of a pre-existing shRNA-resistant HBV variant on the efficacy of shRNA therapy. We previously identified a highly potent shRNA, S1, which, when delivered by an adeno-associated viral vector, effectively inhibits HBV replication in HBV transgenic mice. We applied the “PICKY” software to systemically screen the HBV genome, then used hydrodynamic transfection and HBV transgenic mice to identify additional six highly potent shRNAs. Human liver chimeric mice were infected with a mixture of wild-type and T472C HBV, a S1-resistant HBV variant, and then treated with a single or combined shRNAs. The presence of T472C mutant compromised the therapeutic efficacy of S1 and resulted in replacement of serum wild-type HBV by T472C HBV. In contrast, combinatorial therapy using S1 and P28, one of six potent shRNAs, markedly reduced titers for both wild-type and T472C HBV. Interestingly, treatment with P28 alone led to the emergence of escape mutants with mutations in the P28 target region. Our results demonstrate that combinatorial RNAi therapy can minimize the escape of resistant viral mutants in chronic HBV patients. PMID:26482836
Crystal structure of MboIIA methyltransferase.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osipiuk, J.; Walsh, M. A.; Joachimiak, A.
2003-09-15
DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 {angstrom} resolution the crystal structure of a {beta}-class DNA MTase MboIIA (M {center_dot} MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M {center_dot} MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules inmore » the asymmetric unit which we propose to resemble the dimer when M {center_dot} MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M {center_dot} RsrI. However, the cofactor-binding pocket in M {center_dot} MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.« less
Lin, Ya-Ying
2017-01-01
A portion of the mitochondrial cytochrome c oxidase I gene was sequenced using both genomic DNA and complement DNA from three planktonic copepod Neocalanus species (N. cristatus, N. plumchrus, and N. flemingeri). Small but critical sequence differences in CO1 were observed between gDNA and cDNA from N. plumchrus. Furthermore, careful observation revealed the presence of recombination between sequences in gDNA from N. plumchrus. Moreover, a chimera of the N. cristatus and N. plumchrus sequences was obtained from N. plumchrus gDNA. The observed phenomena can be best explained by the preferential amplification of the nuclear mitochondrial pseudogenes from gDNA of N. plumchrus. Two conclusions can be drawn from the observations. First, nuclear mitochondrial pseudogenes are pervasive in N. plumchrus. Second, a mating between a female N. cristatus and a male N. plumchrus produced viable offspring, which further backcrossed to a N. plumchrus individual. These observations not only demonstrate intriguing mating behavior in these species, but also emphasize the importance of careful interpretation of species marker sequences amplified from gDNA. PMID:28231343
Nagaki, Kiyotaka; Shibata, Fukashi; Kanatani, Asaka; Kashihara, Kazunari; Murata, Minoru
2012-04-01
The centromere is a multi-functional complex comprising centromeric DNA and a number of proteins. To isolate unidentified centromeric DNA sequences, centromere-specific histone H3 variants (CENH3) and chromatin immunoprecipitation (ChIP) have been utilized in some plant species. However, anti-CENH3 antibody for ChIP must be raised in each species because of its species specificity. Production of the antibodies is time-consuming and costly, and it is not easy to produce ChIP-grade antibodies. In this study, we applied a HaloTag7-based chromatin affinity purification system to isolate centromeric DNA sequences in tobacco. This system required no specific antibody, and made it possible to apply a highly stringent wash to remove contaminated DNA. As a result, we succeeded in isolating five tandem repetitive DNA sequences in addition to the centromeric retrotransposons that were previously identified by ChIP. Three of the tandem repeats were centromere-specific sequences located on different chromosomes. These results confirm the validity of the HaloTag7-based chromatin affinity purification system as an alternative method to ChIP for isolating unknown centromeric DNA sequences. The discovery of more than two chromosome-specific centromeric DNA sequences indicates the mosaic structure of tobacco centromeres. © Springer-Verlag 2011
Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays
Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko
2014-01-01
The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13 mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)3TT with X = A, G, C and Y = T, BrU 5-bromouracil and found absolute strand break cross sections between 2.66 · 10−14 cm2 and 7.06 · 10−14 cm2. The highest cross section was found for 5′-TT(ATA)3TT and 5′-TT(ABrUA)3TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346
Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2006-02-01
We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetrinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.
Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange.
Borgogno, María V; Monti, Mariela R; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E; Pezza, Roberto J
2016-03-04
Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3' end of the initiating DNA strand have a small effect, whereas most mismatches near the 5' end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange*
Borgogno, María V.; Monti, Mariela R.; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E.; Pezza, Roberto J.
2016-01-01
Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3′ end of the initiating DNA strand have a small effect, whereas most mismatches near the 5′ end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. PMID:26709229
Horn, T; Chang, C A; Urdea, M S
1997-12-01
The divergent synthesis of bDNA structures is described. This new type of branched DNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branching network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb molecules were assembled on a solid support using parameters optimized for bDNA synthesis. The chemistry was used to synthesize bDNA comb molecules containing 15 secondary sequences. The bDNA comb molecules were elaborated by enzymatic ligation into branched amplification multimers, large bDNA molecules (a total of 1068 nt) containing an average of 36 repeated DNA oligomer sequences, each capable of hybridizing specifically to an alkaline phosphatase-labeled oligonucleotide. The bDNA comb molecules were characterized by electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The branched amplification multimers have been used as signal amplifiers in nucleic acid quantification assays for detection of viral infection. It is possible to detect as few as 50 molecules with bDNA technology.
Horn, T; Chang, C A; Urdea, M S
1997-01-01
The divergent synthesis of bDNA structures is described. This new type of branched DNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branching network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb molecules were assembled on a solid support using parameters optimized for bDNA synthesis. The chemistry was used to synthesize bDNA comb molecules containing 15 secondary sequences. The bDNA comb molecules were elaborated by enzymatic ligation into branched amplification multimers, large bDNA molecules (a total of 1068 nt) containing an average of 36 repeated DNA oligomer sequences, each capable of hybridizing specifically to an alkaline phosphatase-labeled oligonucleotide. The bDNA comb molecules were characterized by electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The branched amplification multimers have been used as signal amplifiers in nucleic acid quantification assays for detection of viral infection. It is possible to detect as few as 50 molecules with bDNA technology. PMID:9365266
Gmünder, H; Kuratli, K; Keck, W
1995-01-01
The quinolones inhibit the A subunit of DNA gyrase in the presence of Mg2+ by interrupting the DNA breakage and resealing steps, and the latter step is also retarded without quinolones if Mg2+ is replaced by Ca2+. Pyrimido[1,6-a]benzimidazoles have been found to represent a new class of potent DNA gyrase inhibitors which also act at the A subunit. To determine alterations in the DNA sequence specificity of DNA gyrase for cleavage sites in the presence of inhibitors of both classes or in the presence of Ca2+, we used DNA restriction fragments of 164, 85, and 71 bp from the pBR322 plasmid as model substrates. Each contained, at a different position, the 20-bp pBR322 sequence around position 990, where DNA gyrase preferentially cleaves in the presence of quinolones. Our results show that pyrimido[1,6-a]benzimidazoles have a mode of action similar to that of quinolones; they inhibit the resealing step and influence the DNA sequence specificity of DNA gyrase in the same way. Differences between inhibitors of both classes could be observed only in the preferences of DNA gyrase for these cleavage sites. The 20-bp sequence appeared to have some properties that induced DNA gyrase to cleave all three DNA fragments in the presence of inhibitors within this sequence, whereas cleavage in the presence of Ca2+ was in addition dependent on the length of the DNA fragments. PMID:7695300
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.
Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca
2015-01-01
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Quantitative DNA fiber mapping
Gray, Joe W.; Weier, Heinz-Ulrich G.
1998-01-01
The present invention relates generally to the DNA mapping and sequencing technologies. In particular, the present invention provides enhanced methods and compositions for the physical mapping and positional cloning of genomic DNA. The present invention also provides a useful analytical technique to directly map cloned DNA sequences onto individual stretched DNA molecules.
Berger, C; Berger, B; Parson, W
2012-01-01
In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.
DNA Shape Dominates Sequence Affinity in Nucleosome Formation
NASA Astrophysics Data System (ADS)
Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.
2014-10-01
Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.
Detection of Merkel Cell Polyomavirus DNA in Serum Samples of Healthy Blood Donors
Mazzoni, Elisa; Rotondo, John C.; Marracino, Luisa; Selvatici, Rita; Bononi, Ilaria; Torreggiani, Elena; Touzé, Antoine; Martini, Fernanda; Tognon, Mauro G.
2017-01-01
Merkel cell polyomavirus (MCPyV) has been detected in 80% of Merkel cell carcinomas (MCC). In the host, the MCPyV reservoir remains elusive. MCPyV DNA sequences were revealed in blood donor buffy coats. In this study, MCPyV DNA sequences were investigated in the sera (n = 190) of healthy blood donors. Two MCPyV DNA sequences, coding for the viral oncoprotein large T antigen (LT), were investigated using polymerase chain reaction (PCR) methods and DNA sequencing. Circulating MCPyV sequences were detected in sera with a prevalence of 2.6% (5/190), at low-DNA viral load, which is in the range of 1–4 and 1–5 copies/μl by real-time PCR and droplet digital PCR, respectively. DNA sequencing carried out in the five MCPyV-positive samples indicated that the two MCPyV LT sequences which were analyzed belong to the MKL-1 strain. Circulating MCPyV LT sequences are present in blood donor sera. MCPyV-positive samples from blood donors could represent a potential vehicle for MCPyV infection in receivers, whereas an increase in viral load may occur with multiple blood transfusions. In certain patient conditions, such as immune-depression/suppression, additional disease or old age, transfusion of MCPyV-positive samples could be an additional risk factor for MCC onset. PMID:29238698
Previously unknown and highly divergent ssDNA viruses populate the oceans.
Labonté, Jessica M; Suttle, Curtis A
2013-11-01
Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.
CRITICA: coding region identification tool invoking comparative analysis
NASA Technical Reports Server (NTRS)
Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)
1999-01-01
Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in/pub/critica) and on the World Wide Web (http:/(/)rdpwww.life.uiuc.edu).
Osipiuk, J; Joachimiak, A
1997-09-12
We propose that the dnaK operon of Thermus thermophilus HB8 is composed of three functionally linked genes: dnaK, grpE, and dnaJ. The dnaK and dnaJ gene products are most closely related to their cyanobacterial homologs. The DnaK protein sequence places T. thermophilus in the plastid Hsp70 subfamily. In contrast, the grpE translated sequence is most similar to GrpE from Clostridium acetobutylicum, a Gram-positive anaerobic bacterium. A single promoter region, with homology to the Escherichia coli consensus promoter sequences recognized by the sigma70 and sigma32 transcription factors, precedes the postulated operon. This promoter is heat-shock inducible. The dnaK mRNA level increased more than 30 times upon 10 min of heat shock (from 70 degrees C to 85 degrees C). A strong transcription terminating sequence was found between the dnaK and grpE genes. The individual genes were cloned into pET expression vectors and the thermophilic proteins were overproduced at high levels in E. coli and purified to homogeneity. The recombinant T. thermophilus DnaK protein was shown to have a weak ATP-hydrolytic activity, with an optimum at 90 degrees C. The ATPase was stimulated by the presence of GrpE and DnaJ. Another open reading frame, coding for ClpB heat-shock protein, was found downstream of the dnaK operon.
Multiple tag labeling method for DNA sequencing
Mathies, R.A.; Huang, X.C.; Quesada, M.A.
1995-07-25
A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.
Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E
2010-06-01
A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.
Multiplexed Sequence Encoding: A Framework for DNA Communication
Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646
Simon, J W; Slabas, A R
1998-09-18
The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence, for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS), for which the plant cDNA has not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant, in MCAT activity restores growth demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.