dna sequence assembly: Topics by Science.gov

Sample records for dna sequence assembly

Scar-less multi-part DNA assembly design automation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan J.

The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In another exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
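The three steps in this abstract (ordered fragments in, oligos designed, flanking homology planned) can be illustrated with a minimal Python sketch. It is not the patented method: the function name `plan_homology_tails`, the 20-base default overlap, and the choice to put the whole overlap on the forward primer of the downstream fragment are all assumptions made for illustration, and the sketch covers only a linear assembly.

```python
def plan_homology_tails(fragments, overlap=20):
    """Plan 5' homology tails for an ordered, linear assembly.

    fragments: list of (name, sequence) tuples in assembly order.
    overlap:   number of bases of the upstream fragment to copy onto the
               forward primer of the next fragment (hypothetical default).

    Returns one dict per fragment with the tail to prepend to its
    forward primer; the first fragment gets no tail.
    """
    if overlap <= 0:
        raise ValueError("overlap must be a positive number of bases")
    plan = []
    for i, (name, _seq) in enumerate(fragments):
        if i == 0:
            tail = ""  # nothing upstream of the first fragment
        else:
            upstream_seq = fragments[i - 1][1]
            tail = upstream_seq[-overlap:]  # last `overlap` bases upstream
        plan.append({"name": name, "forward_tail": tail})
    return plan


example = [("promoter", "AAAACCCC"), ("gene", "GGGGTTTT")]
for entry in plan_homology_tails(example, overlap=4):
    print(entry)
```

A real design tool would also check melting temperature, secondary structure, and mis-priming for each tail; none of that is modeled here.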
Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Lienert, F; Boehm, CR

2014-08-07

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.
Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

PubMed Central

Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

2016-01-01

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

PubMed

Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
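Three of the gap properties named in this abstract (GC content, read coverage, gap length) are simple to compute once a gap sequence and its per-base read depths are available. The Python sketch below is illustrative only: the function names and the input format (a sequence string plus a list of depths) are assumptions, not the authors' pipeline.

```python
def gc_content(seq):
    """Fraction of G/C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0  # no unambiguous bases, e.g. a run of N
    return sum(b in "GC" for b in bases) / len(bases)


def summarize_gap(seq, read_depths):
    """Collect gap length, GC content, and mean read coverage for one gap.

    read_depths is a hypothetical list of per-base read depths over the gap.
    """
    mean_cov = sum(read_depths) / len(read_depths) if read_depths else 0.0
    return {
        "gap_length": len(seq),
        "gc_content": gc_content(seq),
        "mean_coverage": mean_cov,
    }


print(summarize_gap("GGCCAATT", [10, 20, 30]))
```

Secondary-structure propensity, the other property the study examined, needs a folding model and is deliberately left out of this sketch.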
GENESUS: a two-step sequence design program for DNA nanostructure self-assembly.

PubMed

Tsutsumi, Takanobu; Asakawa, Takeshi; Kanegami, Akemi; Okada, Takao; Tahira, Tomoko; Hayashi, Kenshi

2014-01-01

DNA has been recognized as an ideal material for bottom-up construction of nanometer scale structures by self-assembly. The generation of sequences optimized for unique self-assembly (GENESUS) program reported here is a straightforward method for generating sets of strand sequences optimized for self-assembly of arbitrarily designed DNA nanostructures by a generate-candidates-and-choose-the-best strategy. A scalable procedure to prepare single-stranded DNA having arbitrary sequences is also presented. Strands for the assembly of various structures were designed and successfully constructed, validating both the program and the procedure.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lapidus, Alla L.

From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile, the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
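To make the two quoted error rates concrete, the short Python sketch below converts them into an expected number of consensus errors. The 5 Mb genome size is a hypothetical example chosen for round numbers, and `expected_errors` is an illustrative helper, not part of any assembly tool.

```python
def expected_errors(genome_length_bp, bp_per_error):
    """Expected number of consensus errors at a rate of 1 error per bp_per_error bases."""
    return genome_length_bp / bp_per_error


genome_length = 5_000_000  # hypothetical 5 Mb bacterial genome

draft = expected_errors(genome_length, 2_000)      # draft rate: about 1/2000 bp
finished = expected_errors(genome_length, 10_000)  # finished rate: about 1/10,000 bp

print(f"draft: {draft:.0f} errors, finished: {finished:.0f} errors")
```

Under these assumptions, finishing reduces the expected error count fivefold, from 2,500 to 500.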
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...

2017-07-18

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

PubMed Central

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

PubMed

Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

2010-05-07

Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.
Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

PubMed Central

Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

2010-01-01

Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877
j5 DNA assembly design automation.

PubMed

Hillson, Nathan J

2014-01-01

Modern standardized methodologies, described in detail in the previous chapters of this book, have enabled the software-automated design of optimized DNA construction protocols. This chapter describes how to design (combinatorial) scar-less DNA assembly protocols using the web-based software j5. j5 helps biomedical and biotechnological researchers construct DNA by automating the design of optimized protocols for flanking homology sequence as well as type IIS endonuclease-mediated DNA assembly methodologies. Unlike any other software tool available today, j5 designs scar-less combinatorial DNA assembly protocols, performs a cost-benefit analysis to identify which portions of an assembly process would be less expensive to outsource to a DNA synthesis service provider, and designs hierarchical DNA assembly strategies to mitigate anticipated poor assembly junction sequence performance. Software integrated with j5 adds significant value to the j5 design process through graphical user-interface enhancement and downstream liquid-handling robotic laboratory automation.
Periodic Assembly of Nanospecies on Repetitive DNA Sequences Generated on Gold Nanoparticles by Rolling Circle Amplification

NASA Astrophysics Data System (ADS)

Zhao, Weian; Brook, Michael A.; Li, Yingfu

Periodical assembly of nanospecies is desirable for the construction of nanodevices. We provide a protocol for the preparation of a gold nanoparticle (AuNP)/DNA scaffold on which nanospecies can be assembled in a periodical manner. AuNP/DNA scaffold is prepared by growing long single-stranded DNA (ssDNA) molecules (typically hundreds of nanometers to a few microns in length) on AuNPs via rolling circle amplification (RCA). Since these long ssDNA molecules contain many repetitive sequence units, complementary DNA-attached nanospecies can be assembled through specific hybridization in a controllable and periodical manner.
CAPRRESI: Chimera Assembly by Plasmid Recovery and Restriction Enzyme Site Insertion.

PubMed

Santillán, Orlando; Ramírez-Romero, Miguel A; Dávila, Guillermo

2017-06-25

Here, we present chimera assembly by plasmid recovery and restriction enzyme site insertion (CAPRRESI). CAPRRESI benefits from many strengths of the original plasmid recovery method and introduces restriction enzyme digestion to ease DNA ligation reactions (required for chimera assembly). For this protocol, users clone wildtype genes into the same plasmid (pUC18 or pUC19). After the in silico selection of amino acid sequence regions where chimeras should be assembled, users obtain all the synonym DNA sequences that encode them. Ad hoc Perl scripts enable users to determine all synonym DNA sequences. After this step, another Perl script searches for restriction enzyme sites on all synonym DNA sequences. This in silico analysis is also performed using the ampicillin resistance gene (ampR) found on pUC18/19 plasmids. Users design oligonucleotides inside synonym regions to disrupt wildtype and ampR genes by PCR. After obtaining and purifying complementary DNA fragments, restriction enzyme digestion is accomplished. Chimera assembly is achieved by ligating appropriate complementary DNA fragments. pUC18/19 vectors are selected for CAPRRESI because they offer technical advantages, such as small size (2,686 base pairs), high copy number, advantageous sequencing reaction features, and commercial availability. The usage of restriction enzymes for chimera assembly eliminates the need for DNA polymerases yielding blunt-ended products. CAPRRESI is a fast and low-cost method for fusing protein-coding genes.
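The in silico part of this protocol (enumerate every synonymous DNA sequence for an amino acid region, then search those sequences for restriction enzyme sites) can be sketched in a few lines. The abstract says the authors use ad hoc Perl scripts; the Python below is an independent illustration, not their code. The function names are hypothetical, the standard genetic code is assumed, and EcoRI (GAATTC) is used only as a familiar example site.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def synonymous_sequences(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    codon_choices = [SYNONYMS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*codon_choices)]


def sequences_with_site(peptide, site):
    """Return the synonymous sequences that contain the recognition site."""
    return [s for s in synonymous_sequences(peptide) if site in s]


# Glu-Phe can be encoded so that it creates an EcoRI site.
print(sequences_with_site("EF", "GAATTC"))
```

The number of synonymous sequences grows multiplicatively with peptide length, so this brute-force enumeration is only practical for short regions.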
Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

PubMed Central

Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

2003-01-01

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
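The relationship between the reported 22% redundancy and the estimate of 33,620 unique genes can be checked with simple arithmetic. The Python sketch below uses only the figures quoted in the abstract; the variable names are illustrative, and the reading that the small mismatch comes from rounding the percentage is an assumption, not something the abstract states.

```python
assembled_transcripts = 43_141   # putative transcripts from the first assembly
reported_redundancy = 0.22       # redundancy found on reassembly (rounded)
reported_unique_genes = 33_620   # unique genes stated in the abstract

# Unique genes implied by applying the rounded 22% figure directly.
estimated_unique = assembled_transcripts * (1 - reported_redundancy)

# Redundancy implied by the reported unique-gene count.
implied_redundancy = 1 - reported_unique_genes / assembled_transcripts

print(f"estimated unique genes: {estimated_unique:.0f}")
print(f"implied redundancy: {implied_redundancy:.1%}")
```

Applying 22% exactly gives about 33,650 genes, while the reported 33,620 corresponds to a redundancy of roughly 22.1%, so the two figures agree to within rounding.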
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

PubMed

Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
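The idea of multi-level partitioning (a genome design split into segments, each segment split into synthesizable subblocks) can be illustrated with a minimal Python sketch. This is not the Genome Partitioner algorithm: it ignores overlaps between blocks, synthesis constraints, and sequence content, and it assumes that 783 kb means exactly 783,000 bp. The function names are hypothetical.

```python
def partition(length, block_size):
    """Split [0, length) into consecutive half-open intervals of block_size.

    The last interval may be shorter than block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (start, min(start + block_size, length))
        for start in range(0, length, block_size)
    ]


def multilevel_partition(length, segment_size, subblock_size):
    """Two-level partition: segments first, then subblocks in each segment."""
    plan = []
    for seg_start, seg_end in partition(length, segment_size):
        subblocks = [
            (seg_start + s, seg_start + e)
            for s, e in partition(seg_end - seg_start, subblock_size)
        ]
        plan.append({"segment": (seg_start, seg_end), "subblocks": subblocks})
    return plan


plan = multilevel_partition(783_000, 20_000, 1_000)
print(len(plan), "segments;", sum(len(p["subblocks"]) for p in plan), "subblocks")
```

With these sizes the sketch yields 40 segments (the last one 3 kb long) and 783 subblocks of 1 kb each.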
Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing

PubMed Central

Eastman, Alexander W.; Yuan, Ze-Chun

2015-01-01

Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID:25653642
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications

PubMed Central

Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner. PMID:28531174
j5 v2.8.4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan

j5 automates and optimizes the design of the molecular biological process of cloning/constructing DNA. j5 enables users to benefit from (combinatorial) multi-part scar-less SLIC, Gibson, CPEC, Golden Gate assembly, or variants thereof, for which automation software does not currently exist, without the intense labor currently associated with the process. j5 inputs a list of the DNA sequences to be assembled, along with a GenBank, FASTA, jbei-seq, or SBOL v1.1 format sequence file for each DNA source. Given the list of DNA sequences to be assembled, j5 first determines the cost-minimizing assembly strategy for each part (direct synthesis, PCR/SOE, or oligo-embedding), designs DNA oligos with Primer3, adds flanking homology sequences (SLIC, Gibson, and CPEC; optimized with Primer3 for CPEC) or optimized overhang sequences (Golden Gate) to the oligos and direct synthesis pieces, and utilizes BLAST to check against oligo mis-priming and assembly piece incompatibility events. After identifying DNA oligos that are already contained within a local collection for reuse, the program estimates the total cost of direct synthesis and new oligos to be ordered. In the instance that j5 identifies putative assembly piece incompatibilities (multiple pieces with high flanking sequence homology), the program suggests hierarchical subassemblies where possible. The program outputs a comma-separated value (CSV) file, viewable via Excel or other spreadsheet software, that contains assembly design information (such as the PCR/SOE reactions to perform, their anticipated sizes and sequences, etc.) as well as a properly annotated GenBank file containing the sequence resulting from the assembly, and appends the local oligo library with the oligos to be ordered. j5 condenses multiple independent assembly projects into 96-well format for high-throughput liquid-handling robotics platforms, and generates configuration files for the PR-PR biology-friendly robot programming language.
j5 thus provides a new way to design DNA assembly procedures much more productively and efficiently, not only in terms of time, but also in terms of cost. To a large extent, however, j5 does not allow people to do something that could not be done before by hand given enough time and effort. An exception to this is that, since the very act of using j5 to design the DNA assembly process standardizes the experimental details and workflow, j5 enables a single person to concurrently perform the independent DNA construction tasks of an entire group of researchers. Currently, this is not readily possible, since separate researchers employ disparate design strategies and workflows, and furthermore, their designs and workflows are very infrequently fully captured in an electronic format which is conducive to automation.
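The first step described above, choosing the cost-minimizing strategy for each part among direct synthesis, PCR/SOE, and oligo-embedding, can be pictured with a toy cost comparison. The Python sketch below is not j5's cost model: every price, length threshold, and parameter name is a hypothetical placeholder, and the rules for when a strategy is available (a template for PCR, a short part for embedding) are simplifying assumptions.

```python
def cheapest_strategy(
    part_length,
    has_template,
    synth_cost_per_bp=0.10,    # hypothetical synthesis price per bp
    synth_min_cost=50.0,       # hypothetical minimum synthesis order
    oligo_cost_per_base=0.20,  # hypothetical oligo price per base
    primer_length=40,          # hypothetical primer length incl. tails
    max_embed_length=60,       # hypothetical limit for oligo-embedding
):
    """Return (strategy_name, cost) with the lowest estimated cost."""
    options = {
        "direct synthesis": max(synth_min_cost, part_length * synth_cost_per_bp)
    }
    if has_template:
        # PCR needs a physical template and two new primers.
        options["PCR/SOE"] = 2 * primer_length * oligo_cost_per_base
    if part_length <= max_embed_length:
        # Short parts can be carried entirely inside a single oligo.
        options["oligo-embedding"] = part_length * oligo_cost_per_base
    return min(options.items(), key=lambda item: item[1])


print(cheapest_strategy(1000, has_template=True))
print(cheapest_strategy(1000, has_template=False))
print(cheapest_strategy(30, has_template=False))
```

Under these placeholder prices, a 1 kb part with a template is amplified by PCR, the same part without a template is synthesized, and a 30 bp part is embedded in an oligo.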
Synthesis of DNA

DOEpatents

Mariella, Jr., Raymond P.

2008-11-18

A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.

A Programmable DNA Double-Write Material: Synergy of Photolithography and Self-Assembly Nanofabrication.

PubMed

Song, Youngjun; Takahashi, Tsukasa; Kim, Sejung; Heaney, Yvonne C; Warner, John; Chen, Shaochen; Heller, Michael J

2017-01-11

We demonstrate a DNA double-write process that uses UV to pattern a uniquely designed DNA write material, which produces two distinct binding identities for hybridizing two different complementary DNA sequences. The process requires no modification to the DNA by chemical reagents and allows programmed DNA self-assembly and further UV patterning in the UV exposed and nonexposed areas. Multilayered DNA patterning with hybridization of fluorescently labeled complementary DNA sequences, biotin probe/fluorescent streptavidin complexes, and DNA patterns with 500 nm line widths were all demonstrated.
One-pot DNA construction for synthetic biology: the Modular Overlap-Directed Assembly with Linkers (MODAL) strategy

PubMed Central

Casini, Arturo; MacDonald, James T.; Jonghe, Joachim De; Christodoulou, Georgia; Freemont, Paul S.; Baldwin, Geoff S.; Ellis, Tom

2014-01-01

Overlap-directed DNA assembly methods allow multiple DNA parts to be assembled together in one reaction. These methods, which rely on sequence homology between the ends of DNA parts, have become widely adopted in synthetic biology, despite being incompatible with a key principle of engineering: modularity. To answer this, we present MODAL: a Modular Overlap-Directed Assembly with Linkers strategy that brings modularity to overlap-directed methods, allowing assembly of an initial set of DNA parts into a variety of arrangements in one-pot reactions. MODAL is accompanied by a custom software tool that designs overlap linkers to guide assembly, allowing parts to be assembled in any specified order and orientation. The in silico design of synthetic orthogonal overlapping junctions allows for much greater efficiency in DNA assembly for a variety of different methods compared with using non-designed sequence. In tests with three different assembly technologies, the MODAL strategy gives assembly of both yeast and bacterial plasmids, composed of up to five DNA parts in the kilobase range with efficiencies of between 75 and 100%. It also seamlessly allows mutagenesis to be performed on any specified DNA parts during the process, allowing the one-step creation of construct libraries valuable for synthetic biology applications. PMID:24153110
Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.

PubMed

Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron

2012-02-01

Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.
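The minimum depths quoted above (40× for rDNA, 30× for plastomes) can be related to a read budget with the usual mean-depth formula, depth = reads × read length × on-target fraction / target length. The sketch below is a back-of-the-envelope aid, not the simulation procedure used in the study; the function names and the `fraction_on_target` parameter are assumptions.

```python
import math


def mean_depth(n_reads, read_len, target_len, fraction_on_target=1.0):
    """Expected mean depth over a target, given how many reads land on it."""
    return n_reads * read_len * fraction_on_target / target_len


def reads_for_depth(depth, target_len, read_len, fraction_on_target=1.0):
    """Total reads needed to reach `depth` on a target of `target_len` bp."""
    return math.ceil(depth * target_len / (read_len * fraction_on_target))
```

In a genome-skimming run only a small share of reads derives from the plastome or rDNA, so `fraction_on_target` would have to be estimated per taxon and library; any value plugged in here is a guess until measured.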
Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tuskan, Gerald A; Gunter, Lee E; DiFazio, Stephen P

The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.
BAC sequencing using pooled methods.

PubMed

Saski, Christopher A; Feltus, F Alex; Parida, Laxmi; Haiminen, Niina

2015-01-01

Shotgun sequencing and assembly of a large, complex genome can be expensive, and accurately reconstructing the true genome sequence is challenging. Repetitive DNA arrays, paralogous sequences, polyploidy, and heterozygosity are the main factors that plague de novo genome sequencing projects, which typically yield highly fragmented assemblies from which biological meaning is difficult to extract. Targeted, sub-genomic sequencing offers complexity reduction by removing distal segments of the genome and a systematic mechanism for exploring prioritized genomic content through BAC sequencing. If one isolates and sequences the genome fraction that encodes the relevant biological information, then it is possible to reduce overall sequencing costs and efforts that target a genomic segment. This chapter describes the sub-genome assembly protocol for an organism based upon a BAC tiling path derived from a genome-scale physical map or from fine mapping using BACs to target sub-genomic regions. Methods that are described include BAC isolation and mapping, DNA sequencing, and sequence assembly.
Molecular simulations of assembly of functionalized spherical nanoparticles

NASA Astrophysics Data System (ADS)

Seifpour, Arezou

Precise assembly of nanoparticles is crucial for creating spatially engineered materials that can be used for photonics, photovoltaic, and metamaterials applications. One way to control nanoparticle assembly is by functionalizing the nanoparticle with ligands, such as polymers, DNA, and proteins, that can manipulate the interactions between the nanoparticles in the medium the particles are placed in. This thesis research aims to design ligands to provide a new route to the programmable assembly of nanoparticles. We first investigate using Monte Carlo simulation the effect of copolymer ligands on nanoparticle assembly. We first study a single nanoparticle grafted with many copolymer chains to understand how monomer sequence (e.g. alternating ABAB, or diblock AxBx) and chemistry of the copolymers affect the grafted chain conformation at various particle diameters, grafting densities, copolymer chain lengths, and monomer-monomer interactions in an implicit small molecule solvent. We find that the size of the grafted chain varies non-monotonically with increasing blockiness of the monomer sequence for a small particle diameter. From this first study, we selected the two sequences with the most different chain conformations---alternating and diblock---and studied the effect of the sequence and a range of monomer chemistries of the copolymer on the characteristics of assembly of multiple copolymer-functionalized nanoparticles. We find that the alternating sequence produces nanoclusters that are relatively isotropic, whereas diblock sequence tends to form anisotropic structures that are smaller and more compact when the block closer to the surface is attractive and larger loosely held together clusters when the outer block is attractive. Next, we conduct molecular dynamics simulations to study the effect of DNA ligands on nanoparticle assembly. Specifically we investigate the effect of grafted DNA strand composition (e.g. G/C content, placement and sequence) and bidispersity in DNA strand lengths on the thermodynamics and structure of assembly of functionalized nanoparticles. We find that higher G/C content increases cluster dissociation temperature for smaller particles. Placement of G/C block inward along the strand decreases number of neighbors within the assembled cluster. Finally, increased bidispersity in DNA strand lengths leads to a distribution of inter-particle distances in the assembled cluster.
Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

PubMed

Chin, Chen-Shan; Alexander, David H; Marks, Patrick; Klammer, Aaron A; Drake, James; Heiner, Cheryl; Clum, Alicia; Copeland, Alex; Huddleston, John; Eichler, Evan E; Turner, Stephen W; Korlach, Jonas

2013-06-01

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
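Consensus accuracy figures such as 99.999% are often restated as a Phred-scaled quality value, Q = -10·log10(1 - accuracy). The short sketch below shows that conversion and the expected number of residual errors for a genome of a given size; the function names are assumptions and the calculation is generic, not part of HGAP.

```python
import math


def accuracy_to_phred(accuracy):
    """Convert per-base accuracy (0 <= accuracy < 1) to a Phred quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(accuracy, genome_length):
    """Expected number of erroneous bases in a consensus of `genome_length` bp."""
    return (1.0 - accuracy) * genome_length
```

On this scale 99.999% accuracy corresponds to roughly Q50, or on the order of ten errors per million bases.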
Sequencing and assembly of the 22-gb loblolly pine genome.

PubMed

Zimin, Aleksey; Stevens, Kristian A; Crepeau, Marc W; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L; de Jong, Pieter J; Neale, David B; Salzberg, Steven L; Yorke, James A; Langley, Charles H

2014-03-01

Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.
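The N50 scaffold size reported above is a standard contiguity statistic: the length L such that scaffolds of length L or longer contain at least half of the assembled bases. A minimal sketch of the computation follows; the function name `n50` is an assumption, and real assembly reports may differ in details such as how gaps are counted.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

Because the statistic depends only on the sorted length list, it says nothing about correctness; a misassembled scaffold raises N50 just as a correct one does.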
Low-Cost, High-Throughput Sequencing of DNA Assemblies Using a Highly Multiplexed Nextera Process.

PubMed

Shapland, Elaine B; Holmes, Victor; Reeves, Christopher D; Sorokin, Elena; Durot, Maxime; Platt, Darren; Allen, Christopher; Dean, Jed; Serber, Zach; Newman, Jack; Chandran, Sunil

2015-07-17

In recent years, next-generation sequencing (NGS) technology has greatly reduced the cost of sequencing whole genomes, whereas the cost of sequence verification of plasmids via Sanger sequencing has remained high. Consequently, industrial-scale strain engineers either limit the number of designs or take short cuts in quality control. Here, we show that over 4000 plasmids can be completely sequenced in one Illumina MiSeq run for less than $3 each (15× coverage), which is a 20-fold reduction over using Sanger sequencing (2× coverage). We reduced the volume of the Nextera tagmentation reaction by 100-fold and developed an automated workflow to prepare thousands of samples for sequencing. We also developed software to track the samples and associated sequence data and to rapidly identify correctly assembled constructs having the fewest defects. As DNA synthesis and assembly become a centralized commodity, this NGS quality control (QC) process will be essential to groups operating high-throughput pipelines for DNA construction.
Noncanonical self-assembly of multifunctional DNA nanoflowers for biomedical applications.

PubMed

Zhu, Guizhi; Hu, Rong; Zhao, Zilong; Chen, Zhuo; Zhang, Xiaobing; Tan, Weihong

2013-11-06

DNA nanotechnology has been extensively explored to assemble various functional nanostructures for versatile applications. Mediated by Watson-Crick base-pairing, these DNA nanostructures have been conventionally assembled through hybridization of many short DNA building blocks. Here we report the noncanonical self-assembly of multifunctional DNA nanostructures, termed as nanoflowers (NFs), and the versatile biomedical applications. These NFs were assembled from long DNA building blocks generated via rolling circle replication (RCR) of a designer template. NF assembly was driven by liquid crystallization and dense packaging of building blocks, without relying on Watson-Crick base-pairing between DNA strands, thereby avoiding the otherwise conventional complicated DNA sequence design. NF sizes were readily tunable in a wide range, by simply adjusting such parameters as assembly time and template sequences. NFs were exceptionally resistant to nuclease degradation, denaturation, or dissociation at extremely low concentration, presumably resulting from the dense DNA packaging in NFs. The exceptional biostability is critical for biomedical applications. By rational design, NFs can be readily incorporated with myriad functional moieties. All these properties make NFs promising for versatile applications. As a proof-of-principle demonstration, in this study, NFs were integrated with aptamers, bioimaging agents, and drug loading sites, and the resultant multifunctional NFs were demonstrated for selective cancer cell recognition, bioimaging, and targeted anticancer drug delivery.
Noncanonical self-assembly of multifunctional DNA nanoflowers for biomedical applications

PubMed Central

Zhu, Guizhi; Hu, Rong; Zhao, Zilong; Chen, Zhuo; Zhang, Xiaobing; Tan, Weihong

2013-01-01

DNA nanotechnology has been extensively explored to assemble various functional nanostructures for versatile applications. Mediated by Watson-Crick base-pairing, these DNA nanostructures have been conventionally assembled through hybridization of many short DNA building blocks. Here we report the noncanonical self-assembly of multifunctional DNA nanostructures, termed as nanoflowers (NFs), and the versatile biomedical applications. These NFs were assembled from long DNA building blocks generated via Rolling Circle Replication (RCR) of a designer template. NF assembly was driven by liquid crystallization and dense packaging of building blocks, without relying on Watson-Crick base-pairing between DNA strands, thereby avoiding the otherwise conventional complicated DNA sequence design. NF sizes were readily tunable in a wide range, by simply adjusting such parameters as assembly time and template sequences. NFs were exceptionally resistant to nuclease degradation, denaturation, or dissociation at extremely low concentration, presumably resulting from the dense DNA packaging in NFs. The exceptional biostability is critical for biomedical applications. By rational design, NFs can be readily incorporated with myriad functional moieties. All these properties make NFs promising for versatile applications. As a proof-of-principle demonstration, in this study, NFs were integrated with aptamers, bioimaging agents, and drug loading sites, and the resultant multifunctional NFs were demonstrated for selective cancer cell recognition, bioimaging, and targeted anticancer drug delivery. PMID:24164620
BASIC: A Simple and Accurate Modular DNA Assembly Method.

PubMed

Storch, Marko; Casini, Arturo; Mackrow, Ben; Ellis, Tom; Baldwin, Geoff S

2017-01-01

Biopart Assembly Standard for Idempotent Cloning (BASIC) is a simple, accurate, and robust DNA assembly method. The method is based on linker-mediated DNA assembly and provides highly accurate DNA assembly with 99% correct assemblies for four parts and 90% correct assemblies for seven parts [1]. The BASIC standard defines a single entry vector for all parts flanked by the same prefix and suffix sequences and its idempotent nature means that the assembled construct is returned in the same format. Once a part has been adapted into the BASIC format it can be placed at any position within a BASIC assembly without the need for reformatting. This allows laboratories to grow comprehensive and universal part libraries and to share them efficiently. The modularity within the BASIC framework is further extended by the possibility of encoding ribosomal binding sites (RBS) and peptide linker sequences directly on the linkers used for assembly. This makes BASIC a highly versatile library construction method for combinatorial part assembly including the construction of promoter, RBS, gene variant, and protein-tag libraries. In comparison with other DNA assembly standards and methods, BASIC offers a simple robust protocol; it relies on a single entry vector, provides for easy hierarchical assembly, and is highly accurate for up to seven parts per assembly round [2].
Human Contamination in Public Genome Assemblies.

PubMed

Kryukov, Kirill; Imanishi, Tadashi

2016-01-01

Contamination in a genome assembly can lead to wrong or confusing results when that genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that, with high confidence, originate as contamination from human DNA. The majority of contaminating human sequences were present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for presence of human DNA before submitting them to public databases.
Directing folding pathways for multi-component DNA origami nanostructures with complex topology

NASA Astrophysics Data System (ADS)

Marras, A. E.; Zhou, L.; Kolliopoulos, V.; Su, H.-J.; Castro, C. E.

2016-05-01

Molecular self-assembly has become a well-established technique to design complex nanostructures and hierarchical mesoscale assemblies. The typical approach is to design binding complementarity into nucleotide or amino acid sequences to achieve the desired final geometry. However, with an increasing interest in dynamic nanodevices, the need to design structures with motion has necessitated the development of multi-component structures. While this has been achieved through hierarchical assembly of similar structural units, here we focus on the assembly of topologically complex structures, specifically with concentric components, where post-folding assembly is not feasible. We exploit the ability to direct folding pathways to program the sequence of assembly and present a novel approach of designing the strand topology of intermediate folding states to program the topology of the final structure, in this case a DNA origami slider structure that functions much like a piston-cylinder assembly in an engine. The ability to program the sequence and control orientation and topology of multi-component DNA origami nanostructures provides a foundation for a new class of structures with internal and external moving parts and complex scaffold topology. Furthermore, this work provides critical insight to guide the design of intermediate states along a DNA origami folding pathway and to further understand the details of DNA origami self-assembly to more broadly control folding states and landscapes.
Scarless assembly of unphosphorylated DNA fragments with a simplified DATEL method.

PubMed

Ding, Wenwen; Weng, Huanjiao; Jin, Peng; Du, Guocheng; Chen, Jian; Kang, Zhen

2017-05-04

Efficient assembly of multiple DNA fragments is a pivotal technology for synthetic biology. A scarless and sequence-independent DNA assembly method (DATEL) using thermal exonucleases has been developed recently. Here, we present a simplified DATEL (sDATEL) for efficient assembly of unphosphorylated DNA fragments with low cost. The sDATEL method is only dependent on Taq DNA polymerase and Taq DNA ligase. After optimizing the committed parameters of the reaction system such as pH and the concentration of Mg2+ and NAD+, the assembly efficiency was increased by 32-fold. To further improve the assembly capacity, the number of thermal cycles was optimized, resulting in successful assembly of 4 unphosphorylated DNA fragments with an accuracy of 75%. sDATEL could be a desirable method for routine manual and automated assembly.
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

PubMed Central

2009-01-01

Background Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. 
We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

PubMed

Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

2009-08-06

Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 
The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.
DNA assembly with error correction on a droplet digital microfluidics platform.

PubMed

Khilko, Yuliya; Weyman, Philip D; Glass, John I; Adams, Mark D; McNeil, Melanie A; Griffin, Peter B

2018-06-01

Custom synthesized DNA is in high demand for synthetic biology applications. However, current technologies to produce these sequences using assembly from DNA oligonucleotides are costly and labor-intensive. The automation and reduced sample volumes afforded by microfluidic technologies could significantly decrease materials and labor costs associated with DNA synthesis. The purpose of this study was to develop a gene assembly protocol utilizing a digital microfluidic device. Toward this goal, we adapted bench-scale oligonucleotide assembly methods followed by enzymatic error correction to the Mondrian™ digital microfluidic platform. We optimized Gibson assembly, polymerase chain reaction (PCR), and enzymatic error correction reactions in a single protocol to assemble 12 oligonucleotides into a 339-bp double-stranded DNA sequence encoding part of the human influenza virus hemagglutinin (HA) gene. The reactions were scaled down to 0.6-1.2 μL. Initial microfluidic assembly methods were successful and had an error frequency of approximately 4 errors/kb with errors originating from the original oligonucleotide synthesis. Relative to conventional benchtop procedures, PCR optimization required additional amounts of MgCl2, Phusion polymerase, and PEG 8000 to achieve amplification of the assembly and error correction products. After one round of error correction, error frequency was reduced to an average of 1.8 errors/kb. We demonstrated that DNA assembly from oligonucleotides and error correction could be completely automated on a digital microfluidic (DMF) platform. The results demonstrate that enzymatic reactions in droplets show a strong dependence on surface interactions, and successful on-chip implementation required supplementation with surfactants, molecular crowding agents, and an excess of enzyme.
Enzymatic error correction of assembled fragments improved sequence fidelity by 2-fold, which was a significant improvement but somewhat lower than expected compared to bench-top assays, suggesting an additional capacity for optimization.
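To put the reported error frequencies in practical terms, one can estimate what fraction of assembled molecules is expected to be error-free and how many clones would need to be screened to find a correct one. The sketch below assumes errors are independent and uniformly distributed along the sequence, which the study does not claim; the function names and the 95% confidence default are likewise assumptions.

```python
import math


def fraction_error_free(errors_per_kb, length_bp):
    """Probability that a molecule of `length_bp` carries no errors,
    assuming independent errors at a uniform per-base rate."""
    per_base = errors_per_kb / 1000.0
    return (1.0 - per_base) ** length_bp


def clones_to_screen(p_correct, confidence=0.95):
    """Smallest number of clones giving at least `confidence` probability
    that one or more of them is error-free."""
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_correct))
```

Under these assumptions, going from about 4 to about 1.8 errors/kb on a 339-bp product roughly doubles the expected share of perfect molecules, from about a quarter to a little over half.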
Sequence Polishing Library (SPL) v10.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oberortner, Ernst

The Sequence Polishing Library (SPL) is a suite of software tools to automate "Design for Synthesis and Assembly" workflows. Specifically: the SPL "Converter" tool converts files among the following sequence data exchange formats: CSV, FASTA, GenBank, and Synthetic Biology Open Language (SBOL); the SPL "Juggler" tool optimizes the codon usages of DNA coding sequences according to an optimization strategy, a user-specific codon usage table and genetic code. In addition, the SPL "Juggler" can translate amino acid sequences into DNA sequences; the SPL "Polisher" verifies DNA sequences against DNA synthesis constraints, such as GC content, repeating k-mers, and restriction sites. In case of violations, the "Polisher" reports the violations in a comprehensive manner. The "Polisher" tool can also modify the violating regions according to an optimization strategy, a user-specific codon usage table and genetic code; the SPL "Partitioner" decomposes large DNA sequences into smaller building blocks with partial overlaps that enable an efficient assembly. The "Partitioner" enables the user to configure the characteristics of the overlaps, which are mostly determined by the utilized assembly protocol, such as length, GC content, or melting temperature.
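The kind of verification attributed to the "Polisher" (GC content, repeating k-mers, restriction sites) can be sketched as follows. This is not SPL code or its API: every function name, the GC window of 0.3 to 0.7, the k-mer size of 8, and the example forbidden site are hypothetical placeholders chosen only to make the checks concrete.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def repeated_kmers(seq, k):
    """Map each k-mer that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def find_sites(seq, site):
    """Start positions of a recognition site on either strand."""
    hits = []
    for pattern in {site, reverse_complement(site)}:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def check_constraints(seq, gc_min=0.3, gc_max=0.7, k=8,
                      forbidden_sites=("GGTCTC",)):
    """Return human-readable violations; thresholds are illustrative only."""
    seq = seq.upper()
    violations = []
    gc = gc_content(seq)
    if not gc_min <= gc <= gc_max:
        violations.append(f"GC content {gc:.2f} outside [{gc_min}, {gc_max}]")
    for kmer, pos in sorted(repeated_kmers(seq, k).items()):
        violations.append(f"{k}-mer {kmer} repeated at positions {pos}")
    for site in forbidden_sites:
        for start in find_sites(seq, site):
            violations.append(f"site {site} (either strand) at position {start}")
    return violations
```

A synthesis vendor's actual constraints are usually windowed (for example, GC content per 50 to 100 bp) rather than global, so a production check would slide these tests along the sequence.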
Evaluation and validation of de novo and hybrid assembly techniques to derive high quality genome sequences

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn Marie; Land, Miriam L.; ...

2014-06-14

Our motivation with this work was to assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. As to availability and implementation, all assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.

Creation of a type IIS restriction endonuclease with a long recognition sequence

PubMed Central

Lippow, Shaun M.; Aha, Patti M.; Parker, Matthew H.; Blake, William J.; Baynes, Brian M.; Lipovšek, Daša

2009-01-01

Type IIS restriction endonucleases cleave DNA outside their recognition sequences, and are therefore particularly useful in the assembly of DNA from smaller fragments. A limitation of type IIS restriction endonucleases in assembly of long DNA sequences is the relative abundance of their target sites. To facilitate ligation-based assembly of extremely long pieces of DNA, we have engineered a new type IIS restriction endonuclease that combines the specificity of the homing endonuclease I-SceI with the type IIS cleavage pattern of FokI. We linked a non-cleaving mutant of I-SceI, which conveys to the chimeric enzyme its specificity for an 18-bp DNA sequence, to the catalytic domain of FokI, which cuts DNA at a defined site outside the target site. Whereas previously described chimeric endonucleases do not produce type IIS-like precise DNA overhangs suitable for ligation, our chimeric endonuclease cleaves double-stranded DNA exactly 2 and 6 nt from the target site to generate homogeneous, 5′, four-base overhangs, which can be ligated with 90% fidelity. We anticipate that these enzymes will be particularly useful in manipulation of DNA fragments larger than a thousand bases, which are very likely to contain target sites for all natural type IIS restriction endonucleases. PMID:19304757
Dual signal amplification for highly sensitive electrochemical detection of uropathogens via enzyme-based catalytic target recycling.

PubMed

Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun

2011-11-15

We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA targets. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in each target DNA sequence being used multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5 fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and the single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow levels of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.
DNA assembly technique simplifies the construction of infectious clone of fowl adenovirus.

PubMed

Zou, Xiao-Hui; Bi, Zhi-Xiang; Guo, Xiao-Juan; Zhang, Zun; Zhao, Yang; Wang, Min; Zhu, Ya-Lu; Jie, Hong-Ying; Yu, Yang; Hung, Tao; Lu, Zhuo-Zhuang

2018-07-01

Plasmids bearing an adenovirus genome are generally constructed by homologous recombination in the E. coli BJ5183 strain. Here, we utilized the Gibson gene assembly technique to generate an infectious clone of fowl adenovirus 4 (FAdV-4). Primers flanked with partial inverted terminal repeat (ITR) sequence of FAdV-4 were synthesized to amplify a plasmid backbone containing the kanamycin-resistance gene and pBR322 origin (KAN-ORI). DNA assembly was carried out by combining the KAN-ORI fragment, virus genomic DNA and DNA assembly master mix. E. coli competent cells were transformed with the assembled product, and plasmids (pKFAV4) were extracted and confirmed to contain the viral genome by restriction analysis and sequencing. Virus was successfully rescued from chicken LMH cells transfected with linear pKFAV4. This approach was further verified in cloning of the human adenovirus 5 genome. Our results indicated that the DNA assembly technique simplified the construction of infectious clones of adenovirus, suggesting its possible application in traditional or reverse genetics of viruses. Copyright © 2018 Elsevier B.V. All rights reserved.
Barcode extension for analysis and reconstruction of structures

NASA Astrophysics Data System (ADS)

Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng

2017-03-01

Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.
Single haplotype assembly of the human genome from a hydatidiform mole.

PubMed

Steinberg, Karyn Meltz; Schneider, Valerie A; Graves-Lindsay, Tina A; Fulton, Robert S; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C; Church, Deanna M; Eichler, Evan E; Wilson, Richard K

2014-12-01

A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content shows this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicates misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly. © 2014 Steinberg et al.; Published by Cold Spring Harbor Laboratory Press.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
AFEAP cloning: a precise and efficient method for large DNA sequence assembly.

PubMed

Zeng, Fanli; Zang, Jinping; Zhang, Suhua; Hao, Zhimin; Dong, Jingao; Lin, Yibin

2017-11-14

Recent development of DNA assembly technologies has spurred myriad advances in synthetic biology, but new tools are always required for complicated scenarios. Here, we have developed an alternative DNA assembly method named AFEAP cloning (Assembly of Fragment Ends After PCR), which allows scarless, modular, and reliable construction of biological pathways and circuits from basic genetic parts. The AFEAP method requires two rounds of PCR followed by ligation of the sticky ends of DNA fragments. The first PCR yields linear DNA fragments and is followed by a second asymmetric (one primer) PCR and subsequent annealing that inserts overlapping overhangs at both sides of each DNA fragment. The overlapping overhangs of the neighboring DNA fragments are annealed and the nicks are sealed by T4 DNA ligase, followed by bacterial transformation to yield the desired plasmids. We characterized the capability and limitations of the newly developed AFEAP cloning and demonstrated its application to assembling DNA in varying scenarios. Under the optimized conditions, AFEAP cloning allows assembly of an 8 kb plasmid from 1-13 fragments with high accuracy (between 80 and 100%), and 8.0, 11.6, 19.6, 28, and 35.6 kb plasmids from five fragments at 91.67, 91.67, 88.33, 86.33, and 81.67% fidelity, respectively. AFEAP cloning is also capable of constructing a bacterial artificial chromosome (BAC, 200 kb) with a fidelity of 46.7%. AFEAP cloning provides a powerful, efficient, seamless, and sequence-independent DNA assembly tool for multiple fragments (up to 13) and large DNA (up to 200 kb) that expands the synthetic biologist's toolbox.
Reducing assembly complexity of microbial genomes with single-molecule sequencing

USDA-ARS's Scientific Manuscript database

Genome assembly algorithms cannot fully reconstruct microbial chromosomes from the DNA reads output by first or second-generation sequencing instruments. Therefore, most genomes are left unfinished due to the significant resources required to manually close gaps left in the draft assemblies. Single-...
Nanopore DNA Sequencing and Genome Assembly on the International Space Station.

PubMed

Castro-Wallace, Sarah L; Chiu, Charles Y; John, Kristen K; Stahl, Sarah E; Rubins, Kathleen H; McIntyre, Alexa B R; Dworkin, Jason P; Lupisella, Mark L; Smith, David J; Botkin, Douglas J; Stephenson, Timothy A; Juul, Sissel; Turner, Daniel J; Izquierdo, Fernando; Federman, Scot; Stryke, Doug; Somasekar, Sneha; Alexander, Noah; Yu, Guixia; Mason, Christopher E; Burton, Aaron S

2017-12-21

We evaluated the performance of the MinION DNA sequencer in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained equimolar mixtures of genomic DNA from lambda bacteriophage, Escherichia coli (strain K12, MG1655) and Mus musculus (female BALB/c mouse). Nine sequencing runs were performed aboard the ISS over a 6-month period, yielding a total of 276,882 reads with no apparent decrease in performance over time. From sequence data collected aboard the ISS, we constructed directed assemblies of the ~4.6 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% consensus pairwise identity, respectively; de novo assembly of the E. coli genome from raw reads yielded a single contig comprising 99.9% of the genome at 98.6% consensus pairwise identity. Simulated real-time analyses of in-flight sequence data using an automated bioinformatic pipeline and laptop-based genomic assembly demonstrated the feasibility of sequencing analysis and microbial identification aboard the ISS. These findings illustrate the potential for sequencing applications including disease diagnosis, environmental monitoring, and elucidating the molecular basis for how organisms respond to spaceflight.
Bacillus nealsonii sp. nov., isolated from a spacecraft-assembly facility, whose spores are gamma-radiation resistant

NASA Technical Reports Server (NTRS)

Venkateswaran, Kasthuri; Kempf, Michael; Chen, Fei; Satomi, Masataka; Nicholson, Wayne; Kern, Roger

2003-01-01

One of the spore-formers isolated from a spacecraft-assembly facility, belonging to the genus Bacillus, is described on the basis of phenotypic characterization, 16S rDNA sequence analysis and DNA-DNA hybridization studies. It is a Gram-positive, facultatively anaerobic, rod-shaped eubacterium that produces endospores. The spores of this novel bacterial species exhibited resistance to UV, gamma-radiation, H2O2 and desiccation. The 16S rDNA sequence analysis revealed a clear affiliation between this strain and members of the low G+C Firmicutes. High 16S rDNA sequence similarity values were found with members of the genus Bacillus and this was supported by fatty acid profiles. The 16S rDNA sequence similarity between strain FO-92T and Bacillus benzoevorans DSM 5391T was very high. However, molecular characterizations employing small-subunit 16S rDNA sequences were at the limits of resolution for the differentiation of species in this genus, but DNA-DNA hybridization data support the proposal of FO-92T as Bacillus nealsonii sp. nov. (type strain is FO-92T = ATCC BAA-519T = DSM 15077T).
RPA binds histone H3-H4 and functions in DNA replication-coupled nucleosome assembly.

PubMed

Liu, Shaofeng; Xu, Zhiyun; Leng, He; Zheng, Pu; Yang, Jiayi; Chen, Kaifu; Feng, Jianxun; Li, Qing

2017-01-27

DNA replication-coupled nucleosome assembly is essential to maintain genome integrity and retain epigenetic information. Multiple involved histone chaperones have been identified, but how nucleosome assembly is coupled to DNA replication remains elusive. Here we show that replication protein A (RPA), an essential replisome component that binds single-stranded DNA, has a role in replication-coupled nucleosome assembly. RPA directly binds free H3-H4. Assays using a synthetic sequence that mimics freshly unwound single-stranded DNA at replication fork showed that RPA promotes DNA-(H3-H4) complex formation immediately adjacent to double-stranded DNA. Further, an RPA mutant defective in H3-H4 binding exhibited attenuated nucleosome assembly on nascent chromatin. Thus, we propose that RPA functions as a platform for targeting histone deposition to replication fork, through which RPA couples nucleosome assembly with ongoing DNA replication. Copyright © 2017, American Association for the Advancement of Science.
An accurate algorithm for the detection of DNA fragments from dilution pool sequencing experiments.

PubMed

Bansal, Vikas

2018-01-01

The short read lengths of current high-throughput sequencing technologies limit the ability to recover long-range haplotype information. Dilution pool methods for preparing DNA sequencing libraries from high molecular weight DNA fragments enable the recovery of long DNA fragments from short sequence reads. These approaches require computational methods for identifying the DNA fragments using aligned sequence reads and assembling the fragments into long haplotypes. Although a number of computational methods have been developed for haplotype assembly, the problem of identifying DNA fragments from dilution pool sequence data has not received much attention. We formulate the problem of detecting DNA fragments from dilution pool sequencing experiments as a genome segmentation problem and develop an algorithm that uses dynamic programming to optimize a likelihood function derived from a generative model for the sequence reads. This algorithm uses an iterative approach to automatically infer the mean background read depth and the number of fragments in each pool. Using simulated data, we demonstrate that our method, FragmentCut, has 25-30% greater sensitivity compared with an HMM based method for fragment detection and can also detect overlapping fragments. On a whole-genome human fosmid pool dataset, the haplotypes assembled using the fragments identified by FragmentCut had greater N50 length, 16.2% lower switch error rate and 35.8% lower mismatch error rate compared with two existing methods. We further demonstrate the greater accuracy of our method using two additional dilution pool datasets. FragmentCut is available from https://bansal-lab.github.io/software/FragmentCut. Contact: vibansal@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
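To give a feel for likelihood-based segmentation by dynamic programming, the Python sketch below labels fixed-size genome windows as background or fragment from their read counts. It is a deliberately simplified stand-in and not the FragmentCut model: the two fixed Poisson rates, the per-fragment opening penalty, and all names are assumptions made for illustration. The iterative rate estimation and the overlapping-fragment detection described in the abstract are not attempted.

```python
import math


def poisson_loglik(count, rate):
    """Log-probability of `count` under a Poisson with mean `rate` (> 0)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def segment_fragments(counts, bg_rate, frag_rate, open_penalty):
    """Return fragments as (start, end) window indices, end exclusive.

    Maximizes the summed Poisson log-likelihood of the window counts minus
    `open_penalty` for every fragment that is opened (hypothetical model).
    """
    n = len(counts)
    if n == 0:
        return []
    # score[s]: best score of a labelling that ends in state s
    # (0 = background, 1 = inside a fragment).
    score = [poisson_loglik(counts[0], bg_rate),
             poisson_loglik(counts[0], frag_rate) - open_penalty]
    back = []  # back[i - 1][s]: best previous state for state s at window i
    for i in range(1, n):
        emit = [poisson_loglik(counts[i], bg_rate),
                poisson_loglik(counts[i], frag_rate)]
        new_score, pointers = [], []
        for s in (0, 1):
            # Moving from background into a fragment pays the penalty.
            cand = [score[p] - (open_penalty if (p == 0 and s == 1) else 0.0)
                    for p in (0, 1)]
            best_prev = 0 if cand[0] >= cand[1] else 1
            new_score.append(cand[best_prev] + emit[s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the best path back from the last window.
    state = 0 if score[0] >= score[1] else 1
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    # Collapse runs of fragment-labelled windows into intervals.
    fragments, start = [], None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i
        elif label == 0 and start is not None:
            fragments.append((start, i))
            start = None
    if start is not None:
        fragments.append((start, n))
    return fragments
```

With fixed rates this two-state recursion is equivalent to Viterbi decoding, which is the kind of baseline the abstract compares against; the sketch shows only the general dynamic-programming idea.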
New tool to assemble repetitive regions using next-generation sequencing data

NASA Astrophysics Data System (ADS)

Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz

2017-08-01

Next-generation sequencing techniques produce a large amount of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for existing genome assemblers. We propose a modification of the algorithm for DNA assembly, which uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. The new implementation uses a three-layer software architecture, in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, as well as a demo application with a web interface and the additional data, is available at the project web page: http://dnaasm.sourceforge.net.
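One simple way to see how relative frequency can expose repeats is to count k-mers across reads and compare each count with a single-copy baseline. The Python sketch below does this under loud assumptions: it is not the dnaasm algorithm, the median k-mer count is used as a crude proxy for single-copy coverage, and all names are hypothetical.

```python
from collections import Counter
from statistics import median


def kmer_counts(reads, k):
    """Count every k-mer over all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_copy_numbers(reads, k):
    """Estimate a copy number (>= 1) for each observed k-mer.

    Assumes most k-mers are single-copy, so the median count approximates
    the coverage of one genomic copy.
    """
    counts = kmer_counts(reads, k)
    if not counts:
        return {}
    baseline = median(counts.values())
    return {kmer: max(1, round(n / baseline)) for kmer, n in counts.items()}
```

An assembler could use such estimates to decide how many times a repeated segment should be traversed in its graph; sequencing errors and uneven coverage make real estimates much noisier than this example suggests.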
The Past, Present, and Future of Human Centromere Genomics

PubMed Central

Aldrup-MacDonald, Megan E.; Sullivan, Beth A.

2014-01-01

The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function. PMID:24683489
Single-cell genomic sequencing using Multiple Displacement Amplification.

PubMed

Lasken, Roger S

2007-10-01

Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
DNA as a powerful tool for morphology control, spatial positioning, and dynamic assembly of nanoparticles.

PubMed

Tan, Li Huey; Xing, Hang; Lu, Yi

2014-06-17

CONSPECTUS: Several properties of nanomaterials, such as morphologies (e.g., shapes and surface structures) and distance dependent properties (e.g., plasmonic and quantum confinement effects), make nanomaterials uniquely qualified as potential choices for future applications from catalysis to biomedicine. To realize the full potential of these nanomaterials, it is important to demonstrate fine control of the morphology of individual nanoparticles, as well as precise spatial control of the position, orientation, and distances between multiple nanoparticles. In addition, dynamic control of nanomaterial assembly in response to multiple stimuli, with minimal or no error, and the reversibility of the assemblies are also required. In this Account, we summarize recent progress of using DNA as a powerful programmable tool to realize the above goals. First, inspired by the discovery of genetic codes in biology, we have discovered DNA sequence combinations to control different morphologies of nanoparticles during their growth process and have shown that these effects are synergistic or competitive, depending on the sequence combination. The DNA, which guides the growth of the nanomaterial, is stable and retains its biorecognition ability. Second, by taking advantage of different reactivities of phosphorothioate and phosphodiester backbone, we have placed phosphorothioate at selective positions on different DNA nanostructures including DNA tetrahedrons. Bifunctional linkers have been used to conjugate phosphorothioate on one end and bind nanoparticles or proteins on the other end. In doing so, precise control of distances between two or more nanoparticles or proteins with nanometer resolution can be achieved. 
Furthermore, by developing facile methods to functionalize two hemispheres of Janus nanoparticles with two different DNA sequences regioselectively, we have demonstrated directional control of nanomaterial assembly, where DNA strands with specific hybridization serve as orthogonal linkers. Third, by using functional DNA that includes DNAzyme, aptamer, and aptazyme, dynamic control of assemblies of gold nanoparticles, quantum dots, carbon nanotubes, and iron oxide nanoparticles in cooperative response to one or more stimuli has been achieved, resulting in colorimetric, fluorescent, electrochemical, and magnetic resonance signals for a wide range of targets, such as metal ions, small molecules, proteins, and intact cells. Fourth, by mimicking biology, we have employed DNAzymes as proofreading units to remove errors in nanoparticle assembly and further used DNAzyme cascade reactions to modify or repair DNA sequences involved in the assembly. Finally, by taking advantage of different affinities of biotin and desthiobiotin toward streptavidin, we have demonstrated reversible assembly of proteins on DNA origami.
Design and analysis of linear cascade DNA hybridization chain reactions using DNA hairpins

NASA Astrophysics Data System (ADS)

Bui, Hieu; Garg, Sudhanshu; Miao, Vincent; Song, Tianqi; Mokhtar, Reem; Reif, John

2017-01-01

DNA self-assembly has been employed non-conventionally to construct nanoscale structures and dynamic nanoscale machines. The technique of hybridization chain reactions by triggered self-assembly has been shown to form various interesting nanoscale structures ranging from simple linear DNA oligomers to dendritic DNA structures. Inspired by earlier triggered self-assembly works, we present a system for controlled self-assembly of linear cascade DNA hybridization chain reactions using nine distinct DNA hairpins. NUPACK is employed to assist in designing DNA sequences and Matlab has been used to simulate DNA hairpin interactions. Gel electrophoresis and ensemble fluorescence reaction kinetics data indicate strong evidence of linear cascade DNA hybridization chain reactions. The half-time completion of the proposed linear cascade reactions indicates a linear dependency on the number of hairpins.

Capturing chloroplast variation for molecular ecology studies: a simple next generation sequencing approach applied to a rainforest tree

PubMed Central

2013-01-01

Background: With high quantity and quality data production and low cost, next generation sequencing has the potential to provide new opportunities for plant phylogeographic studies on single and multiple species. Here we present an approach for in silico chloroplast DNA assembly and single nucleotide polymorphism detection from short-read shotgun sequencing. The approach is simple and effective and can be implemented using standard bioinformatic tools. Results: The chloroplast genome of Toona ciliata (Meliaceae), 159,514 base pairs long, was assembled from shotgun sequencing on the Illumina platform using de novo assembly of contigs. To evaluate its practicality, value and quality, we compared the short read assembly with an assembly completed using 454 data obtained after chloroplast DNA isolation. Sanger sequence verifications indicated that the Illumina dataset outperformed the longer read 454 data. Pooling of several individuals during preparation of the shotgun library enabled detection of informative chloroplast SNP markers. Following validation, we used the identified SNPs for a preliminary phylogeographic study of T. ciliata in Australia and to confirm low diversity across the distribution. Conclusions: Our approach provides a simple method for construction of whole chloroplast genomes from shotgun sequencing of whole genomic DNA using short-read data and no available closely related reference genome (e.g. from the same species or genus). The high coverage of Illumina sequence data also renders this method appropriate for multiplexing and SNP discovery and therefore a useful approach for landscape level studies of evolutionary ecology. PMID:23497206
Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

PubMed

Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

2013-01-01

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2 and 71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but will at least generate vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well.
mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

PubMed

Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

2013-08-15

Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.
De novo assembly of human genomes with massively parallel short read sequencing.

PubMed

Li, Ruiqiang; Zhu, Hongmei; Ruan, Jue; Qian, Wubin; Fang, Xiaodong; Shi, Zhongbin; Li, Yingrui; Li, Shengting; Shan, Gao; Kristiansen, Karsten; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

2010-02-01

Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
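The N50 values quoted above summarize assembly contiguity: N50 is the length of the shortest contig (or scaffold) in the smallest set of longest contigs that together cover at least half of the total assembled length. The following is a minimal sketch of that calculation. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2             # half of the assembled length
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least half
        # of the total length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (in kb), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))
```

With these made-up lengths the total is 300, and the two longest contigs (80 and 70) already cover 150, so the reported value is 70.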
Building block synthesis using the polymerase chain assembly method.

PubMed

Marchand, Julie A; Peccoud, Jean

2012-01-01

De novo gene synthesis allows the creation of custom DNA molecules without the typical constraints of traditional cloning assembly: scars, restriction site incompatibility, and the quest to find all the desired parts to name a few. Moreover, with the help of computer-assisted design, the perfect DNA molecule can be created along with its matching sequence ready to download. The challenge is to build the physical DNA molecules that have been designed with the software. Although there are several DNA assembly methods, this section presents and describes a method using the polymerase chain assembly (PCA).
Theory and modeling of particles with DNA-mediated interactions

NASA Astrophysics Data System (ADS)

Licata, Nicholas A.

2008-05-01

In recent years significant attention has been attracted to proposals which utilize DNA for nanotechnological applications. Potential applications of these ideas range from the programmable self-assembly of colloidal crystals, to biosensors and nanoparticle based drug delivery platforms. In Chapter I we introduce the system, which generically consists of colloidal particles functionalized with specially designed DNA markers. The sequence of bases on the DNA markers determines the particle type. Due to the hybridization between complementary single-stranded DNA, specific, type-dependent interactions can be introduced between particles by choosing the appropriate DNA marker sequences. In Chapter II we develop a statistical mechanical description of the aggregation and melting behavior of particles with DNA-mediated interactions. In Chapter III a model is proposed to describe the dynamical departure and diffusion of particles which form reversible key-lock connections. In Chapter IV we propose a method to self-assemble nanoparticle clusters using DNA scaffolds. A natural extension is discussed in Chapter V, the programmable self-assembly of nanoparticle clusters where the desired cluster geometry is encoded using DNA-mediated interactions. In Chapter VI we consider a nanoparticle based drug delivery platform for targeted, cell specific chemotherapy. In Chapter VII we present prospects for future research: the connection between DNA-mediated colloidal crystallization and jamming, and the inverse problem in self-assembly.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions.

PubMed

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize; Zhao, Yun; Zhao, Hai

2017-01-01

Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela , Landoltia , Lemna , Wolffiella , and Wolffia . This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions

PubMed Central

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize

2017-01-01

Background: Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. Methods: DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Results: Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. Discussion: This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds. PMID:29302399
De novo assembly of mitochondrial genomes provides insights into genetic diversity and molecular evolution in wild boars and domestic pigs.

PubMed

Ni, Pan; Bhuiyan, Ali Akbar; Chen, Jian-Hai; Li, Jingjin; Zhang, Cheng; Zhao, Shuhong; Du, Xiaoyong; Li, Hua; Yu, Hui; Liu, Xiangdong; Li, Kui

2018-06-01

To date, the scarcity of publicly available complete mitochondrial sequences for European wild pigs has hampered a deeper understanding of the genetic changes following domestication. Here, we have assembled 26 de novo mtDNA sequences of European wild boars from next generation sequencing (NGS) data and downloaded 174 complete mtDNA sequences to assess the genetic relationship, nucleotide diversity, and selection. The Bayesian consensus tree reveals a clear divergence between the European and Asian clades and a very small portion (10 out of 200 samples) of maternal introgression. The overall nucleotide diversity of the mtDNA sequences has been reduced following domestication. Interestingly, the selection efficiencies in both European and Asian domestic pigs are reduced, probably caused by changes in both selection constraints and maternal population size following domestication. This study suggests that de novo assembled mitogenomes can be a great boon to uncovering the genetic turnover following domestication. Further investigation is warranted to include more samples from the ever-increasing amounts of NGS data to help us better understand the process of domestication.
Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

PubMed

Olson, Nathan D; Treangen, Todd J; Hill, Christopher M; Cepeda-Espinoza, Victoria; Ghurye, Jay; Koren, Sergey; Pop, Mihai

2017-08-07

Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.
An efficient approach to BAC based assembly of complex genomes.

PubMed

Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David

2016-01-01

There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data, which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, with variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.
Cloning Should Be Simple: Escherichia coli DH5α-Mediated Assembly of Multiple DNA Fragments with Short End Homologies

PubMed Central

Richardson, Ruth E.; Suzuki, Yo

2015-01-01

Numerous DNA assembly technologies exist for generating plasmids for biological studies. Many procedures require complex in vitro or in vivo assembly reactions followed by plasmid propagation in recombination-impaired Escherichia coli strains such as DH5α, which are optimal for stable amplification of the DNA materials. Here we show that despite its utility as a cloning strain, DH5α retains sufficient recombinase activity to assemble up to six double-stranded DNA fragments ranging in size from 150 bp to at least 7 kb into plasmids in vivo. This process also requires surprisingly small amounts of DNA, potentially obviating the need for upstream assembly processes associated with most common applications of DNA assembly. We demonstrate the application of this process in cloning of various DNA fragments including synthetic genes, preparation of knockout constructs, and incorporation of guide RNA sequences in constructs for clustered regularly interspaced short palindromic repeats (CRISPR) genome editing. This consolidated process for assembly and amplification in a widely available strain of E. coli may enable productivity gain across disciplines involving recombinant DNA work. PMID:26348330
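The assembly described above relies on short homologies shared between the ends of adjacent fragments. As an in-silico illustration of the product one would expect from such a design, the sketch below joins an ordered list of fragments by merging each terminal overlap once. It does not model the in vivo recombination itself. The function name `join_by_end_homology`, the `min_overlap` threshold and the toy sequences are hypothetical and are not taken from the paper.

```python
def join_by_end_homology(fragments, min_overlap=4):
    """Join ordered fragments whose adjacent ends share an exact overlap.

    min_overlap is an illustrative threshold, not a value from the paper.
    """
    if not fragments:
        raise ValueError("need at least one fragment")
    assembled = fragments[0]
    for nxt in fragments[1:]:
        max_len = min(len(assembled), len(nxt))
        # Try the longest possible overlap first, then shorter ones.
        for k in range(max_len, min_overlap - 1, -1):
            if assembled.endswith(nxt[:k]):
                assembled += nxt[k:]   # keep the shared region only once
                break
        else:
            raise ValueError(f"no end homology of >= {min_overlap} bp before {nxt!r}")
    return assembled


# Toy fragments with 4-bp end homologies (hypothetical sequences).
parts = ["ATGCCGTA", "CGTAGGCT", "GGCTTTAA"]
print(join_by_end_homology(parts))
```

Only exact matches are considered here; a real design check would also need to handle circular products and confirm that each homology is unique among the fragments.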
Highly-sensitive microRNA detection based on bio-bar-code assay and catalytic hairpin assembly two-stage amplification.

PubMed

Tang, Songsong; Gu, Yuan; Lu, Huiting; Dong, Haifeng; Zhang, Kai; Dai, Wenhao; Meng, Xiangdan; Yang, Fan; Zhang, Xueji

2018-04-03

Herein, a highly-sensitive microRNA (miRNA) detection strategy was developed by combining bio-bar-code assay (BBA) with catalytic hairpin assembly (CHA). In the proposed system, two nanoprobes of magnetic nanoparticles functionalized with DNA probes (MNPs-DNA) and gold nanoparticles with numerous barcode DNA (AuNPs-DNA) were designed. In the presence of target miRNA, the MNP-DNA and AuNP-DNA hybridized with target miRNA to form a "sandwich" structure. After "sandwich" structures were separated from the solution by the magnetic field and dehybridized by high temperature, the barcode DNA sequences were released by dissolving AuNPs. The released barcode DNA sequences triggered the toehold strand displacement assembly of two hairpin probes, leading to recycling of barcode DNA sequences and producing numerous fluorescent CHA products for miRNA detection. Under the optimal experimental conditions, the proposed two-stage amplification system could sensitively detect target miRNA ranging from 10 pM to 10 aM with a limit of detection (LOD) down to 97.9 zM. It displayed good capability to discriminate single-base and three-base mismatches due to the unique sandwich structure. Notably, it presented good feasibility for selective multiplexed detection of various combinations of synthetic miRNA sequences and miRNAs extracted from different cell lysates, which were in agreement with the traditional polymerase chain reaction analysis. The two-stage amplification strategy may have significant implications for biological detection and clinical diagnosis. Copyright © 2017 Elsevier B.V. All rights reserved.
DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila

PubMed Central

Liu, Jun; Zimmer, Kurt; Rusch, Douglas B.; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R.

2015-01-01

Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC. PMID:26227968
BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

PubMed

Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

2016-07-01

The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
DNA nanostructures: Through, rather than across

NASA Astrophysics Data System (ADS)

Bruchez, Marcel P.

2018-02-01

Dye molecules are shown to assemble into J-aggregate arrays by sequence-specific organization in the minor groove of DNA duplex sequences. Energy transfer through these structures displays the hallmarks of coherent coupling over distances that exceed those of conventional dipole-coupling processes.
A Hybrid Parallel Strategy Based on String Graph Theory to Improve De Novo DNA Assembly on the TianHe-2 Supercomputer.

PubMed

Zhang, Feng; Liao, Xiangke; Peng, Shaoliang; Cui, Yingbo; Wang, Bingqiang; Zhu, Xiaoqian; Liu, Jie

2016-06-01

The de novo assembly of DNA sequences is increasingly important for biological research in the genomic era. More than a decade after the Human Genome Project, some challenges still exist and new solutions are being explored to improve de novo assembly of genomes. String graph assembler (SGA), based on the string graph theory, is a new method/tool developed to address the challenges. In this paper, based on an in-depth analysis of SGA, we prove that the SGA-based sequence de novo assembly is an NP-complete problem. According to our analysis, SGA outperforms other similar methods/tools in memory consumption, but costs much more time, of which 60-70% is spent on the index construction. Upon this analysis, we introduce a hybrid parallel optimization algorithm and implement this algorithm in the TianHe-2's parallel framework. Simulations are performed with different datasets. For data of small size the optimized solution is 3.06 times faster than before, and for data of middle size it is 1.60 times faster. The results demonstrate an evident performance improvement, with linear scalability for parallel FM-index construction. These results thus contribute significantly to improving the efficiency of de novo assembly of DNA sequences.
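The index construction mentioned above refers to the FM-index, a compressed full-text index built on the Burrows-Wheeler transform (BWT) that supports counting the occurrences of a pattern by backward search. The sketch below is a deliberately naive, single-threaded illustration of the data structure. It is not SGA's construction algorithm and not the parallel scheme from the paper. It sorts suffixes directly, which is only practical for toy inputs, and all function names are assumptions made for this example.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (C table and occurrence table) for text."""
    text += "$"                                   # unique terminator, sorts first
    # Naive suffix array: sort suffix start positions by the suffix itself.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' at i == 0).
    bwt = "".join(text[i - 1] for i in suffix_array)

    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):                     # C[ch] = number of smaller characters
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i] = number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return c_table, occ, len(bwt)


def count_occurrences(pattern, index):
    """Count exact occurrences of pattern using FM-index backward search."""
    c_table, occ, n = index
    lo, hi = 0, n                                 # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("GATTACAGATTACA")          # hypothetical toy sequence
print(count_occurrences("GATTACA", index))
print(count_occurrences("TTA", index))
```

Each backward-search step touches only the two tables, so query time depends on the pattern length rather than the text length. Building those tables is the expensive part, which is consistent with the abstract's observation that index construction dominates the running time.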
Sequence verification of synthetic DNA by assembly of sequencing reads

PubMed Central

Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean

2013-01-01

Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248
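The second step described above, comparing an assembled contig against its reference sequence, can be illustrated with a small sketch. This is not GenoREAD's implementation. It uses Python's standard-library `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner, and the function name `differences_from_reference` and the toy sequences are assumptions made for this example.

```python
from difflib import SequenceMatcher


def differences_from_reference(contig, reference):
    """List the non-matching regions between a reference and a contig.

    Each entry is (operation, ref_start, ref_end, ref_bases, contig_bases),
    where operation is 'replace', 'delete' or 'insert' relative to the reference.
    """
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent elements, which matters for a four-letter alphabet.
    matcher = SequenceMatcher(None, reference, contig, autojunk=False)
    return [
        (tag, i1, i2, reference[i1:i2], contig[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sequences: the contig carries one substitution.
reference = "ATGGCCATTGTAATGGGCCGC"
contig = "ATGGCCATTGAAATGGGCCGC"

diffs = differences_from_reference(contig, reference)
print("clone matches reference" if not diffs else diffs)
```

An empty list means the contig is identical to the reference. For real data a dedicated pairwise aligner would be the better choice, since `difflib` is a general-purpose text-comparison tool and knows nothing about base quality or strand orientation.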
A Hybrid Approach for the Automated Finishing of Bacterial Genomes

PubMed Central

Robins, William P.; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L.; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J.; Waldor, Matthew K.; Schadt, Eric E.

2013-01-01

Dramatic improvements in DNA sequencing technology have revolutionized our ability to characterize most genomic diversity. However, accurate resolution of large structural events has remained challenging due to the comparatively shorter read lengths of second-generation technologies. Emerging third-generation sequencing technologies, which yield markedly increased read length on rapid time scales and for low cost, have the potential to address assembly limitations. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at > 99.9% accuracy. Complex regions with clinically significant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 reference we obtain 14 and 8 scaffolds greater than 1kb, respectively, correcting several errors in the underlying source data. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly. PMID:22750883
Single Molecule Visualization of Protein-DNA Complexes: Watching Machines at Work

NASA Astrophysics Data System (ADS)

Kowalczykowski, Stephen

2013-03-01

We can now watch individual proteins acting on single molecules of DNA. Such imaging provides unprecedented interrogation of fundamental biophysical processes. Visualization is achieved through the application of two complementary procedures. In one, single DNA molecules are attached to a polystyrene bead and are then captured by an optical trap. The DNA, a worm-like coil, is extended either by the force of solution flow in a micro-fabricated channel, or by capturing the opposite DNA end in a second optical trap. In the second procedure, DNA is attached by one end to a glass surface. The coiled DNA is elongated either by continuous solution flow or by subsequently tethering the opposite end to the surface. Protein action is visualized by fluorescent reporters: fluorescent dyes that bind double-stranded DNA (dsDNA), fluorescent biosensors for single-stranded DNA (ssDNA), or fluorescently-tagged proteins. Individual molecules are imaged using either epifluorescence microscopy or total internal reflection fluorescence (TIRF) microscopy. Using these approaches, we imaged the search for DNA sequence homology conducted by the RecA-ssDNA filament. The manner by which RecA protein finds a single homologous sequence in the genome had remained undefined for almost 30 years. Single-molecule imaging revealed that the search occurs through a mechanism termed "intersegmental contact sampling," in which the randomly coiled structure of DNA is essential for reiterative sampling of DNA sequence identity: an example of parallel processing. In addition, the assembly of RecA filaments on single molecules of single-stranded DNA was visualized. Filament assembly requires nucleation of a protein dimer on DNA, and subsequent growth occurs via monomer addition. Furthermore, we discovered a class of proteins that catalyzed both nucleation and growth of filaments, revealing how the cell controls assembly of this protein-DNA complex.

Xenopus origin recognition complex (ORC) initiates DNA replication preferentially at sequences targeted by Schizosaccharomyces pombe ORC

PubMed Central

Kong, Daochun; Coleman, Thomas R.; DePamphilis, Melvin L.

2003-01-01

Budding yeast (Saccharomyces cerevisiae) origin recognition complex (ORC) requires ATP to bind specific DNA sequences, whereas fission yeast (Schizosaccharomyces pombe) ORC binds to specific, asymmetric A:T-rich sites within replication origins, independently of ATP, and frog (Xenopus laevis) ORC seems to bind DNA non-specifically. Here we show that despite these differences, ORCs are functionally conserved. Firstly, SpOrc1, SpOrc4 and SpOrc5, like those from other eukaryotes, bound ATP and exhibited ATPase activity, suggesting that ATP is required for pre-replication complex (pre-RC) assembly rather than origin specificity. Secondly, SpOrc4, which is solely responsible for binding SpORC to DNA, inhibited up to 70% of XlORC-dependent DNA replication in Xenopus egg extract by preventing XlORC from binding to chromatin and assembling pre-RCs. Chromatin-bound SpOrc4 was located at AT-rich sequences. XlORC in egg extract bound preferentially to asymmetric A:T-sequences in either bare DNA or in sperm chromatin, and it recruited XlCdc6 and XlMcm proteins to these sequences. These results reveal that XlORC initiates DNA replication preferentially at the same or similar sites to those targeted in S.pombe. PMID:12840006
Integrating DNA strand displacement circuitry to the nonlinear hybridization chain reaction.

PubMed

Zhang, Zhuo; Fan, Tsz Wing; Hsing, I-Ming

2017-02-23

Programmable and modular attributes of DNA molecules allow one to develop versatile sensing platforms that can be operated isothermally and enzyme-free. In this work, we present an approach to integrate upstream DNA strand displacement circuits that can be turned on by a sequence-specific microRNA analyte with a downstream nonlinear hybridization chain reaction for a cascading hyperbranched nucleic acid assembly. This system provides a two-step amplification strategy for highly sensitive detection of the miRNA analyte, conducive for multiplexed detection. Multiple miRNA analytes were tested with our integrated circuitry using the same downstream signal amplification setting, showing the decoupling of nonlinear self-assembly with the analyte sequence. Compared with the reported methods, our signal amplification approach provides an additional control module for higher-order DNA self-assembly and could be developed into a promising platform for the detection of critical nucleic-acid based biomarkers.
Purification of High Molecular Weight Genomic DNA from Powdery Mildew for Long-Read Sequencing.

PubMed

Feehan, Joanna M; Scheibel, Katherine E; Bourras, Salim; Underwood, William; Keller, Beat; Somerville, Shauna C

2017-03-31

The powdery mildew fungi are a group of economically important fungal plant pathogens. Relatively little is known about the molecular biology and genetics of these pathogens, in part due to a lack of well-developed genetic and genomic resources. These organisms have large, repetitive genomes, which have made genome sequencing and assembly prohibitively difficult. Here, we describe methods for the collection, extraction, purification and quality control assessment of high molecular weight genomic DNA from one powdery mildew species, Golovinomyces cichoracearum. The protocol described includes mechanical disruption of spores followed by an optimized phenol/chloroform genomic DNA extraction. A typical yield was 7 µg DNA per 150 mg conidia. The genomic DNA that is isolated using this procedure is suitable for long-read sequencing (i.e., > 48.5 kbp). Quality control measures to ensure the size, yield, and purity of the genomic DNA are also described in this method. Sequencing of the genomic DNA of the quality described here will allow for the assembly and comparison of multiple powdery mildew genomes, which in turn will lead to a better understanding and improved control of this agricultural pathogen.
Surface-assisted DNA self-assembly: An enzyme-free strategy towards formation of branched DNA lattice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhanjadeo, Madhabi M.; Academy of Scientific and Innovative Research; Nayak, Ashok K.

DNA-based self-assembled nanostructures and DNA origami have proven useful for organizing nanomaterials with firm precision. However, for advanced applications like nanoelectronics and photonics, large-scale organization of self-assembled branched DNA (bDNA) into periodic lattices is desired. In this communication, we report for the first time a facile method of self-assembly of Y-shaped bDNA nanostructures on the cationic surface of aluminum (Al) foil to prepare a periodic two-dimensional (2D) bDNA lattice. In particular, Y-shaped bDNA structures that have smaller overhangs and are unable to self-assemble in solution are easily assembled on the surface of Al foil in the absence of ligase. Field emission scanning electron microscopy (FESEM) analysis shows homogeneous distribution of two-dimensional bDNA lattices across the Al foil. When the assembled bDNA structures were recovered from the Al foil and electrophoresed in nPAGE, only higher-order polymeric bDNA structures were observed, without a trace of monomeric structures, which confirms the stability and high yield of the bDNA lattices. Therefore, this enzyme-free, economical and efficient strategy for developing bDNA lattices can be utilized in assembling various nanomaterials for functional molecular components towards development of DNA-based self-assembled nanodevices. - Highlights: • Al foil surface-assisted self-assembly of monomeric structures into larger branched DNA lattice. • FESEM study confirms the uniform distribution of two-dimensional bDNA lattice structures across the surface of Al foil. • Enzyme-free and economical strategy to prepare higher-order structures from simpler DNA nanostructures has been confirmed by recovery assay. • Use of well-proven sequences for the preparation of pure Y-shaped monomeric DNA nanostructure with high yield.
Facile Site-Directed Mutagenesis of Large Constructs Using Gibson Isothermal DNA Assembly.

PubMed

Yonemoto, Isaac T; Weyman, Philip D

2017-01-01

Site-directed mutagenesis is a commonly used molecular biology technique to manipulate biological sequences, and is especially useful for studying sequence determinants of enzyme function or designing proteins with improved activity. We describe a strategy using Gibson Isothermal DNA Assembly to perform site-directed mutagenesis on large (>~20 kbp) constructs that are outside the effective range of standard techniques such as QuikChange II (Agilent Technologies), but more reliable than traditional cloning using restriction enzymes and ligation.
Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

PubMed

Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T

2012-01-01

Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether different NGS platforms recover the same diversity from a sample, and whether their assembled sequences are of comparable quality, remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology. PMID:20386741
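The N50 values quoted in this record (117 kb, rising to 498 kb) summarize how contiguous an assembly is. The following is a minimal sketch, assuming the common definition of N50: the length of the contig at which the running sum of contig lengths, sorted longest first, reaches half of the total assembly size. The function name `n50` and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Toy assembly: total length 300, so the threshold is 150.
example = [80, 70, 50, 40, 30, 20, 10]
print(n50(example))  # 80 + 70 = 150 reaches the threshold, so this prints 70
```

Merging contigs (for example with the help of a related genome, as described above) removes short entries and adds longer ones, which is why the N50 rises even though the total assembly size stays roughly constant.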
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.
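As a rough consistency check on the sequencing yield quoted in this record, fold coverage can be treated as total sequenced bases divided by genome size. The sketch below assumes a 40 Mb genome (the size of the draft assembly) and uses the two coverage figures from the abstract; the function name `bases_for_coverage` is hypothetical.

```python
GENOME_SIZE_BP = 40_000_000  # assumed: size of the 40 Mb draft assembly


def bases_for_coverage(genome_size_bp, fold_coverage):
    """Total bases needed to cover a genome at the given fold coverage."""
    return genome_size_bp * fold_coverage


solexa_bases = bases_for_coverage(GENOME_SIZE_BP, 85)  # paired-end Solexa reads
r454_bases = bases_for_coverage(GENOME_SIZE_BP, 10)    # single-end 454 reads
total_bases = solexa_bases + r454_bases

print(f"{total_bases / 1e9:.1f} Gb")  # prints "3.8 Gb"
```

Under these assumptions the total is 3.8 Gb, which agrees with the approximately 4 Gb of sequence reported.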
The Centromere: Chromatin Foundation for the Kinetochore Machinery

PubMed Central

Fukagawa, Tatsuo; Earnshaw, William C.

2014-01-01

Since discovery of the centromere-specific histone H3 variant CENP-A, centromeres have come to be defined as chromatin structures that establish the assembly site for the complex kinetochore machinery. In most organisms, centromere activity is defined epigenetically, rather than by specific DNA sequences. In this review, we describe selected classic work and recent progress in studies of centromeric chromatin with a focus on vertebrates. We consider possible roles for repetitive DNA sequences found at most centromeres, chromatin factors and modifications that assemble and activate CENP-A chromatin for kinetochore assembly, plus the use of artificial chromosomes and kinetochores to study centromere function. PMID:25203206
Proton-Fueled, Reversible DNA Hybridization Chain Assembly for pH Sensing and Imaging.

PubMed

Liu, Lan; Liu, Jin-Wen; Huang, Zhi-Mei; Wu, Han; Li, Na; Tang, Li-Juan; Jiang, Jian-Hui

2017-07-05

Design of DNA self-assembly with reversible responsiveness to external stimuli is of great interest for diverse applications. We for the first time develop a pH-responsive, fully reversible hybridization chain reaction (HCR) assembly that allows sensitive sensing and imaging of pH in living cells. Our design relies on the triplex forming sequences that form DNA triplex with toehold regions under acidic conditions and then induce a cascade of strand displacement and DNA assembly. The HCR assembly has shown dynamic responses in physiological pH ranges with excellent reversibility and demonstrated the potential for in vitro detection and live-cell imaging of pH. Moreover, this method affords HCR assemblies with highly localized fluorescence responses, offering advantages of improving sensitivity and better selectivity. The proton-fueled, reversible HCR assembly may provide a useful approach for pH-related cell biology study and disease diagnostics.
Reducing assembly complexity of microbial genomes with single-molecule sequencing.

PubMed

Koren, Sergey; Harhay, Gregory P; Smith, Timothy P L; Bono, James L; Harhay, Dayna M; Mcvey, Scott D; Radune, Diana; Bergman, Nicholas H; Phillippy, Adam M

2013-01-01

The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem. To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads. Automated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.
Cloning should be simple: Escherichia coli DH5α-mediated assembly of multiple DNA fragments with short end homologies

DOE PAGES

Kostylev, Maxim; Otwell, Anne E.; Richardson, Ruth E.; ...

2015-09-08

Numerous DNA assembly technologies exist for generating plasmids for biological studies. Many procedures require complex in vitro or in vivo assembly reactions followed by plasmid propagation in recombination-impaired Escherichia coli strains such as DH5α, which are optimal for stable amplification of the DNA materials. Here we show that despite its utility as a cloning strain, DH5α retains sufficient recombinase activity to assemble up to six double-stranded DNA fragments ranging in size from 150 bp to at least 7 kb into plasmids in vivo. This process also requires surprisingly small amounts of DNA, potentially obviating the need for upstream assembly processes associated with most common applications of DNA assembly. In addition, we demonstrate the application of this process in cloning of various DNA fragments including synthetic genes, preparation of knockout constructs, and incorporation of guide RNA sequences in constructs for clustered regularly interspaced short palindromic repeats (CRISPR) genome editing. This consolidated process for assembly and amplification in a widely available strain of E. coli may enable productivity gain across disciplines involving recombinant DNA work.
Cloning should be simple: Escherichia coli DH5α-mediated assembly of multiple DNA fragments with short end homologies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kostylev, Maxim; Otwell, Anne E.; Richardson, Ruth E.

Numerous DNA assembly technologies exist for generating plasmids for biological studies. Many procedures require complex in vitro or in vivo assembly reactions followed by plasmid propagation in recombination-impaired Escherichia coli strains such as DH5α, which are optimal for stable amplification of the DNA materials. Here we show that despite its utility as a cloning strain, DH5α retains sufficient recombinase activity to assemble up to six double-stranded DNA fragments ranging in size from 150 bp to at least 7 kb into plasmids in vivo. This process also requires surprisingly small amounts of DNA, potentially obviating the need for upstream assembly processes associated with most common applications of DNA assembly. In addition, we demonstrate the application of this process in cloning of various DNA fragments including synthetic genes, preparation of knockout constructs, and incorporation of guide RNA sequences in constructs for clustered regularly interspaced short palindromic repeats (CRISPR) genome editing. This consolidated process for assembly and amplification in a widely available strain of E. coli may enable productivity gain across disciplines involving recombinant DNA work.
Long-range barcode labeling-sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Feng; Zhang, Tao; Singh, Kanwar K.

Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.
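The binning step mentioned above can be illustrated with a small sketch. It assumes, purely for illustration, that each read begins with a fixed-length barcode identifying the template molecule it came from; the record does not specify the barcode layout, and the function name `bin_by_barcode` is hypothetical.

```python
from collections import defaultdict


def bin_by_barcode(reads, barcode_len):
    """Group reads by their leading barcode and strip the barcode off.

    Assumption: the first `barcode_len` bases of every read are the barcode.
    Reads too short to carry any insert sequence are skipped.
    """
    bins = defaultdict(list)
    for read in reads:
        if len(read) <= barcode_len:
            continue
        barcode, insert = read[:barcode_len], read[barcode_len:]
        bins[barcode].append(insert)
    return dict(bins)


reads = ["AAGTACCGT", "CCTAGGTCA", "AAGTTTGCA", "CCT"]
for barcode, inserts in bin_by_barcode(reads, 3).items():
    print(barcode, inserts)
```

Each bin can then be assembled on its own, so reads from different template molecules (and therefore different haplotypes or organisms) are not mixed during assembly.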
Application of long sequence reads to improve genomes for Clostridium thermocellum AD2, Clostridium thermocellum LQRI, and Pelosinus fermentans R7

DOE PAGES

Utturkar, Sagar M.; Bayer, Edward A.; Borovok, Ilya; ...

2016-09-29

Here, we and others have shown the utility of long sequence reads to improve genome assembly quality. In this study, we generated PacBio DNA sequence data to improve the assemblies of draft genomes for Clostridium thermocellum AD2, Clostridium thermocellum LQRI, and Pelosinus fermentans R7.
MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

PubMed

Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

2014-01-01

A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA. Copyright © 2014 Elsevier Inc. All rights reserved.
DNA microdevice for electrochemical detection of Escherichia coli 0157:H7 molecular markers.

PubMed

Berganza, J; Olabarria, G; García, R; Verdoy, D; Rebollo, A; Arana, S

2007-04-15

An electrochemical DNA sensor based on the hybridization recognition of a single-stranded DNA (ssDNA) probe immobilized onto a gold electrode to its complementary ssDNA is presented. The DNA probe is bound on gold surface electrode by using self-assembled monolayer (SAM) technology. An optimized mixed SAM with a blocking molecule preventing the nonspecific adsorption on the electrode surface has been prepared. In this paper, a DNA biosensor is designed by means of the immobilization of a single stranded DNA probe on an electrochemical transducer surface to recognize specifically Escherichia coli (E. coli) 0157:H7 complementary target DNA sequence via cyclic voltammetry experiments. The 21 mer DNA probe including a C6 alkanethiol group at the 5' phosphate end has been synthesized to form the SAM onto the gold surface through the gold sulfur bond. The goal of this paper has been to design, characterise and optimise an electrochemical DNA sensor. In order to investigate the oligonucleotide probe immobilization and the hybridization detection, experiments with different concentration of DNA and mismatch sequences have been performed. This microdevice has demonstrated the suitability of oligonucleotide Self-assembled monolayers (SAMs) on gold as immobilization method. The DNA probes deposited on gold surface have been functional and able to detect changes in bases sequence in a 21-mer oligonucleotide.
Rational Design of High-Number dsDNA Fragments Based on Thermodynamics for the Construction of Full-Length Genes in a Single Reaction.

PubMed

Birla, Bhagyashree S; Chou, Hui-Hsien

2015-01-01

Gene synthesis is frequently used in modern molecular biology research either to create novel genes or to obtain natural genes when the synthesis approach is more flexible and reliable than cloning. DNA chemical synthesis has limits on both its length and yield, thus full-length genes have to be hierarchically constructed from synthesized DNA fragments. Gibson Assembly and its derivatives are the simplest methods to assemble multiple double-stranded DNA fragments. Currently, up to 12 dsDNA fragments can be assembled at once with Gibson Assembly according to its vendor. In practice, the number of dsDNA fragments that can be assembled in a single reaction is much lower. We have developed a rational design method for gene construction that allows high-number dsDNA fragments to be assembled into full-length genes in a single reaction. Using this new design method and a modified version of the Gibson Assembly protocol, we have assembled 3 different genes from up to 45 dsDNA fragments at once. Our design method uses the thermodynamic analysis software Picky that identifies all unique junctions in a gene where consecutive DNA fragments are specifically made to connect to each other. Our novel method is generally applicable to most gene sequences, and can improve both the efficiency and cost of gene assembly.
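The idea of requiring unique junctions can be sketched with a much simpler stand-in for the thermodynamic analysis described above. The code below is not Picky and performs no thermodynamic calculation; it only flags pairs of junction overlaps that are identical or exact reverse complements of each other, which is the most obvious way two junctions could be confused. All function names, the overlap placement (centered on each cut point), and the toy sequence are assumptions for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def junction_overlaps(gene, cut_points, overlap_len):
    """Extract the overlap sequence centered on each cut point."""
    half = overlap_len // 2
    return [gene[c - half:c - half + overlap_len] for c in cut_points]


def ambiguous_junctions(overlaps):
    """Return index pairs whose overlaps are identical or reverse-complementary."""
    clashes = []
    for i in range(len(overlaps)):
        for j in range(i + 1, len(overlaps)):
            if overlaps[j] in (overlaps[i], reverse_complement(overlaps[i])):
                clashes.append((i, j))
    return clashes


gene = "AAAACCCCGGGGTTTT"
overlaps = junction_overlaps(gene, cut_points=[4, 12], overlap_len=4)
print(overlaps)                       # ['AACC', 'GGTT']
print(ambiguous_junctions(overlaps))  # [(0, 1)]: GGTT is the reverse complement of AACC
```

A realistic design tool would also have to score partial matches and melting temperatures, which is where a thermodynamic model becomes necessary.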
Studies of G-quadruplexes formed within self-assembled DNA mini-circles.

PubMed

Klejevskaja, Beata; Pyne, Alice L B; Reynolds, Matthew; Shivalingam, Arun; Thorogate, Richard; Hoogenboom, Bart W; Ying, Liming; Vilar, Ramon

2016-10-13

We have developed self-assembled DNA mini-circles that contain a G-quadruplex-forming sequence from the c-Myc oncogene promoter and demonstrate by FRET that the G-quadruplex unfolding kinetics are 10-fold slower than for the simpler 24-mer G-quadruplex that is commonly used for FRET experiments.
Ultraaccurate genome sequencing and haplotyping of single human cells.

PubMed

Chu, Wai Keung; Edge, Peter; Lee, Ho Suk; Bansal, Vikas; Bafna, Vineet; Huang, Xiaohua; Zhang, Kun

2017-11-21

Accurate detection of variants and long-range haplotypes in genomes of single human cells remains very challenging. Common approaches require extensive in vitro amplification of genomes of individual cells using DNA polymerases and high-throughput short-read DNA sequencing. These approaches have two notable drawbacks. First, polymerase replication errors could generate tens of thousands of false-positive calls per genome. Second, relatively short sequence reads contain little to no haplotype information. Here we report a method, which is dubbed SISSOR (single-stranded sequencing using microfluidic reactors), for accurate single-cell genome sequencing and haplotyping. A microfluidic processor is used to separate the Watson and Crick strands of the double-stranded chromosomal DNA in a single cell and to randomly partition megabase-size DNA strands into multiple nanoliter compartments for amplification and construction of barcoded libraries for sequencing. The separation and partitioning of large single-stranded DNA fragments of the homologous chromosome pairs allows for the independent sequencing of each of the complementary and homologous strands. This enables the assembly of long haplotypes and reduction of sequence errors by using the redundant sequence information and haplotype-based error removal. We demonstrated the ability to sequence single-cell genomes with error rates as low as 10⁻⁸ and average 500-kb-long DNA fragments that can be assembled into haplotype contigs with N50 greater than 7 Mb. The performance could be further improved with more uniform amplification and more accurate sequence alignment. The ability to obtain accurate genome sequences and haplotype information from single cells will enable applications of genome sequencing for diverse clinical needs. Copyright © 2017 the Author(s). Published by PNAS.

Analytical Devices Based on Direct Synthesis of DNA on Paper.

PubMed

Glavan, Ana C; Niu, Jia; Chen, Zhen; Güder, Firat; Cheng, Chao-Min; Liu, David; Whitesides, George M

2016-01-05

This paper addresses a growing need in clinical diagnostics for parallel, multiplex analysis of biomarkers from small biological samples. It describes a new procedure for assembling arrays of ssDNA and proteins on paper. This method starts with the synthesis of DNA oligonucleotides covalently linked to paper and proceeds to assemble microzones of DNA-conjugated paper into arrays capable of simultaneously capturing DNA, DNA-conjugated protein antigens, and DNA-conjugated antibodies. The synthesis of ssDNA oligonucleotides on paper is convenient and effective with 32% of the oligonucleotides cleaved and eluted from the paper substrate being full-length by HPLC for a 32-mer. These ssDNA arrays can be used to detect fluorophore-linked DNA oligonucleotides in solution, and as the basis for DNA-directed assembly of arrays of DNA-conjugated capture antibodies on paper, detect protein antigens by sandwich ELISAs. Paper-anchored ssDNA arrays with different sequences can be used to assemble paper-based devices capable of detecting DNA and antibodies in the same device and enable simple microfluidic paper-based devices.
An easy-to-prepare mini-scaffold for DNA origami

NASA Astrophysics Data System (ADS)

Brown, S.; Majikes, J.; Martínez, A.; Girón, T. M.; Fennell, H.; Samano, E. C.; Labean, T. H.

2015-10-01

The DNA origami strategy for assembling designed supramolecular complexes requires ssDNA as a scaffold strand. A system is described that was designed approximately one third the length of the M13 bacteriophage genome for ease of ssDNA production. Folding of the 2404-base ssDNA scaffold into a variety of origami shapes with high assembly yields is demonstrated. Electronic supplementary information (ESI) available: Flow chart of the production process, base sequences of the scaffold strand, and synthetic staple strands, as well as caDNAno files for all three mini-M13 origami structures. See DOI: 10.1039/c5nr04921k
SynTrack: DNA Assembly Workflow Management (SynTrack) v2.0.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

MENG, XIANWEI; SIMIRENKO, LISA

2016-12-01

SynTrack is a dynamic, workflow-driven data management system that tracks the DNA build process: Management of the hierarchical relationships of the DNA fragments; Monitoring of process tasks for the assembly of multiple DNA fragments into final constructs; Creation of vendor order forms with selectable building blocks; Organizing plate layouts barcodes for vendor/pcr/fusion/chewback/bioassay/glycerol/master plate maps (default/condensed); Creating or updating Pre-Assembly/Assembly process workflows with selected building blocks; Generating Echo pooling instructions based on plate maps; Tracking of building block orders, received and final assembled for delivering; Bulk updating of colony or PCR amplification information, fusion PCR and chewback results; Updating with QA/QC outcome with .csv & .xlsx template files; Re-work assembly workflow enabled before and after sequencing validation; and Tracking of plate/well data changes and status updates and reporting of master plate status with QC outcomes.
Functionalization of quantum rods with oligonucleotides for programmable assembly with DNA origami

NASA Astrophysics Data System (ADS)

Doane, Tennyson L.; Alam, Rabeka; Maye, Mathew M.

2015-02-01

The DNA-mediated self-assembly of CdSe/CdS quantum rods (QRs) onto DNA origami is described. Two QR types with unique optical emission and high polarization were synthesized, and then functionalized with oligonucleotides (ssDNA) using a novel protection-deprotection approach, which harnessed ssDNA's tailorable rigidity and denaturation temperature to increase DNA coverage by reducing non-specific coordination and wrapping. The QR assembly was programmable, and occurred at two different assembly zones that had capture strands in parallel alignment. QRs with different optical properties were assembled, opening up future studies on orientation dependent QR FRET. The QR-origami conjugates could be purified via gel electrophoresis and sucrose gradient ultracentrifugation. Assembly yields, QR stoichiometry and orientation, as well as energy transfer implications were studied in light of QR distances, origami flexibility, and conditions.
Electronic supplementary information (ESI) available: Experimental conditions, DNA origami blueprint and sequences, FRET calculations. Additional Fig. S1-S13. See DOI: 10.1039/c4nr07662a
Recent advances in sequence assembly: principles and applications.

PubMed

Chen, Qingfeng; Lan, Chaowang; Zhao, Liang; Wang, Jianxin; Chen, Baoshan; Chen, Yi-Ping Phoebe

2017-11-01

The application of advanced sequencing technologies and the rapid growth of various sequence data have led to increasing interest in DNA sequence assembly. However, repeats and polymorphism occur frequently in genomes, and each of these has different impacts on assembly. Further, many new applications for sequencing, such as metagenomics regarding multiple species, have emerged in recent years. These not only give rise to higher complexity but also prevent short-read assembly in an efficient way. This article reviews the theoretical foundations that underlie current mapping-based assembly and de novo-based assembly, and highlights the key issues and feasible solutions that need to be considered. It focuses on how individual processes, such as optimal k-mer determination and error correction in assembly, rely on intelligent strategies or high-performance computation. We also survey primary algorithms/software and offer a discussion on the emerging challenges in assembly. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
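The role of k-mers in error correction, mentioned above, can be illustrated with a small sketch. It assumes the usual heuristic that k-mers seen only rarely across a read set are more likely to contain sequencing errors than k-mers seen many times; the function names, the toy reads, and the threshold are illustrative and do not come from the review.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def rare_kmers(counts, min_count):
    """Return the k-mers seen fewer than `min_count` times."""
    return {kmer for kmer, n in counts.items() if n < min_count}


# Two identical reads plus one read with a single substitution (A -> T).
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
print(sorted(rare_kmers(counts, min_count=2)))  # ['CGTT', 'GTTC']
```

The choice of k is a trade-off: a larger k makes k-mers more specific and better at resolving repeats, but each sequencing error then corrupts more k-mers and higher coverage is needed, which is why the review treats optimal k-mer determination as its own problem.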
RNA-programmed genome editing in human cells

PubMed Central

Jinek, Martin; East, Alexandra; Cheng, Aaron; Lin, Steven; Ma, Enbo; Doudna, Jennifer

2013-01-01

Type II CRISPR immune systems in bacteria use a dual RNA-guided DNA endonuclease, Cas9, to cleave foreign DNA at specific sites. We show here that Cas9 assembles with hybrid guide RNAs in human cells and can induce the formation of double-strand DNA breaks (DSBs) at a site complementary to the guide RNA sequence in genomic DNA. This cleavage activity requires both Cas9 and the complementary binding of the guide RNA. Experiments using extracts from transfected cells show that RNA expression and/or assembly into Cas9 is the limiting factor for Cas9-mediated DNA cleavage. In addition, we find that extension of the RNA sequence at the 3′ end enhances DNA targeting activity in vivo. These results show that RNA-programmed genome editing is a facile strategy for introducing site-specific genetic changes in human cells. DOI: http://dx.doi.org/10.7554/eLife.00471.001 PMID:23386978
RapGene: a fast and accurate strategy for synthetic gene assembly in Escherichia coli

PubMed Central

Zampini, Massimiliano; Stevens, Pauline Rees; Pachebat, Justin A.; Kingston-Smith, Alison; Mur, Luis A. J.; Hayes, Finbarr

2015-01-01

The ability to assemble DNA sequences de novo through efficient and powerful DNA fabrication methods is one of the foundational technologies of synthetic biology. Gene synthesis, in particular, has been considered the main driver for the emergence of this new scientific discipline. Here we describe RapGene, a rapid gene assembly technique which was successfully tested for the synthesis and cloning of both prokaryotic and eukaryotic genes through a ligation independent approach. The method developed in this study is a complete bacterial gene synthesis platform for the quick, accurate and cost effective fabrication and cloning of gene-length sequences that employ the widely used host Escherichia coli. PMID:26062748
The value of new genome references.

PubMed

Worley, Kim C; Richards, Stephen; Rogers, Jeffrey

2017-09-15

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.
An integrated pipeline for next generation sequencing and annotation of the complete mitochondrial genome of the giant intestinal fluke, Fasciolopsis buski (Lankester, 1857) Looss, 1899

PubMed Central

Biswal, Devendra Kumar; Ghatani, Sudeep; Shylla, Jollin A.; Sahu, Ranjana; Mullapudi, Nandita

2013-01-01

Helminths include both parasitic nematodes (roundworms) and platyhelminths (trematode and cestode flatworms) that are abundant, and are of clinical importance. The genetic characterization of parasitic flatworms using advanced molecular tools is central to the diagnosis and control of infections. Although the nuclear genome houses suitable genetic markers (e.g., in ribosomal (r) DNA) for species identification and molecular characterization, the mitochondrial (mt) genome consistently provides a rich source of novel markers for informative systematics and epidemiological studies. In the last decade, there have been some important advances in mtDNA genomics of helminths, especially lung flukes, liver flukes and intestinal flukes. Fasciolopsis buski, often called the giant intestinal fluke, is one of the largest digenean trematodes infecting humans and found primarily in Asia, in particular the Indian subcontinent. Next-generation sequencing (NGS) technologies now provide opportunities for high throughput sequencing, assembly and annotation within a short span of time. Herein, we describe a high-throughput sequencing and bioinformatics pipeline for mt genomics for F. buski that emphasizes the utility of short read NGS platforms such as Ion Torrent and Illumina in successfully sequencing and assembling the mt genome using innovative approaches for PCR primer design as well as assembly. We took advantage of our NGS whole genome sequence data (unpublished so far) for F. buski and its comparison with available data for the Fasciola hepatica mtDNA as the reference genome for design of precise and specific primers for amplification of mt genome sequences from F. buski. A long-range PCR was carried out to create an NGS library enriched in mt DNA sequences. Two different NGS platforms were employed for complete sequencing, assembly and annotation of the F. buski mt genome. 
The complete mt genome sequence of the intestinal fluke comprises 14,118 bp and is thus the shortest trematode mitochondrial genome sequenced to date. The noncoding control regions are separated into two parts by the tRNA-Gly gene and do not contain either tandem repeats or secondary structures, which are typical for trematode control regions. The gene content and arrangement are identical to that of F. hepatica. The F. buski mtDNA genome has a close resemblance with F. hepatica and has a similar gene order tallying with that of other trematodes. The mtDNA of the intestinal fluke is reported herein for the first time by our group and should help investigate Fasciolidae taxonomy and systematics with the aid of mtDNA NGS data. Moreover, it would serve as a resource for comparative mitochondrial genomics and systematic studies of trematode parasites. PMID:24255820
α satellite DNA variation and function of the human centromere

PubMed Central

Sullivan, Lori L.; Chew, Kimberline

2017-01-01

ABSTRACT Genomic variation is a source of functional diversity that is typically studied in genic and non-coding regulatory regions. However, the extent of variation within noncoding portions of the human genome, particularly highly repetitive regions, and the functional consequences are not well understood. Satellite DNA, including α satellite DNA found at human centromeres, comprises up to 10% of the genome, but is difficult to study because its repetitive nature hinders contiguous sequence assemblies. We recently described variation within α satellite DNA that affects centromere function. On human chromosome 17 (HSA17), we showed that size and sequence polymorphisms within primary array D17Z1 are associated with chromosome aneuploidy and defective centromere architecture. However, HSA17 can counteract this instability by assembling the centromere at a second, “backup” array lacking variation. Here, we discuss our findings in a broader context of human centromere assembly, and highlight areas of future study to uncover links between genomic and epigenetic features of human centromeres. PMID:28406740
DNA-programmable nanoparticle crystallization.

PubMed

Park, Sung Yong; Lytton-Jean, Abigail K R; Lee, Byeongdu; Weigand, Steven; Schatz, George C; Mirkin, Chad A

2008-01-31

It was first shown more than ten years ago that DNA oligonucleotides can be attached to gold nanoparticles rationally to direct the formation of larger assemblies. Since then, oligonucleotide-functionalized nanoparticles have been developed into powerful diagnostic tools for nucleic acids and proteins, and into intracellular probes and gene regulators. In contrast, the conceptually simple yet powerful idea that functionalized nanoparticles might serve as basic building blocks that can be rationally assembled through programmable base-pairing interactions into highly ordered macroscopic materials remains poorly developed. So far, the approach has mainly resulted in polymerization, with modest control over the placement of, the periodicity in, and the distance between particles within the assembled material. That is, most of the materials obtained thus far are best classified as amorphous polymers, although a few examples of colloidal crystal formation exist. Here, we demonstrate that DNA can be used to control the crystallization of nanoparticle-oligonucleotide conjugates to the extent that different DNA sequences guide the assembly of the same type of inorganic nanoparticle into different crystalline states. We show that the choice of DNA sequences attached to the nanoparticle building blocks, the DNA linking molecules and the absence or presence of a non-bonding single-base flexor can be adjusted so that gold nanoparticles assemble into micrometre-sized face-centred-cubic or body-centred-cubic crystal structures. Our findings thus clearly demonstrate that synthetically programmable colloidal crystallization is possible, and that a single-component system can be directed to form different structures.
Cooperative heteroassembly of the adenoviral L4-22K and IVa2 proteins onto the viral packaging sequence DNA.

PubMed

Yang, Teng-Chieh; Maluf, Nasib Karl

2012-02-21

Human adenovirus (Ad) is an icosahedral, double-stranded DNA virus. Viral DNA packaging refers to the process whereby the viral genome becomes encapsulated by the viral particle. In Ad, activation of the DNA packaging reaction requires at least three viral components: the IVa2 and L4-22K proteins and a section of DNA within the viral genome, called the packaging sequence. Previous studies have shown that the IVa2 and L4-22K proteins specifically bind to conserved elements within the packaging sequence and that these interactions are absolutely required for the observation of DNA packaging. However, the equilibrium mechanism for assembly of IVa2 and L4-22K onto the packaging sequence has not been determined. Here we characterize the assembly of the IVa2 and L4-22K proteins onto truncated packaging sequence DNA by analytical sedimentation velocity and equilibrium methods. At limiting concentrations of L4-22K, we observe a species with two IVa2 monomers and one L4-22K monomer bound to the DNA. In this species, the L4-22K monomer is promoting positive cooperative interactions between the two bound IVa2 monomers. As L4-22K levels are increased, we observe a species with one IVa2 monomer and three L4-22K monomers bound to the DNA. To explain this result, we propose a model in which L4-22K self-assembly on the DNA competes with IVa2 for positive heterocooperative interactions, destabilizing binding of the second IVa2 monomer. Thus, we propose that L4-22K levels control the extent of cooperativity observed between adjacently bound IVa2 monomers. We have also determined the hydrodynamic properties of all observed stoichiometric species; we observe that species with three L4-22K monomers bound have more extended conformations than species with a single L4-22K bound. We suggest this might reflect a molecular switch that controls insertion of the viral DNA into the capsid.
Comparative scaffolding and gap filling of ancient bacterial genomes applied to two ancient Yersinia pestis genomes

PubMed Central

Doerr, Daniel; Chauve, Cedric

2017-01-01

Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing rendered possible the sequencing of whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging and often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method AGapEs (ancestral gap estimation) to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets from the London and Marseilles outbreaks, for which we obtained highly improved genome assemblies for both genomes, comprised of, respectively, five and six scaffolds with 95 % of the assemblies supported by ancient reads. We analysed the genome evolution between both ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains. PMID:29114402
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed

Horn, T; Chang, C A; Urdea, M S

1997-12-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.
An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

PubMed Central

Horn, T; Chang, C A; Urdea, M S

1997-01-01

The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Tuning the Cavity Size and Chirality of Self-Assembling 3D DNA Crystals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simmons, Chad R.; Zhang, Fei; MacCulloch, Tara

The foundational goal of structural DNA nanotechnology (the field that uses oligonucleotides as a molecular building block for the programmable self-assembly of nanostructured systems) was to use DNA to construct three-dimensional (3D) lattices for solving macromolecular structures. The programmable nature of DNA makes it an ideal system for rationally constructing self-assembled crystals and immobilizing guest molecules in a repeating 3D array through their specific stereospatial interactions with the scaffold. In this work, we have extended a previously described motif (4 × 5) by expanding the structure to a system that links four double-helical layers; we use a central weaving oligonucleotide containing a sequence of four six-base repeats (4 × 6), forming a matrix of layers that are organized and dictated by a series of Holliday junctions. In addition, we have assembled mirror image crystals (l-DNA) with the identical sequence that are completely resistant to nucleases. Bromine and selenium derivatives were obtained for the l- and d-DNA forms, respectively, allowing phase determination for both forms and solution of the resulting structures to 3.0 and 3.05 Å resolution. Both right- and left-handed forms crystallized in the trigonal space groups with mirror image 3-fold helical screw axes P32 and P31 for each motif, respectively. The structures reveal a highly organized array of discrete and well-defined cavities that are suitable for hosting guest molecules and allow us to dictate a priori the assembly of guest–DNA conjugates with a specified crystalline hand.
Rhipicephalus microplus strain Deutsch, 10 BAC clone sequences

USDA-ARS's Scientific Manuscript database

The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. We used labeled DNA probes from the coding reg...
Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

PubMed Central

Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141
Capturing the biofuel wellhead and powerhouse: the chloroplast and mitochondrial genomes of the leguminous feedstock tree Pongamia pinnata.

PubMed

Kazakoff, Stephen H; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T; Gresshoff, Peter M

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
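The record above is truncated, so the tool and criteria the authors used are not known here. Purely as an illustration of how SSRs can be located computationally in a scaffold, the sketch below uses a regular expression to find short motifs repeated in tandem. The function name `find_ssrs`, the parameters `max_unit` and `min_repeats`, and the toy sequence are all assumptions, not details from the study.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for tandem repeats of short motifs.

    max_unit and min_repeats are hypothetical thresholds; published SSR
    surveys usually set different minimum copy numbers per motif length.
    """
    # Lazy unit match so the shortest repeating motif is reported;
    # the backreference \1 then requires further tandem copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# Toy scaffold fragment, invented for this example only.
scaffold = "GGC" + "AT" * 4 + "CCG" + "AGC" * 3 + "TT"
ssrs = find_ssrs(scaffold)
```

A regex scan like this only reports perfect repeats; imperfect or compound microsatellites would need a more tolerant approach.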

GFinisher: a new strategy to refine and finish bacterial genome assemblies

NASA Astrophysics Data System (ADS)

Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.

2016-10-01

Despite developments in DNA sequencing technology that have improved the number and length of reads, the reconstruction of complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented into contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a Fuzzy GC skew graph for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
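The abstract above does not describe how GFinisher's fuzzy variant is computed, so the following is only a sketch of the ordinary cumulative GC skew that such graphs build on, where skew per window is (G − C) / (G + C). The function names and the window size are assumptions for illustration and this is not GFinisher's implementation.

```python
def gc_skew(window):
    """Plain GC skew of one window: (G - C) / (G + C), or 0 if no G/C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0

def cumulative_gc_skew(seq, window=4):
    """Running sum of GC skew over consecutive non-overlapping windows.

    The window size is a hypothetical choice; real contigs would use
    windows of hundreds or thousands of bases.
    """
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        total += gc_skew(seq[start:start + window])
        curve.append(total)
    return curve

# Toy contig, invented for this example only.
curve = cumulative_gc_skew("GGGGCCCCGGCC", window=4)
```

In bacterial chromosomes the cumulative curve typically changes direction near the replication origin and terminus, which is why an unexpected turning point inside a contig can hint at a misassembly.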
GFinisher: a new strategy to refine and finish bacterial genome assemblies.

PubMed

Guizelini, Dieval; Raittz, Roberto T; Cruz, Leonardo M; Souza, Emanuel M; Steffens, Maria B R; Pedrosa, Fabio O

2016-10-10

Despite developments in DNA sequencing technology that have improved the number and length of reads, the reconstruction of complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented into contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a Fuzzy GC skew graph for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
Whole-genome sequencing in bacteriology: state of the art

PubMed Central

Dark, Michael J

2013-01-01

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115
DNA-imprinted polymer nanoparticles with monodispersity and prescribed DNA-strand patterns

NASA Astrophysics Data System (ADS)

Trinh, Tuan; Liao, Chenyi; Toader, Violeta; Barłóg, Maciej; Bazzi, Hassan S.; Li, Jianing; Sleiman, Hanadi F.

2018-02-01

As colloidal self-assembly increasingly approaches the complexity of natural systems, an ongoing challenge is to generate non-centrosymmetric structures. For example, patchy, Janus or living crystallization particles have significantly advanced the area of polymer assembly. It has remained difficult, however, to devise polymer particles that associate in a directional manner, with controlled valency and recognition motifs. Here, we present a method to transfer DNA patterns from a DNA cage to a polymeric nanoparticle encapsulated inside the cage in three dimensions. The resulting DNA-imprinted particles (DIPs), which are 'moulded' on the inside of the DNA cage, consist of a monodisperse crosslinked polymer core with a predetermined pattern of different DNA strands covalently 'printed' on their exterior, and further assemble with programmability and directionality. The number, orientation and sequence of DNA strands grafted onto the polymeric core can be controlled during the process, and the strands are addressable independently of each other.
Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

PubMed

Christen, Matthias; Deutsch, Samuel; Christen, Beat

2015-08-21

Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
DIVA V2.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

CHEN, JOANNA; SIMIRENKO, LISA; TAPASWI, MANJIRI

The DIVA software interfaces a process in which researchers design their DNA with a web-based graphical user interface, submit their designs to a central queue, and a few weeks later receive their sequence-verified clonal constructs. Each researcher independently designs the DNA to be constructed with a web-based BioCAD tool, and presses a button to submit their designs to a central queue. Researchers have web-based access to their DNA design queues, and can track the progress of their submitted designs as they progress from "evaluation", to "waiting for reagents", to "in progress", to "complete". Researchers access their completed constructs through the central DNA repository. Along the way, all DNA construction success/failure rates are captured in a central database. Once a design has been submitted to the queue, a small number of dedicated staff evaluate the design for feasibility and provide feedback to the responsible researcher if the design is either unreasonable (e.g., encompasses a combinatorial library of a billion constructs) or small design changes could significantly facilitate the downstream implementation process. The dedicated staff then use DNA assembly design automation software to optimize the DNA construction process for the design, leveraging existing parts from the DNA repository where possible and ordering synthetic DNA where necessary. SynTrack software manages the physical locations and availability of the various requisite reagents and process inputs (e.g., DNA templates). Once all requisite process inputs are available, the design progresses from "waiting for reagents" to "in progress" in the design queue. Human-readable and machine-parseable DNA construction protocols output by the DNA assembly design automation software are then executed by the dedicated staff exploiting lab automation devices wherever possible.
Since all the employed DNA construction methods are sequence-agnostic and standardized (utilizing the same enzymatic master mixes and reaction conditions), completely independent DNA construction tasks can be aggregated into the same multi-well plates and pursued in parallel. The resulting sets of cloned constructs can then be screened by high-throughput next-gen sequencing platforms for sequence correctness. A combination of long read-length (e.g., PacBio) and paired-end read platforms (e.g., Illumina) would be exploited depending on the particular task at hand (e.g., PacBio might be sufficient to screen a set of pooled constructs with significant gene divergence). Post sequence verification, designs for which at least one correct clone was identified will progress to a "complete" status, while designs for which no correct clones were identified will progress to a "failure" status. Depending on the failure mode (e.g., no transformants), and how many prior attempts/variations of assembly protocol have been already made for a given design, subsequent attempts may be made or the design can progress to a "permanent failure" state. All success and failure rate information will be captured during the process, including at which stage a given clonal construction procedure failed (e.g., no PCR product) and what the exact failure was (e.g. assembly piece 2 missing). This success/failure rate data can be leveraged to refine the DNA assembly design process.
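The queue statuses named in the DIVA description can be read as a small state machine. The sketch below is one way to model them in Python; the status strings come from the record above, but the class name `Design`, the `ALLOWED` table, and in particular the choice that a retry moves a design from "failure" back to "in progress" are assumptions, since the record does not say which status a repeated attempt re-enters.

```python
# Allowed status transitions. Status names are from the DIVA description;
# the "failure" -> "in progress" retry edge is a hypothetical choice.
ALLOWED = {
    "evaluation": {"waiting for reagents"},
    "waiting for reagents": {"in progress"},
    "in progress": {"complete", "failure"},
    "failure": {"in progress", "permanent failure"},
    "complete": set(),
    "permanent failure": set(),
}

class Design:
    """A submitted DNA design tracked through the queue (illustrative)."""

    def __init__(self, name):
        self.name = name
        self.status = "evaluation"
        self.history = ["evaluation"]

    def advance(self, new_status):
        """Move to new_status, rejecting transitions not in ALLOWED."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot go from {self.status!r} to {new_status!r}"
            )
        self.status = new_status
        self.history.append(new_status)

# Example: one failed attempt followed by a successful retry.
design = Design("demo-construct")
for step in ["waiting for reagents", "in progress", "failure",
             "in progress", "complete"]:
    design.advance(step)
```

Keeping the full `history` list mirrors the record's point that success and failure information is captured at every stage and can later be used to refine the design process.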
A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing

PubMed Central

Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante

2008-01-01

Summary A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small. PMID:18692465
Comparison of complete mitochondrial DNA sequences between old and new world strains of the cowpea aphid, Aphis craccivora (Hemiptera: Aphididae)

USDA-ARS's Scientific Manuscript database

Mitochondrial DNA provides useful tools for inferring population genetic structure within a species and phylogenetic relationships between species. The complete mitogenome sequences were assembled from strains of the cowpea aphids, Aphis craccivora, from the old (15,308 bp) and new world (15,305 bp...
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

PubMed

Anderson, Olin D; Huo, Naxin; Gu, Yong Q

2013-06-01

The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Retrosynthetic Analysis-Guided Breaking Tile Symmetry for the Assembly of Complex DNA Nanostructures.

PubMed

Wang, Pengfei; Wu, Siyu; Tian, Cheng; Yu, Guimei; Jiang, Wen; Wang, Guansong; Mao, Chengde

2016-10-11

Current tile-based DNA self-assembly produces simple repetitive or highly symmetric structures. In the case of 2D lattices, the unit cell often contains only one basic tile because the tiles often are symmetric (in terms of either the backbone or the sequence). In this work, we have applied retrosynthetic analysis to determine the minimal asymmetric units for complex DNA nanostructures. Such analysis guides us to break the intrinsic structural symmetries of the tiles to achieve high structural complexities. This strategy has led to the construction of several DNA nanostructures that are not accessible from conventional symmetric tile designs. Along with previous studies, herein we have established a set of four fundamental rules regarding tile-based assembly. Such rules could serve as guidelines for the design of DNA nanostructures.
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

PubMed

Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

2016-02-27

In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies.
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Modelling of DNA-Mediated Two- and Three-Dimensional Protein-Protein and Protein-Nanoparticle Self-Assembly

NASA Astrophysics Data System (ADS)

Millan, Jaime; McMillan, Janet; Brodin, Jeff; Lee, Byeongdu; Mirkin, Chad; Olvera de La Cruz, Monica

Programmable DNA interactions represent a robust scheme to self-assemble a rich variety of tunable superlattices, where intrinsic and in some cases undesirable interactions between nanoscale building blocks are replaced by DNA hybridization events. Recent advances in synthesis have allowed the extension of this successful scheme to proteins, where DNA distribution can be tuned independently of protein shape by selectively addressing surface residues, giving rise to assembly properties in three-dimensional protein-nanoparticle superlattices that depend on DNA distribution. In parallel with these advances, we introduced a scalable coarse-grained model that faithfully reproduces the previously observed co-assemblies of nanoparticles and protein conjugates. Herein, we implement this numerical model to explain the stability of complex protein-nanoparticle binary superlattices and to elucidate experimentally inaccessible features such as protein orientation. We will also discuss systematic studies that highlight the role of DNA distribution and sequence in two-dimensional protein-protein and protein-nanoparticle superlattices.
Whole-genome random sequencing and assembly of Haemophilus influenzae Rd

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fleischmann, R.D.; Adams, M.D.; White, O.

1995-07-28

An approach for genome analysis based on sequencing and assembly of unselected pieces of DNA from the whole chromosome has been applied to obtain the complete nucleotide sequence (1,830,137 base pairs) of the genome from the bacterium Haemophilus influenzae Rd. This approach eliminates the need for initial mapping efforts and is therefore applicable to the vast array of microbial species for which genome maps are unavailable. The H. influenzae Rd genome sequence (Genome Sequence DataBase accession number L42023) represents the only complete genome sequence from a free-living organism. 46 refs., 4 figs., 4 tabs.
A Versatile Microfluidic Device for Automating Synthetic Biology.

PubMed

Shih, Steve C C; Goyal, Garima; Kim, Peter W; Koutsoubelis, Nicolas; Keasling, Jay D; Adams, Paul D; Hillson, Nathan J; Singh, Anup K

2015-10-16

New microbes are being engineered that contain the genetic circuitry, metabolic pathways, and other cellular functions required for a wide range of applications such as producing biofuels, biobased chemicals, and pharmaceuticals. Although currently available tools are useful in improving the synthetic biology process, further improvements in physical automation would help to lower the barrier of entry into this field. We present an innovative microfluidic platform for assembling DNA fragments with 10× lower volumes (compared to that of current microfluidic platforms) and with integrated region-specific temperature control and on-chip transformation. Integration of these steps minimizes the loss of reagents and products compared to that with conventional methods, which require multiple pipetting steps. For assembling DNA fragments, we implemented three commonly used DNA assembly protocols on our microfluidic device: Golden Gate assembly, Gibson assembly, and yeast assembly (i.e., TAR cloning, DNA Assembler). We demonstrate the utility of these methods by assembling two combinatorial libraries of 16 plasmids each. Each DNA plasmid is transformed into Escherichia coli or Saccharomyces cerevisiae using on-chip electroporation and further sequenced to verify the assembly. We anticipate that this platform will enable new research that can integrate this automated microfluidic platform to generate large combinatorial libraries of plasmids and will help to expedite the overall synthetic biology process.
Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns.

PubMed

Tikhomirov, Grigory; Petersen, Philip; Qian, Lulu

2017-12-06

Self-assembled DNA nanostructures enable nanometre-precise patterning that can be used to create programmable molecular machines and arrays of functional materials. DNA origami is particularly versatile in this context because each DNA strand in the origami nanostructure occupies a unique position and can serve as a uniquely addressable pixel. However, the scale of such structures has been limited to about 0.05 square micrometres, hindering applications that demand a larger layout and integration with more conventional patterning methods. Hierarchical multistage assembly of simple sets of tiles can in principle overcome this limitation, but so far has not been sufficiently robust to enable successful implementation of larger structures using DNA origami tiles. Here we show that by using simple local assembly rules that are modified and applied recursively throughout a hierarchical, multistage assembly process, a small and constant set of unique DNA strands can be used to create DNA origami arrays of increasing size and with arbitrary patterns. We illustrate this method, which we term 'fractal assembly', by producing DNA origami arrays with sizes of up to 0.5 square micrometres and with up to 8,704 pixels, allowing us to render images such as the Mona Lisa and a rooster. We find that self-assembly of the tiles into arrays is unaffected by changes in surface patterns on the tiles, and that the yield of the fractal assembly process corresponds to about $0.95^{m-1}$ for arrays containing m tiles. When used in conjunction with a software tool that we developed that converts an arbitrary pattern into DNA sequences and experimental protocols, our assembly method is readily accessible and will facilitate the construction of sophisticated materials and devices with sizes similar to that of a bacterium using DNA nanostructures.
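To make the reported yield scaling concrete, the short sketch below evaluates it for a few array sizes. It assumes the yield expression is 0.95 raised to the power (m − 1), where m is the number of tiles; the function name `fractal_yield` and the tile counts in the loop are illustrative choices, not values taken from the paper.

```python
def fractal_yield(m, per_step=0.95):
    """Approximate array yield for m tiles, assuming yield = per_step ** (m - 1)."""
    if m < 1:
        raise ValueError("an array needs at least one tile")
    return per_step ** (m - 1)


if __name__ == "__main__":
    # Illustrative tile counts only; each added tile multiplies the yield by 0.95.
    for m in (1, 4, 16, 64):
        print(f"m = {m:>2}: yield ~ {fractal_yield(m):.3f}")
```

Under this reading the yield falls geometrically with the number of tiles, so the expected fraction of complete arrays shrinks quickly as arrays grow.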
Lattice-free prediction of three-dimensional structure of programmed DNA assemblies

PubMed Central

Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R.; Yan, Hao; Bathe, Mark

2014-01-01

DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497
Fusion of GFP to the M.EcoKI DNA methyltransferase produces a new probe of Type I DNA restriction and modification enzymes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Kai; Roberts, Gareth A.; Stephanou, Augoustinos S.

2010-07-23

Research highlights: Successful fusion of GFP to M.EcoKI DNA methyltransferase. GFP located at the C-terminus of the sequence-specificity subunit does not alter enzyme activity. FRET confirms the structural model of M.EcoKI bound to DNA. Abstract: We describe the fusion of enhanced green fluorescent protein to the C-terminus of the HsdS DNA sequence-specificity subunit of the Type I DNA modification methyltransferase M.EcoKI. The fusion expresses well in vivo and assembles with the two HsdM modification subunits. The fusion protein functions as a sequence-specific DNA methyltransferase protecting DNA against digestion by the EcoKI restriction endonuclease. The purified enzyme shows Förster resonance energy transfer to fluorescently-labelled DNA duplexes containing the target sequence and to fluorescently-labelled ocr protein, a DNA mimic that binds to the M.EcoKI enzyme. Distances determined from the energy transfer experiments corroborate the structural model of M.EcoKI.
One-Dimensional Multichromophor Arrays Based on DNA: From Self-Assembly to Light-Harvesting.

PubMed

Ensslen, Philipp; Wagenknecht, Hans-Achim

2015-10-20

Light-harvesting complexes collect light energy and deliver it by a cascade of energy and electron transfer processes to the reaction center where charge separation leads to storage as chemical energy. The design of artificial light-harvesting assemblies faces enormous challenges because several antenna chromophores need to be kept in close proximity but self-quenching needs to be avoided. Double stranded DNA as a supramolecular scaffold plays a promising role due to its characteristic structural properties. Automated DNA synthesis allows incorporation of artificial chromophore-modified building blocks, and sequence design allows precise control of the distances and orientations between the chromophores. The helical twist between the chromophores, which is induced by the DNA framework, controls energy and electron transfer and thereby reduces the self-quenching that is typically observed in chromophore aggregates. This Account summarizes covalently multichromophore-modified DNA and describes how such multichromophore arrays were achieved by Watson-Crick-specific and DNA-templated self-assembly. The covalent DNA systems were prepared by incorporation of chromophores as DNA base substitutions (either as C-nucleosides or with acyclic linkers as substitutes for the 2'-deoxyribofuranoside) and as DNA base modifications. Studies with DNA base substitutions revealed that distances but more importantly relative orientations of the chromophores govern the energy transfer efficiencies and thereby the light-harvesting properties. With DNA base substitutions, duplex stabilization was faced and could be overcome, for instance, by zipper-like placement of the chromophores in both strands. For both principal structural approaches, DNA-based light-harvesting antenna could be realized. 
The major disadvantages, however, for covalent multichromophore DNA conjugates are the poor yields of synthesis and the solubility issues for oligonucleotides with more than 5-10 chromophore modifications in a row. A logical alternative approach is to leave out the phosphodiester bridges between the chromophores and let chromophore-nucleoside conjugates self-assemble specifically along single stranded DNA as template. The self-organization of chromophores along the DNA template based on canonical base pairing would be advantageous because sequence selective base pairing could provide a structural basis for programmed complexity within the chromophore assembly. The self-assembly is governed by two interactions. The chromophore-nucleoside conjugates as guest molecules are recognized via hydrogen bonds to the corresponding counter bases in the single stranded DNA template. Moreover, the π-π interactions between the stacked chromophores stabilize these self-assembled constructs with increasing length. Longer DNA templates are more attractive for self-assembled antenna. The helicity in the stack of porphyrins as guest molecules assembled on the DNA template can be switched by environmental changes, such as pH variations. DNA-templated stacks of ethynyl pyrene and nile red exhibit left-handed chirality, which stands in contrast to similar covalent multichromophore-DNA conjugates with enforced right-handed helicity. With ethynyl nile red, it is possible to occupy every available binding site on the templates. Mixed assemblies of ethynyl pyrene and nile red show energy transfer and thereby provide a proof-of-principle that simple light-harvesting antennae can be obtained in a noncovalent and self-assembled fashion. With respect to the next important step, chemical storage of the absorbed light energy, future research has to focus on the coupling of sophisticated DNA-based light-harvesting antenna to reaction centers.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database, which provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
Structurally Ordered Nanowire Formation from Co-Assembly of DNA Origami and Collagen-Mimetic Peptides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Tao; Meyer, Travis A.; Modlin, Charles

In this paper, we describe the co-assembly of two different building units: collagen-mimetic peptides and DNA origami. Two peptides, CP++ and sCP++, are designed with a sequence comprising a central block (Pro-Hyp-Gly) and two positively charged domains (Pro-Arg-Gly) at both N- and C-termini. Co-assembly of peptides and DNA origami two-layer (TL) nanosheets affords the formation of one-dimensional nanowires with a repeating periodicity of approximately 10 nm. Structural analyses suggest a face-to-face stacking of DNA nanosheets with peptides aligned perpendicularly to the sheet surfaces. We demonstrate the potential of selective peptide-DNA association between face-to-face and edge-to-edge packing by tailoring the size of DNA nanostructures. Finally, this study presents an attractive strategy to create hybrid biomolecular assemblies from peptide and DNA-based building blocks that takes advantage of the intrinsic chemical and physical properties of the respective components to encode structural and, potentially, functional complexity within readily accessible biomimetic materials.

Structurally Ordered Nanowire Formation from Co-Assembly of DNA Origami and Collagen-Mimetic Peptides

DOE PAGES

Jiang, Tao; Meyer, Travis A.; Modlin, Charles; ...

2017-09-26

In this paper, we describe the co-assembly of two different building units: collagen-mimetic peptides and DNA origami. Two peptides, CP++ and sCP++, are designed with a sequence comprising a central block (Pro-Hyp-Gly) and two positively charged domains (Pro-Arg-Gly) at both N- and C-termini. Co-assembly of peptides and DNA origami two-layer (TL) nanosheets affords the formation of one-dimensional nanowires with a repeating periodicity of approximately 10 nm. Structural analyses suggest a face-to-face stacking of DNA nanosheets with peptides aligned perpendicularly to the sheet surfaces. We demonstrate the potential of selective peptide-DNA association between face-to-face and edge-to-edge packing by tailoring the size of DNA nanostructures. Finally, this study presents an attractive strategy to create hybrid biomolecular assemblies from peptide and DNA-based building blocks that takes advantage of the intrinsic chemical and physical properties of the respective components to encode structural and, potentially, functional complexity within readily accessible biomimetic materials.
Dramatic Increase in the Signal and Sensitivity of Detection via Self-Assembly of Branched DNA

PubMed Central

Kim, Kyung-Tae; Chae, Chi-Bom

2011-01-01

In molecular testing using PCR, the target DNA is amplified via PCR and the sequence of interest is investigated via hybridization with short oligonucleotide capture probes that are either in solution or immobilized on solid supports such as beads or glass slides. Here, we report the discovery of the assembly of DNA complex(es) between a capture probe and multiple strands of the PCR product. The DNA complex most likely has a branched structure. The assembly of branched DNA was facilitated by the product of asymmetric PCR. The amount of branched DNA assembled was increased fivefold when the asymmetric PCR product was denatured and hybridized with a capture probe all in the same PCR reaction mixture. The major branched DNA species appeared to contain three reverse strands (the strand complementary to the capture probe) and two forward strands. The DNA was sensitive to S1 nuclease, suggesting that it had single-stranded gaps. Branched DNA also appeared to be assembled with the capture probes immobilized on the surface of a solid support when the product of asymmetric PCR was hybridized. Assembly of the branched DNA was also increased when hybridization was performed in complete PCR reaction mixture, suggesting a requirement for DNA synthesis. Integration of asymmetric PCR, heat denaturation and hybridization in the same PCR reaction mixture with the capture probes immobilized on the surface of a solid support achieved a dramatic increase in the signal and sensitivity of DNA detection. Such a system should be advantageous for the development of automated processes for the detection of DNA. PMID:21870112
Toehold strand displacement-driven assembly of G-quadruplex DNA for enzyme-free and non-label sensitive fluorescent detection of thrombin.

PubMed

Xu, Yunying; Zhou, Wenjiao; Zhou, Ming; Xiang, Yun; Yuan, Ruo; Chai, Yaqin

2015-02-15

Based on a new signal amplification strategy by the toehold strand displacement-driven cyclic assembly of G-quadruplex DNA, the development of an enzyme-free and non-label aptamer sensing approach for sensitive fluorescent detection of thrombin is described. The target thrombin associates with the corresponding aptamer of the partial dsDNA probes and liberates single stranded initiation sequences, which trigger the toehold strand displacement assembly of two G-quadruplex containing hairpin DNAs. This toehold strand displacement reaction leads to the cyclic reuse of the initiation sequences and the production of DNA assemblies with numerous G-quadruplex structures. The fluorescent dye, N-Methyl mesoporphyrin IX, binds to these G-quadruplex structures and generates significantly amplified fluorescent signals to achieve highly sensitive detection of thrombin down to 5 pM. Besides, this method shows high selectivity towards the target thrombin against other control proteins. The developed thrombin sensing method herein avoids the modification of the probes and the involvement of any enzyme or nanomaterial labels for signal amplification. With the successful demonstration for thrombin detection, our approach can be easily adopted to monitor other target molecules in a simple, low-cost, sensitive and selective way by choosing appropriate aptamer/ligand pairs. Copyright © 2014 Elsevier B.V. All rights reserved.
Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns

NASA Astrophysics Data System (ADS)

Tikhomirov, Grigory; Petersen, Philip; Qian, Lulu

2017-12-01

Self-assembled DNA nanostructures enable nanometre-precise patterning that can be used to create programmable molecular machines and arrays of functional materials. DNA origami is particularly versatile in this context because each DNA strand in the origami nanostructure occupies a unique position and can serve as a uniquely addressable pixel. However, the scale of such structures has been limited to about 0.05 square micrometres, hindering applications that demand a larger layout and integration with more conventional patterning methods. Hierarchical multistage assembly of simple sets of tiles can in principle overcome this limitation, but so far has not been sufficiently robust to enable successful implementation of larger structures using DNA origami tiles. Here we show that by using simple local assembly rules that are modified and applied recursively throughout a hierarchical, multistage assembly process, a small and constant set of unique DNA strands can be used to create DNA origami arrays of increasing size and with arbitrary patterns. We illustrate this method, which we term ‘fractal assembly’, by producing DNA origami arrays with sizes of up to 0.5 square micrometres and with up to 8,704 pixels, allowing us to render images such as the Mona Lisa and a rooster. We find that self-assembly of the tiles into arrays is unaffected by changes in surface patterns on the tiles, and that the yield of the fractal assembly process corresponds to about $0.95^{m-1}$ for arrays containing m tiles. When used in conjunction with a software tool that we developed that converts an arbitrary pattern into DNA sequences and experimental protocols, our assembly method is readily accessible and will facilitate the construction of sophisticated materials and devices with sizes similar to that of a bacterium using DNA nanostructures.
preAssemble: a tool for automatic sequencer trace data processing.

PubMed

Adzhubei, Alexei A; Laerdahl, Jon K; Vlasova, Anna V

2006-01-17

Trace or chromatogram files (raw data) are produced by automatic nucleic acid sequencing equipment or sequencers. Each file contains information which can be interpreted by specialised software to reveal the sequence (base calling). This is done by the sequencer's proprietary software or by publicly available programs. Depending on the size of a sequencing project, the number of trace files can vary from just a few to thousands of files. Sequencing quality assessment on various criteria is important at the stage preceding clustering and contig assembly. Two major publicly available packages, Phred and Staden, are used by preAssemble to perform sequence quality processing. The preAssemble pre-assembly sequence processing pipeline has been developed for small to large scale automatic processing of DNA sequencer chromatogram (trace) data. The Staden Package Pregap4 module and the base-calling program Phred are utilized in the pipeline, which produces detailed and self-explanatory output that can be displayed with a web browser. preAssemble can be used successfully with very little previous experience; however, options for parameter tuning are provided for advanced users. preAssemble runs under UNIX and LINUX operating systems. It is available for downloading and will run as stand-alone software. It can also be accessed on the Norwegian Salmon Genome Project web site where preAssemble jobs can be run on the project server. preAssemble is a tool for quality assessment of sequences generated by automatic sequencing equipment. preAssemble is flexible since both interactive jobs on the preAssemble server and the stand-alone downloadable version are available. Virtually no previous experience is necessary to run a default preAssemble job; on the other hand, options for parameter tuning are provided. Consequently, preAssemble can be used as efficiently for just several trace files as for large scale sequence processing.
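As a rough illustration of the kind of quality assessment this record refers to, the sketch below works with Phred-style quality scores, which are conventionally defined as Q = -10 log10(P), where P is the estimated probability that a base call is wrong. This is not preAssemble's, Phred's, or Pregap4's code: the function names, the end-trimming rule, and the threshold of Q20 are hypothetical choices made for the example.

```python
def error_probability(q):
    """Convert a Phred-style quality score Q into an error probability P."""
    return 10 ** (-q / 10)


def trim_ends(quals, min_q=20):
    """Return (start, end) of the read after dropping low-quality ends.

    Bases are removed from each end while their score is below min_q.
    The slice quals[start:end] is what remains (possibly empty).
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return start, end


if __name__ == "__main__":
    quals = [5, 10, 30, 40, 8, 35, 12]  # made-up per-base scores
    start, end = trim_ends(quals)
    kept = quals[start:end]
    print("kept scores:", kept)
    print("expected errors in kept region:",
          sum(error_probability(q) for q in kept))
```

Note that this simple rule only trims the ends, so a low-scoring base inside the retained region (the 8 in the example) is kept; production pipelines typically use windowed or cumulative criteria instead.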
Virtual Genome Walking across the 32 Gb Ambystoma mexicanum genome; assembling gene models and intronic sequence.

PubMed

Evans, Teri; Johnson, Andrew D; Loose, Matthew

2018-01-12

Large repeat rich genomes present challenges for assembly using short read technologies. The 32 Gb axolotl genome is estimated to contain ~19 Gb of repetitive DNA making an assembly from short reads alone effectively impossible. Indeed, this model species has been sequenced to 20× coverage but the reads could not be conventionally assembled. Using an alternative strategy, we have assembled subsets of these reads into scaffolds describing over 19,000 gene models. We call this method Virtual Genome Walking as it locally assembles whole genome reads based on a reference transcriptome, identifying exons and iteratively extending them into surrounding genomic sequence. These assemblies are then linked and refined to generate gene models including upstream and downstream genomic, and intronic, sequence. Our assemblies are validated by comparison with previously published axolotl bacterial artificial chromosome (BAC) sequences. Our analyses of axolotl intron length, intron-exon structure, repeat content and synteny provide novel insights into the genic structure of this model species. This resource will enable new experimental approaches in axolotl, such as ChIP-Seq and CRISPR and aid in future whole genome sequencing efforts. The assembled sequences and annotations presented here are freely available for download from https://tinyurl.com/y8gydc6n . The software pipeline is available from https://github.com/LooseLab/iterassemble .
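The idea of locally extending a seed sequence with whole-genome reads can be sketched in a few lines. This is a toy illustration under strong assumptions, not the iterassemble pipeline: it uses exact k-mer matches on one strand only, ignores sequencing errors, reverse complements and repeats, and the names `extend_right` and `walk`, the default `k`, and the example sequences are all hypothetical.

```python
def extend_right(contig, reads, k=8, max_rounds=100):
    """Greedily extend contig to the right using reads that contain its last k bases."""
    if len(contig) < k:
        raise ValueError("seed must be at least k bases long")
    used = set()
    for _ in range(max_rounds):
        tail = contig[-k:]
        best, best_idx = "", None
        for i, read in enumerate(reads):
            if i in used:
                continue
            pos = read.find(tail)
            if pos == -1:
                continue
            overhang = read[pos + k:]  # bases in the read beyond the contig end
            if len(overhang) > len(best):
                best, best_idx = overhang, i
        if best_idx is None:  # no unused read extends the contig any further
            break
        contig += best
        used.add(best_idx)
    return contig


def walk(seed, reads, k=8):
    """Extend a seed (e.g. an exon) in both directions.

    Leftward extension is done by mirroring the strings and extending right.
    """
    right = extend_right(seed, reads, k)
    mirrored = extend_right(right[::-1], [r[::-1] for r in reads], k)
    return mirrored[::-1]


if __name__ == "__main__":
    genome = "ATGGCGTACCTTGAGCATTCGGAACTTAGC"  # made-up sequence
    reads = [genome[0:14], genome[8:20], genome[14:30]]
    seed = genome[10:18]
    print(walk(seed, reads, k=5) == genome)
```

In a real repeat-rich genome a k-mer would often match many reads ambiguously, which is presumably why the record describes linking and refining the local assemblies rather than relying on greedy extension alone.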
Katome: de novo DNA assembler implemented in rust

NASA Astrophysics Data System (ADS)

Neumann, Łukasz; Nowak, Robert M.; Kuśmirek, Wiktor

2017-08-01

Katome is a new de novo sequence assembler written in the Rust programming language, designed with respect to future parallelization of the algorithms, run time and memory usage optimization. The application uses new algorithms for the correct assembly of repetitive sequences. Performance and quality tests were performed on various data, comparing the new application to the dnaasm, ABySS and Velvet genome assemblers. Quality tests indicate that the new assembler creates more contigs than well-established solutions, but the contigs have better quality with regard to mismatches per 100 kbp and indels per 100 kbp. Additionally, benchmarks indicate that the Rust-based implementation outperforms the dnaasm, ABySS and Velvet assemblers, written in C++, in terms of assembly time. Lower memory usage in comparison to dnaasm is observed.
Centromeres and kinetochores of Brassicaceae.

PubMed

Lermontova, Inna; Sandmann, Michael; Demidov, Dmitri

2014-06-01

The centromere, the primary constriction of monocentric chromosomes, is essential for correct segregation of chromosomes during mitosis and meiosis. Centromeric DNA varies between different organisms in sequence composition and extension. The main components of centromeric and pericentromeric DNA of Brassicaceae species are centromeric satellite repeats. Centromeric DNA initiates assembly of the kinetochore, the large protein complex where the spindle fibers attach during nuclear division to pull sister chromatids apart. Kinetochore assembly is initiated by incorporation of the centromeric histone H3 cenH3 into centromeric nucleosomes. The spindle assembly checkpoint (SAC) acts during mitosis and meiosis at centromeres and maintains genome stability by preventing chromosome segregation before all kinetochores are correctly attached to microtubules. The function of the spindle assembly checkpoint in plants is still poorly understood. Here, we review recent advances of studies on structure and functional importance of centromeric DNA of Brassicaceae, assembly and function of cenH3 in Arabidopsis thaliana and characterization of core SAC proteins of A. thaliana in comparison with non-plant homologues.
Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

PubMed Central

2012-01-01

Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Switching bonds in a DNA gel: an all-DNA vitrimer.

PubMed

Romano, Flavio; Sciortino, Francesco

2015-02-20

We design an all-DNA system that behaves like vitrimers, innovative plastics with self-healing and stress-releasing properties. The DNA sequences are engineered to self-assemble first into tetra- and bifunctional units which, upon further cooling, bind to each other forming a fully bonded network gel. An innovative design of the binding regions of the DNA sequences, exploiting a double toehold-mediated strand displacement, generates a network gel which is able to reshuffle its bonds, retaining at all times full bonding. As in vitrimers, the rate of bond switching can be controlled via a thermally activated catalyst, which in the present design is very short DNA strands.
Paranemic Crossover DNA: There and Back Again.

PubMed

Wang, Xing; Chandrasekaran, Arun Richard; Shen, Zhiyong; Ohayon, Yoel P; Wang, Tong; Kizer, Megan E; Sha, Ruojie; Mao, Chengde; Yan, Hao; Zhang, Xiaoping; Liao, Shiping; Ding, Baoquan; Chakraborty, Banani; Jonoska, Natasha; Niu, Dong; Gu, Hongzhou; Chao, Jie; Gao, Xiang; Li, Yuhang; Ciengshin, Tanashaya; Seeman, Nadrian C

2018-06-18

Over the past 35 years, DNA has been used to produce various nanometer-scale constructs, nanomechanical devices, and walkers. Construction of complex DNA nanostructures relies on the creation of rigid DNA motifs. Paranemic crossover (PX) DNA is one such motif that has played many roles in DNA nanotechnology. Specifically, PX cohesion has been used to connect topologically closed molecules, to assemble a three-dimensional object, and to create two-dimensional DNA crystals. Additionally, a sequence-dependent nanodevice based on conformational change between PX and its topoisomer, JX2, has been used in robust nanoscale assembly lines, as a key component in a DNA transducer, and to dictate polymer assembly. Furthermore, the PX motif has recently found a new role directly in basic biology, by possibly serving as the molecular structure for double-stranded DNA homology recognition, a prominent feature of molecular biology and essential for many crucial biological processes. This review discusses the many attributes and usages of PX-DNA (its design, characteristics, applications, and potential biological relevance) and aims to accelerate the understanding of the PX-DNA motif in its many roles and manifestations.
Near complete genome sequence of Clostridium paradoxum strain JW-YL-7

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lancaster, Andrew; Utturkar, Sagar M.; Poole, Farris

2016-05-05

Clostridium paradoxum strain JW-YL-7 is a moderately thermophilic anaerobic alkaliphile isolated from the municipal sewage treatment plant in Athens, GA. We report the near-complete genome sequence of C. paradoxum strain JW-YL-7 obtained by using PacBio DNA sequencing and Pilon for sequence assembly refinement with Illumina data.
Company profile: Complete Genomics Inc.

PubMed

Reid, Clifford

2011-02-01

Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.
Chemical synthesis and characterization of branched oligodeoxyribonucleotides (bDNA) for use as signal amplifiers in nucleic acid quantification assays.

PubMed

Horn, T; Chang, C A; Urdea, M S

1997-12-01

The divergent synthesis of bDNA structures is described. This new type of branched DNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branching network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb molecules were assembled on a solid support using parameters optimized for bDNA synthesis. The chemistry was used to synthesize bDNA comb molecules containing 15 secondary sequences. The bDNA comb molecules were elaborated by enzymatic ligation into branched amplification multimers, large bDNA molecules (a total of 1068 nt) containing an average of 36 repeated DNA oligomer sequences, each capable of hybridizing specifically to an alkaline phosphatase-labeled oligonucleotide. The bDNA comb molecules were characterized by electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The branched amplification multimers have been used as signal amplifiers in nucleic acid quantification assays for detection of viral infection. It is possible to detect as few as 50 molecules with bDNA technology.
Chemical synthesis and characterization of branched oligodeoxyribonucleotides (bDNA) for use as signal amplifiers in nucleic acid quantification assays.

PubMed Central

Horn, T; Chang, C A; Urdea, M S

1997-01-01

The divergent synthesis of bDNA structures is described. This new type of branched DNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branching network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb molecules were assembled on a solid support using parameters optimized for bDNA synthesis. The chemistry was used to synthesize bDNA comb molecules containing 15 secondary sequences. The bDNA comb molecules were elaborated by enzymatic ligation into branched amplification multimers, large bDNA molecules (a total of 1068 nt) containing an average of 36 repeated DNA oligomer sequences, each capable of hybridizing specifically to an alkaline phosphatase-labeled oligonucleotide. The bDNA comb molecules were characterized by electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The branched amplification multimers have been used as signal amplifiers in nucleic acid quantification assays for detection of viral infection. It is possible to detect as few as 50 molecules with bDNA technology. PMID:9365266
Recent advances in DNA nanotechnology.

PubMed

Chidchob, Pongphak; Sleiman, Hanadi F

2018-05-08

DNA is a powerful guiding molecule to achieve the precise construction of arbitrary structures and high-resolution organization of functional materials. The combination of sequence programmability, rigidity and highly specific molecular recognition in this molecule has resulted in a wide range of exquisitely designed DNA frameworks. To date, the impressive potential of DNA nanomaterials has been demonstrated from fundamental research to technological advancements in materials science and biomedicine. This review presents a summary of some of the most recent developments in structural DNA nanotechnology regarding new assembly approaches and efforts in translating DNA nanomaterials into practical use. Recent work on incorporating blunt-end stacking and hydrophobic interactions as orthogonal instruction rules in DNA assembly, and several emerging applications of DNA nanomaterials will also be highlighted. Copyright © 2018. Published by Elsevier Ltd.
Peripheral infrastructure vectors and an extended set of plant parts for the Modular Cloning system

PubMed Central

Kretschmer, Carola; Gruetzner, Ramona; Löfke, Christian; Dagdas, Yasin; Bürstenbinder, Katharina; Marillonnet, Sylvestre

2018-01-01

Standardized DNA assembly strategies facilitate the generation of multigene constructs from collections of building blocks in plant synthetic biology. A common syntax for hierarchical DNA assembly following the Golden Gate principle employing Type IIs restriction endonucleases was recently developed, and underlies the Modular Cloning and GoldenBraid systems. In these systems, transcriptional units and/or multigene constructs are assembled from libraries of standardized building blocks, also referred to as phytobricks, in several hierarchical levels and by iterative Golden Gate reactions. Here, a toolkit containing further modules for the novel DNA assembly standards was developed. Intended for use with Modular Cloning, most modules are also compatible with GoldenBraid. Firstly, a collection of approximately 80 additional phytobricks is provided, comprising e.g. modules for inducible expression systems, promoters or epitope tags. Furthermore, DNA modules were developed for connecting Modular Cloning and Gateway cloning, either for toggling between systems or for standardized Gateway destination vector assembly. Finally, first instances of a “peripheral infrastructure” around Modular Cloning are presented: While available toolkits are designed for the assembly of plant transformation constructs, vectors were created to also use coding sequence-containing phytobricks directly in yeast two hybrid interaction or bacterial infection assays. The presented material will further enhance versatility of hierarchical DNA assembly strategies. PMID:29847550
A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

PubMed

Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John

2013-01-01

DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity: insects.
A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library

PubMed Central

Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by a one-step digestion/ligation reaction. 12 TALE constructs, including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences, were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
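The statement that 256 tetramers cover all combinations of 4 base pairs follows from 4^4 = 256. A minimal sketch, assuming a target whose length is a multiple of 4 and using hypothetical function names, enumerates the library and splits a target sequence into the tetramers that would be drawn from it:

```python
from itertools import product

BASES = "ACGT"


def all_tetramers():
    """Enumerate every 4-base combination (4**4 = 256 entries)."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def split_into_tetramers(target):
    """Split a target into consecutive, non-overlapping 4-base blocks.

    Assumption for this sketch: the target length is a multiple of 4.
    """
    if len(target) % 4 != 0:
        raise ValueError("target length must be a multiple of 4 in this sketch")
    return [target[i:i + 4] for i in range(0, len(target), 4)]


if __name__ == "__main__":
    library = set(all_tetramers())
    blocks = split_into_tetramers("ACGTTTGACCAG")
    print(len(library), blocks, all(b in library for b in blocks))
```

How the published method handles targets whose length is not a multiple of 4 is not stated in the abstract, so the sketch simply rejects them.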
A simple and efficient method for assembling TALE protein based on plasmid library.

PubMed

Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by a one-step digestion/ligation reaction. 12 TALE constructs, including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences, were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.

Biocompatible artificial DNA linker that is read through by DNA polymerases and is functional in Escherichia coli

PubMed Central

El-Sagheer, Afaf H.; Sanzone, A. Pia; Gao, Rachel; Tavassoli, Ali; Brown, Tom

2011-01-01

A triazole mimic of a DNA phosphodiester linkage has been produced by templated chemical ligation of oligonucleotides functionalized with 5′-azide and 3′-alkyne. The individual azide and alkyne oligonucleotides were synthesized by standard phosphoramidite methods and assembled using a straightforward ligation procedure. This highly efficient chemical equivalent of enzymatic DNA ligation has been used to assemble a 300-mer from three 100-mer oligonucleotides, demonstrating the total chemical synthesis of very long oligonucleotides. The base sequences of the DNA strands containing this artificial linkage were copied during PCR with high fidelity and a gene containing the triazole linker was functional in Escherichia coli. PMID:21709264
Genovo: De Novo Assembly for Metagenomes

NASA Astrophysics Data System (ADS)

Laserson, Jonathan; Jojic, Vladimir; Koller, Daphne

Next-generation sequencing technologies produce a large number of noisy reads from the DNA in a sample. Metagenomics and population sequencing aim to recover the genomic sequences of the species in the sample, which could be of high diversity. Methods geared towards single sequence reconstruction are not sensitive enough when applied in this setting. We introduce a generative probabilistic model of read generation from environmental samples and present Genovo, a novel de novo sequence assembler that discovers likely sequence reconstructions under the model. A Chinese restaurant process prior accounts for the unknown number of genomes in the sample. Inference is made by applying a series of hill-climbing steps iteratively until convergence. We compare the performance of Genovo to three other short read assembly programs across one synthetic dataset and eight metagenomic datasets created using the 454 platform, the largest of which has 311k reads. Genovo's reconstructions cover more bases and recover more genes than the other methods, and yield a higher assembly score.
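To make the role of the Chinese restaurant process (CRP) prior concrete, the sketch below samples a partition from a generic CRP: each new item joins an existing group with probability proportional to the group's size, or opens a new group with probability proportional to a concentration parameter alpha. In the setting described above the items would be reads and the groups candidate sequences, but this is only a generic sampler with assumed names; it is not Genovo's model or inference procedure.

```python
import random


def sample_crp(n_items, alpha, rng):
    """Sample group assignments for n_items under a CRP with concentration alpha > 0.

    Returns (assignments, counts), where counts[k] is the size of group k.
    """
    assignments = []
    counts = []
    for _ in range(n_items):
        # Existing group k has weight counts[k]; a new group has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new group
        else:
            counts[k] += 1     # join an existing group
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    rng = random.Random(0)
    labels, sizes = sample_crp(50, alpha=1.0, rng=rng)
    print(len(sizes), sizes)
```

Because the number of groups is not fixed in advance, a prior of this kind suits a sample whose number of genomes is unknown.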
GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

PubMed Central

Zhang, Hao; van Diepeningen, Anne D.; van der Lee, Theo A. J.; Waalwijk, Cees; de Hoog, G. Sybren

2016-01-01

GRAbB (Genomic Region Assembly by Baiting) is a new program dedicated to assembling specific genomic regions from NGS data. This approach is especially useful when dealing with multi-copy regions, such as the mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled although they contain interesting information from phylogenetic or epidemiologic perspectives; single-copy regions can also be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species, must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated, such as exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/). PMID:27308864
GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research.

PubMed

Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

2016-06-01

GRAbB (Genomic Region Assembly by Baiting) is a new program dedicated to assembling specific genomic regions from NGS data. This approach is especially useful when dealing with multi-copy regions, such as the mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled although they contain interesting information from phylogenetic or epidemiologic perspectives; single-copy regions can also be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species, must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated, such as exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).
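The general idea of baiting, selecting only those reads that resemble a known part of the target region before assembling them, can be sketched with shared k-mers. Everything here is an assumption made for illustration (function names, exact k-mer matching, k = 6, no reverse-complement handling, no assembly step); it does not reproduce how GRAbB itself is implemented.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def iterative_baiting(bait, reads, k=6, max_rounds=10):
    """Recruit reads sharing a k-mer with the bait, then reuse recruited reads as bait.

    Stops when a round recruits nothing new or after max_rounds.
    """
    bait_kmers = kmers(bait, k)
    selected = set()
    for _ in range(max_rounds):
        new = {r for r in reads
               if r not in selected and kmers(r, k) & bait_kmers}
        if not new:
            break
        selected |= new
        for r in new:
            bait_kmers |= kmers(r, k)  # grow the bait with recruited reads
    return selected


if __name__ == "__main__":
    reads = ["GTACGTGGTTCA", "GGTTCAGGCTAA", "TTTTTTTTTTTT"]
    print(sorted(iterative_baiting("ACGTACGTGG", reads)))
```

Only the recruited subset would then be passed to an assembler, which is why restricting the input to the region of interest reduces the computational load.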
Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Boehm, CR; Lienert, F

2013-12-28

In vitro recombination methods have enabled one-step construction of large DNA sequences from multiple parts. Although synthetic biological circuits can in principle be assembled in the same fashion, they typically contain repeated sequence elements such as standard promoters and terminators that interfere with homologous recombination. Here we use a computational approach to design synthetic, biologically inactive unique nucleotide sequences (UNSes) that facilitate accurate ordered assembly. Importantly, our designed UNSes make it possible to assemble parts with repeated terminator and insulator sequences, and thereby create insulated functional genetic circuits in bacteria and mammalian cells. Using UNS-guided assembly to construct repeating promoter-gene-terminator parts, we systematically varied gene expression to optimize production of a deoxychromoviridans biosynthetic pathway in Escherichia coli. We then used this system to construct complex eukaryotic AND-logic gates for genomic integration into embryonic stem cells. Construction was performed by using a standardized series of UNS-bearing BioBrick-compatible vectors, which enable modular assembly and facilitate reuse of individual parts. UNS-guided isothermal assembly is broadly applicable to the construction and optimization of genetic circuits and particularly those requiring tight insulation, such as complex biosynthetic pathways, sensors, counters and logic gates.
Molecular Microbial Analyses of the Mars Exploration Rovers Assembly Facility

NASA Technical Reports Server (NTRS)

Venkateswaran, Kasthuri; LaDuc, Myron T.; Newcombe, David; Kempf, Michael J.; Koke, John A.; Smoot, James C.; Smoot, Laura M.; Stahl, David A.

2004-01-01

During space exploration, the control of terrestrial microbes associated with robotic space vehicles intended to land on extraterrestrial solar system bodies is necessary to prevent forward contamination and maintain scientific integrity during the search for life. Microorganisms associated with the spacecraft assembly environment can be a source of contamination for the spacecraft. In this study, we have monitored the microbial burden of air samples of the Mars Exploration Rovers' assembly facility at the Kennedy Space Center utilizing complementary diagnostic tools. To estimate the microbial burden and identify potential contaminants in the assembly facility, several microbiological techniques were used including culturing, cloning and sequencing of 16S rRNA genes, DNA microarray analysis, and ATP assays to assess viable microorganisms. Culturing severely underestimated types and amounts of contamination since many of the microbes implicated by molecular analyses were not cultivable. In addition to the cultivation of Agrobacterium, Burkholderia and Bacillus species, the cloning approach retrieved 16S rDNA sequences of oligotrophs, symbionts, and γ-proteobacteria members. DNA microarray analysis based on rational probe design and dissociation curves complemented existing molecular techniques and produced a highly parallel, high resolution analysis of contaminating microbial populations. For instance, strong hybridization signals to probes targeting the Bacillus species indicated that members of this species were present in the assembly area samples; however, differences in dissociation curves between perfect-match and air sample sequences showed that these samples harbored nucleotide polymorphisms. Vegetative cells of several isolates were resistant when subjected to treatments of UVC (254 nm) and vapor H2O2 (4 mg/L). This study further validates the significance of non-cultivable microbes in association with spacecraft assembly facilities, as our analyses have identified several non-cultivable microbes likely to contaminate the surfaces of spacecraft hardware.
Comparing de novo genome assembly: the long and short of it.

PubMed

Narzisi, Giuseppe; Mishra, Bud

2011-04-29

Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS) have rekindled a growing interest in the whole-genome sequence assembly (WGSA) problem, thereby, inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even worse, widely used metrics to compare the assembled sequences emphasize only size, poorly capturing the contig quality and accuracy. This paper addresses these concerns: it highlights common anomalies in assembly accuracy through a rigorous study of several assemblers, compared under both standard metrics (N50, coverage, contig sizes, etc.) as well as a more comprehensive metric (Feature-Response Curves, FRC) that is introduced here; FRC transparently captures the trade-offs between contigs' quality against their sizes. For this purpose, most of the publicly available major sequence assemblers--both for low-coverage long (Sanger) and high-coverage short (Illumina) reads technologies--are compared. These assemblers are applied to microbial (Escherichia coli, Brucella, Wolbachia, Staphylococcus, Helicobacter) and partial human genome sequences (Chr. Y), using sequence reads of various read-lengths, coverages, accuracies, and with and without mate-pairs. It is hoped that, based on these evaluations, computational biologists will identify innovative sequence assembly paradigms, bioinformaticists will determine promising approaches for developing "next-generation" assemblers, and biotechnologists will formulate more meaningful design desiderata for sequencing technology platforms. A new software tool for computing the FRC metric has been developed and is available through the AMOS open-source consortium.
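For reference, the N50 metric named above can be computed from contig lengths alone, which illustrates the abstract's point that such metrics reflect size and say nothing about accuracy. The sketch below uses the usual definition (the contig length at which the running total, taken from longest to shortest, first reaches half of all assembled bases); the function name is an assumption, and the FRC metric is not reproduced here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Two assemblies with identical contig lengths receive the same N50 even if one is full of misjoined contigs, which is the gap a quality-aware metric is meant to address.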
A systematic comparison of error correction enzymes by next-generation sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lubock, Nathan B.; Zhang, Di; Sidore, Angus M.

Gene synthesis, the process of assembling gene-length fragments from shorter groups of oligonucleotides (oligos), is becoming an increasingly important tool in molecular and synthetic biology. The length, quality and cost of gene synthesis are limited by errors produced during oligo synthesis and subsequent assembly. Enzymatic error correction methods are cost-effective means to ameliorate errors in gene synthesis. Previous analyses of these methods relied on cloning and Sanger sequencing to evaluate their efficiencies, limiting quantitative assessment. Here, we develop a method to quantify errors in synthetic DNA by next-generation sequencing. We analyzed errors in model gene assemblies and systematically compared six different error correction enzymes across 11 conditions. We find that ErrASE and T7 Endonuclease I are the most effective at decreasing average error rates (up to 5.8-fold relative to the input), whereas MutS is the best for increasing the number of perfect assemblies (up to 25.2-fold). We are able to quantify differential specificities: for example, ErrASE preferentially corrects C/G transversions, whereas T7 Endonuclease I preferentially corrects A/T transversions. More generally, this experimental and computational pipeline is a fast, scalable and extensible way to analyze errors in gene assemblies, to profile error correction methods, and to benchmark DNA synthesis methods.
A systematic comparison of error correction enzymes by next-generation sequencing

DOE PAGES

Lubock, Nathan B.; Zhang, Di; Sidore, Angus M.; ...

2017-08-01

Gene synthesis, the process of assembling gene-length fragments from shorter groups of oligonucleotides (oligos), is becoming an increasingly important tool in molecular and synthetic biology. The length, quality and cost of gene synthesis are limited by errors produced during oligo synthesis and subsequent assembly. Enzymatic error correction methods are cost-effective means to ameliorate errors in gene synthesis. Previous analyses of these methods relied on cloning and Sanger sequencing to evaluate their efficiencies, limiting quantitative assessment. Here, we develop a method to quantify errors in synthetic DNA by next-generation sequencing. We analyzed errors in model gene assemblies and systematically compared six different error correction enzymes across 11 conditions. We find that ErrASE and T7 Endonuclease I are the most effective at decreasing average error rates (up to 5.8-fold relative to the input), whereas MutS is the best for increasing the number of perfect assemblies (up to 25.2-fold). We are able to quantify differential specificities: for example, ErrASE preferentially corrects C/G transversions, whereas T7 Endonuclease I preferentially corrects A/T transversions. More generally, this experimental and computational pipeline is a fast, scalable and extensible way to analyze errors in gene assemblies, to profile error correction methods, and to benchmark DNA synthesis methods.
Efficient self-assembly of DNA-functionalized fluorophores and gold nanoparticles with DNA functionalized silicon surfaces: the effect of oligomer spacers

PubMed Central

Milton, James A.; Patole, Samson; Yin, Huabing; Xiao, Qiang; Brown, Tom; Melvin, Tracy

2013-01-01

Although strategies for the immobilization of DNA oligonucleotides onto surfaces for bioanalytical and top-down bio-inspired nanobiofabrication approaches are well developed, the effect of introducing spacer molecules between the surface and the DNA oligonucleotide for the hybridization of nanoparticle–DNA conjugates has not been previously assessed in a quantitative manner. The hybridization efficiency of DNA oligonucleotides end-labelled with gold nanoparticles (1.4 or 10 nm diameter) with DNA sequences conjugated to silicon surfaces via hexaethylene glycol phosphate diester oligomer spacers (0, 1, 2, 6 oligomers) was found to be independent of spacer length. To quantify both the density of DNA strands attached to the surfaces and hybridization with the surface-attached DNA, new methodologies have been developed. Firstly, a simple approach based on fluorescence has been developed for determination of the immobilization density of DNA oligonucleotides. Secondly, an approach using mass spectrometry has been created to establish (i) the mean number of DNA oligonucleotides attached to the gold nanoparticles and (ii) the hybridization density of nanoparticle–oligonucleotide conjugates with the silicon surface–attached complementary sequence. These methods and results will be useful for application with nanosensors, the self-assembly of nanoelectronic devices and the attachment of nanoparticles to biomolecules for single-molecule biophysical studies. PMID:23361467
Chemiluminescent and chemiluminescence resonance energy transfer (CRET) detection of DNA, metal ions, and aptamer-substrate complexes using hemin/G-quadruplexes and CdSe/ZnS quantum dots.

PubMed

Freeman, Ronit; Liu, Xiaoqing; Willner, Itamar

2011-08-03

Nucleic acid subunits consisting of fragments of the horseradish peroxidase (HRP)-mimicking DNAzyme and aptamer domains against ATP or sequences recognizing Hg(2+) ions self-assemble, in the presence of ATP or Hg(2+), into the active hemin-G-quadruplex DNAzyme structure. The DNAzyme-generated chemiluminescence provides the optical readout for the sensing events. In addition, the DNAzyme-stimulated chemiluminescence resonance energy transfer (CRET) to CdSe/ZnS quantum dots (QDs) is implemented to develop aptamer or DNA sensing platforms. The self-assembly of the ATP-aptamer subunits/hemin-G-quadruplex DNAzyme, where one of the aptamer subunits is functionalized with CdSe/ZnS QDs, leads to the CRET signal. Also, the functionalization of QDs with a hairpin nucleic acid that includes the G-quadruplex sequence in a ''caged'' configuration is used to analyze DNA. The opening of the hairpin structure by the target DNA assembles the hemin-G-quadruplex DNAzyme that stimulates the CRET signal. By the application of three different sized QDs functionalized with different hairpins, the multiplexed analysis of three different DNA targets is demonstrated by the generation of three different CRET luminescence signals.
OligArch: A software tool to allow artificially expanded genetic information systems (AEGIS) to guide the autonomous self-assembly of long DNA constructs from multiple DNA single strands.

PubMed

Bradley, Kevin M; Benner, Steven A

2014-01-01

Synthetic biologists wishing to self-assemble large DNA (L-DNA) constructs from small DNA fragments made by automated synthesis need fragments that hybridize predictably. Such predictability is difficult to obtain with nucleotides built from just the four standard nucleotides. Natural DNA's peculiar combination of strong and weak G:C and A:T pairs, the context-dependence of the strengths of those pairs, unimolecular strand folding that competes with desired interstrand hybridization, and non-Watson-Crick interactions available to standard DNA, all contribute to this unpredictability. In principle, adding extra nucleotides to the genetic alphabet can improve the predictability and reliability of autonomous DNA self-assembly, simply by increasing the information density of oligonucleotide sequences. These extra nucleotides are now available as parts of artificially expanded genetic information systems (AEGIS), and tools are now available to generate entirely standard DNA from AEGIS DNA during PCR amplification. Here, we describe the OligArch (for "oligonucleotide architecting") software, an application that permits synthetic biologists to engineer optimally self-assembling DNA constructs from both six- and eight-letter AEGIS alphabets. This software has been used to design oligonucleotides that self-assemble to form complete genes from 20 or more single-stranded synthetic oligonucleotides. OligArch is therefore a key element of a scalable and integrated infrastructure for the rapid and designed engineering of biology.
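
The claim that extra nucleotides increase the information density of oligonucleotide sequences can be made concrete with a simple counting argument. The sketch below is only that: it is not part of OligArch, the function names are assumptions, and it ignores every thermodynamic factor the abstract mentions.

```python
import math


def bits_per_position(alphabet_size):
    """Maximum information per nucleotide position, in bits."""
    return math.log2(alphabet_size)


def distinct_sequences(alphabet_size, length):
    """Number of different sequences of the given length."""
    return alphabet_size ** length


# Standard DNA (4 letters) versus six- and eight-letter alphabets.
for size in (4, 6, 8):
    print(size, round(bits_per_position(size), 3), distinct_sequences(size, 10))
```

Under this counting view, a 10-nucleotide overlap drawn from an eight-letter alphabet has 1024 times as many possible sequences as one drawn from four letters, which leaves more room to choose overlaps that do not resemble each other.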
Exploring the Limits of DNA Size: Naphtho-homologated DNA Bases and Pairs

PubMed Central

Lee, Alex H. F.; Kool, Eric T.

2008-01-01

A new design for DNA bases and base pairs is described in which the pyrimidine bases are widened by naphtho-homologation. Two naphtho-homologated deoxyribosides, dyyT (1) and dyyC (2) were synthesized and could be incorporated into oligonucleotides as suitably protected phosphoramidite derivatives. The deoxyribosides were found to be fluorescent, with emission maxima at 446 and 433 nm, respectively. Studies with single substitutions of 1 and 2 in the natural DNA context revealed exceptionally strong base stacking propensity for both. Sequences containing multiple substitutions of 1 and 2 paired opposite adenine and guanine were subsequently mixed and studied by several analytical methods. Data from UV mixing experiments, FRET measurements, fluorescence quenching experiments, and hybridizations on beads suggest that complementary “doublewide DNA” (yyDNA) strands may self-assemble into helical complexes with 1:1 stoichiometry. Data from thermal denaturation plots and CD spectra were less conclusive. Control experiments in one sequence context gave evidence that yyDNA helices, if formed, are preferentially antiparallel and are sequence selective. Hypothesized base pairing schemes are analogous to Watson-Crick pairing, but with glycosidic C1′-C1′ distances widened by over 45%, to ca. 15.2 Å. The possible self-assembly of the double-wide DNA helix establishes a new limit for the size of information-encoding, DNA-like molecules, and the fluorescence of yyDNA bases suggests uses as reporters in monomeric and oligomeric forms. PMID:16834396
Sequence Identification, Recombinant Production, and Analysis of the Self-Assembly of Egg Stalk Silk Proteins from Lacewing Chrysoperla carnea.

PubMed

Neuenfeldt, Martin; Scheibel, Thomas

2017-06-13

Egg stalk silks of the common green lacewing Chrysoperla carnea likely comprise at least three different silk proteins. Based on the natural spinning process, it was hypothesized that these proteins self-assemble without shear stress, as adult lacewings do not use a spinneret. To examine this, the first sequence identification and determination of the gene expression profile of several silk proteins and various transcript variants thereof was conducted, and then the three major proteins were recombinantly produced in Escherichia coli encoded by their native complementary DNA (cDNA) sequences. Circular dichroism measurements indicated that the silk proteins in aqueous solutions had a mainly intrinsically disordered structure. The largest silk protein, which we named ChryC1, exhibited a lower critical solution temperature (LCST) behavior and self-assembled into fibers or film morphologies, depending on the conditions used. The second silk protein, ChryC2, self-assembled into nanofibrils and subsequently formed hydrogels. Circular dichroism and Fourier transform infrared spectroscopy confirmed conformational changes of both proteins into beta sheet rich structures upon assembly. ChryC3 did not self-assemble into any morphology under the tested conditions. Thereby, through this work, it could be shown that recombinant lacewing silk proteins can be produced and further used for studying the fiber formation of lacewing egg stalks.
Rapid and highly efficient construction of TALE-based transcriptional regulators and nucleases for genome modification.

PubMed

Li, Lixin; Piatek, Marek J; Atef, Ahmed; Piatek, Agnieszka; Wibowo, Anjar; Fang, Xiaoyun; Sabir, J S M; Zhu, Jian-Kang; Mahfouz, Magdy M

2012-03-01

Transcription activator-like effectors (TALEs) can be used as DNA-targeting modules by engineering their repeat domains to dictate user-selected sequence specificity. TALEs have been shown to function as site-specific transcriptional activators in a variety of cell types and organisms. TALE nucleases (TALENs), generated by fusing the FokI cleavage domain to TALE, have been used to create genomic double-strand breaks. The identity of the TALE repeat variable di-residues, their number, and their order dictate the DNA sequence specificity. Because TALE repeats are nearly identical, their assembly by cloning or even by synthesis is challenging and time consuming. Here, we report the development and use of a rapid and straightforward approach for the construction of designer TALE (dTALE) activators and nucleases with user-selected DNA target specificity. Using our plasmid set of 100 repeat modules, researchers can assemble repeat domains for any 14-nucleotide target sequence in one sequential restriction-ligation cloning step and in only 24 h. We generated several custom dTALEs and dTALENs with new target sequence specificities and validated their function by transient expression in tobacco leaves and in vitro DNA cleavage assays, respectively. Moreover, we developed a web tool, called idTALE, to facilitate the design of dTALENs and the identification of their genomic targets and potential off-targets in the genomes of several model species. Our dTALE repeat assembly approach along with the web tool idTALE will expedite genome-engineering applications in a variety of cell types and organisms including plants.
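
The statement that the identity, number, and order of the repeat variable di-residues (RVDs) dictate the DNA target can be sketched as a lookup. The mapping below is the commonly reported RVD code (NI for A, HD for C, NN for G, NG for T) and is an assumption here: the abstract does not list which RVDs the 100-module plasmid set uses. The function name and example target are hypothetical.

```python
# Commonly reported RVD-to-base preferences (assumed, not taken from the abstract).
# NN is often used for G, although it is also reported to tolerate A.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def target_to_rvds(target):
    """Return the ordered list of RVDs for a DNA target, one repeat per base."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


# A hypothetical 14-nucleotide target, the length mentioned in the abstract.
print(target_to_rvds("ACGTACGTACGTAC"))
```

Because each repeat reads one base, the repeat array is as long as the target and its order mirrors the target sequence.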
Self-Assembly of 3D DNA Crystals Containing a Torsionally Stressed Component

DOE PAGES

Hernandez, Carina; Birktoft, Jens J.; Ohayon, Yoel P.; ...

2017-10-05

There is an increasing appreciation for structural diversity of DNA that is of interest to both DNA nanotechnology and basic biology. Here, we have explored how DNA responds to torsional stress by building on a previously reported two-turn DNA tensegrity triangle and demonstrating that we could introduce an extra nucleotide pair (np) into the original sequence without affecting assembly and crystallization. The extra np imposes a significant torsional stress, which is accommodated by global changes throughout the B-DNA duplex and the DNA lattice. Furthermore, the work reveals a near-atomic structure of naked DNA under a torsional stress of approximately 14%, and thus provides an example of DNA distortions that occur without a requirement for either an external energy source or the free energy available from protein or drug binding.
Self-Assembly of 3D DNA Crystals Containing a Torsionally Stressed Component.

PubMed

Hernandez, Carina; Birktoft, Jens J; Ohayon, Yoel P; Chandrasekaran, Arun Richard; Abdallah, Hatem; Sha, Ruojie; Stojanoff, Vivian; Mao, Chengde; Seeman, Nadrian C

2017-11-16

There is an increasing appreciation for structural diversity of DNA that is of interest to both DNA nanotechnology and basic biology. Here, we have explored how DNA responds to torsional stress by building on a previously reported two-turn DNA tensegrity triangle and demonstrating that we could introduce an extra nucleotide pair (np) into the original sequence without affecting assembly and crystallization. The extra np imposes a significant torsional stress, which is accommodated by global changes throughout the B-DNA duplex and the DNA lattice. The work reveals a near-atomic structure of naked DNA under a torsional stress of approximately 14%, and thus provides an example of DNA distortions that occur without a requirement for either an external energy source or the free energy available from protein or drug binding. Copyright © 2017 Elsevier Ltd. All rights reserved.
Self-Assembly of 3D DNA Crystals Containing a Torsionally Stressed Component

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hernandez, Carina; Birktoft, Jens J.; Ohayon, Yoel P.

There is an increasing appreciation for structural diversity of DNA that is of interest to both DNA nanotechnology and basic biology. Here, we have explored how DNA responds to torsional stress by building on a previously reported two-turn DNA tensegrity triangle and demonstrating that we could introduce an extra nucleotide pair (np) into the original sequence without affecting assembly and crystallization. The extra np imposes a significant torsional stress, which is accommodated by global changes throughout the B-DNA duplex and the DNA lattice. Furthermore, the work reveals a near-atomic structure of naked DNA under a torsional stress of approximately 14%, and thus provides an example of DNA distortions that occur without a requirement for either an external energy source or the free energy available from protein or drug binding.
[The principle and application of the single-molecule real-time sequencing technology].

PubMed

Yanhu, Liu; Lu, Wang; Li, Yu

2015-03-01

The last decade witnessed the explosive development of third-generation sequencing strategies, including single-molecule real-time sequencing (SMRT), true single-molecule sequencing (tSMS™) and single-molecule nanopore DNA sequencing. In this review, we summarize the principle, performance and application of the SMRT sequencing technology. Compared with the traditional Sanger method and the next-generation sequencing (NGS) technologies, the SMRT approach has several advantages, including long read length, high speed, no need for PCR, and the capability of directly detecting epigenetic modifications. However, its low accuracy, with most errors being insertions and deletions, is a notable disadvantage, so the raw sequence data need to be corrected before assembly. To date, SMRT is a good fit for de novo genomic sequencing and high-quality assembly of small genomes. In the future, it is expected to play an important role in epigenetics, transcriptomic sequencing, and assemblies of large genomes.
Sequential self-assembly of DNA functionalized droplets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Yin; McMullen, Angus; Pontani, Lea-Laetitia

Complex structures and devices, both natural and manmade, are often constructed sequentially. From crystallization to embryogenesis, a nucleus or seed is formed and built upon. Sequential assembly allows for initiation, signaling, and logical programming, which are necessary for making enclosed, hierarchical structures. Though biology relies on such schemes, they have not been available in materials science. We demonstrate programmed sequential self-assembly of DNA functionalized emulsions. The droplets are initially inert because the grafted DNA strands are pre-hybridized in pairs. Active strands on initiator droplets then displace one of the paired strands and thus release its complement, which in turn activates the next droplet in the sequence, akin to living polymerization. This strategy provides time and logic control during the self-assembly process, and offers a new perspective on the synthesis of materials.

Sequential self-assembly of DNA functionalized droplets

DOE PAGES

Zhang, Yin; McMullen, Angus; Pontani, Lea-Laetitia; ...

2017-06-16

Complex structures and devices, both natural and manmade, are often constructed sequentially. From crystallization to embryogenesis, a nucleus or seed is formed and built upon. Sequential assembly allows for initiation, signaling, and logical programming, which are necessary for making enclosed, hierarchical structures. Though biology relies on such schemes, they have not been available in materials science. We demonstrate programmed sequential self-assembly of DNA functionalized emulsions. The droplets are initially inert because the grafted DNA strands are pre-hybridized in pairs. Active strands on initiator droplets then displace one of the paired strands and thus release its complement, which in turn activates the next droplet in the sequence, akin to living polymerization. This strategy provides time and logic control during the self-assembly process, and offers a new perspective on the synthesis of materials.
Improved Analysis of Nanopore Sequence Data and Scanning Nanopore Techniques

NASA Astrophysics Data System (ADS)

Szalay, Tamas

The field of nanopore research has been driven by the need to inexpensively and rapidly sequence DNA. In order to help realize this goal, this thesis describes the PoreSeq algorithm that identifies and corrects errors in real-world nanopore sequencing data and improves the accuracy of de novo genome assembly with increasing coverage depth. The approach relies on modeling the possible sources of uncertainty that occur as DNA advances through the nanopore and then using this model to find the sequence that best explains multiple reads of the same region of DNA. PoreSeq increases nanopore sequencing read accuracy of M13 bacteriophage DNA from 85% to 99% at 100X coverage. We also use the algorithm to assemble E. coli with 30X coverage and the lambda genome at a range of coverages from 3X to 50X. Additionally, we classify sequence variants at an order of magnitude lower coverage than is possible with existing methods. This thesis also reports preliminary progress towards controlling the motion of DNA using two nanopores instead of one. The speed at which the DNA travels through the nanopore needs to be carefully controlled to facilitate the detection of individual bases. A second nanopore in close proximity to the first could be used to slow or stop the motion of the DNA in order to enable a more accurate readout. The fabrication process for a new pyramidal nanopore geometry was developed in order to facilitate the positioning of the nanopores. This thesis demonstrates that two of them can be placed close enough to interact with a single molecule of DNA, which is a prerequisite for being able to use the driving force of the pores to exert fine control over the motion of the DNA. Another strategy for reading the DNA is to trap it completely with one pore and to move the second nanopore instead. To that end, this thesis also shows that a single strand of immobilized DNA can be captured in a scanning nanopore and examined for a full hour, with data from many scans at many different voltages obtained in order to detect a bound protein placed partway along the molecule.
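
The idea of finding the sequence that best explains multiple reads of the same region can be illustrated with a much simpler stand-in: a per-column majority vote over reads that are already aligned and of equal length. This is explicitly not the PoreSeq algorithm, which the abstract describes as model-based. The function name and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-position majority base of equal-length, pre-aligned reads.

    Ties are broken in favour of the base seen first in that column.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical noisy reads of the same region; three carry one error each.
reads = ["ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC", "ACGTAC"]
print(majority_consensus(reads))
```

A vote like this only handles substitutions in pre-aligned reads. Nanopore errors also include insertions and deletions, which is one reason a model of the sequencing process is used instead.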
Theory and modeling of particles with DNA-mediated interactions

NASA Astrophysics Data System (ADS)

Licata, Nicholas A.

In recent years significant attention has been attracted to proposals which utilize DNA for nanotechnological applications. Potential applications of these ideas range from the programmable self-assembly of colloidal crystals, to biosensors and nanoparticle based drug delivery platforms. In Chapter I we introduce the system, which generically consists of colloidal particles functionalized with specially designed DNA markers. The sequence of bases on the DNA markers determines the particle type. Due to the hybridization between complementary single-stranded DNA, specific, type-dependent interactions can be introduced between particles by choosing the appropriate DNA marker sequences. In Chapter II we develop a statistical mechanical description of the aggregation and melting behavior of particles with DNA-mediated interactions. A quantitative comparison between the theory and experiments is made by calculating the experimentally observed melting profile. In Chapter III a model is proposed to describe the dynamical departure and diffusion of particles which form reversible key-lock connections. The model predicts a crossover from localized to diffusive behavior. The random walk statistics for the particles' in-plane diffusion are discussed. The lateral motion is analogous to dispersive transport in disordered semiconductors, ranging from standard diffusion with a renormalized diffusion coefficient to anomalous, subdiffusive behavior. In Chapter IV we propose a method to self-assemble nanoparticle clusters using DNA scaffolds. An optimal concentration ratio is determined for the experimental implementation of our self-assembly proposal. A natural extension is discussed in Chapter V, the programmable self-assembly of nanoparticle clusters where the desired cluster geometry is encoded using DNA-mediated interactions. We determine the probability that the system self-assembles the desired cluster geometry, and discuss the connections to jamming in granular and colloidal systems. In Chapter VI we consider a nanoparticle based drug delivery platform for targeted, cell specific chemotherapy. A key-lock model is proposed to describe the results of in-vitro experiments, and the situation in-vivo is discussed. The cooperative binding, and hence the specificity to cancerous cells, is kinetically limited. The implications for optimizing the design of nanoparticle based drug delivery platforms are discussed. In Chapter VII we present prospects for future research: the connection between DNA-mediated colloidal crystallization and jamming, and the inverse problem in self-assembly.
Minimalist Approach to Complexity: Templating the Assembly of DNA Tile Structures with Sequentially Grown Input Strands.

PubMed

Lau, Kai Lin; Sleiman, Hanadi F

2016-07-26

Given its highly predictable self-assembly properties, DNA has proven to be an excellent template toward the design of functional materials. Prominent examples include the remarkable complexity provided by DNA origami and single-stranded tile (SST) assemblies, which require hundreds of unique component strands. However, in many cases, the majority of the DNA assembly is purely structural, and only a small "working area" needs to be aperiodic. On the other hand, extended lattices formed by DNA tile motifs require only a few strands; but they suffer from lack of size control and limited periodic patterning. To overcome these limitations, we adopt a templation strategy, where an input strand of DNA dictates the size and patterning of resultant DNA tile structures. To prepare these templating input strands, a sequential growth technique developed in our lab is used, whereby extended DNA strands of defined sequence and length may be generated simply by controlling their order of addition. With these, we demonstrate the periodic patterning of size-controlled double-crossover (DX) and triple-crossover (TX) tile structures, as well as intentionally designed aperiodicity of a DX tile structure. As such, we are able to prepare size-controlled DNA structures featuring aperiodicity only where necessary with exceptional economy and efficiency.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer.

PubMed

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph; Aury, Jean-Marc

2017-02-01

Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. © The Author 2017. Published by Oxford University Press.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer

PubMed Central

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph

2017-01-01

Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459
"The devil's in the detail": Release of an expanded, enhanced and dynamically revised forensic STR Sequence Guide.

PubMed

Phillips, C; Gettings, K Butler; King, J L; Ballard, D; Bodner, M; Borsuk, L; Parson, W

2018-05-01

The STR sequence template file published in 2016 as part of the considerations from the DNA Commission of the International Society for Forensic Genetics on minimal STR sequence nomenclature requirements, has been comprehensively revised and audited using the latest GRCh38 genome assembly. The list of forensic STRs characterized was expanded by including supplementary autosomal, X- and Y-chromosome microsatellites in less common use for routine DNA profiling, but some likely to be adopted in future massively parallel sequencing (MPS) STR panels. We outline several aspects of sequence alignment and annotation that required care and attention to detail when comparing sequences to GRCh37 and GRCh38 assemblies, as well as the necessary matching of MPS-based allele descriptions to previously established repeat region structures described in initial sequencing studies of the less well known forensic STRs. The revised sequence guide is now available in a dynamically updated FTP format from the STRidER website with a date-stamped change log to allow users to explore their own MPS data with the most up-to-date forensic STR sequence information compiled in a simple guide. Copyright © 2018 Elsevier B.V. All rights reserved.
MIDAS: A Modular DNA Assembly System for Synthetic Biology.

PubMed

van Dolleweerd, Craig J; Kessans, Sarah A; Van de Bittner, Kyle C; Bustamante, Leyla Y; Bundela, Rudranuj; Scott, Barry; Nicholson, Matthew J; Parker, Emily J

2018-04-20

A modular and hierarchical DNA assembly platform for synthetic biology based on Golden Gate (Type IIS restriction enzyme) cloning is described. This enabling technology, termed MIDAS (for Modular Idempotent DNA Assembly System), can be used to precisely assemble multiple DNA fragments in a single reaction using a standardized assembly design. It can be used to build genes from libraries of sequence-verified, reusable parts and to assemble multiple genes in a single vector, with full user control over gene order and orientation, as well as control of the direction of growth (polarity) of the multigene assembly, a feature that allows genes to be nested between other genes or genetic elements. We describe the detailed design and use of MIDAS, exemplified by the reconstruction, in the filamentous fungus Penicillium paxilli, of the metabolic pathway for production of paspaline and paxilline, key intermediates in the biosynthesis of a range of indole diterpenes-a class of secondary metabolites produced by several species of filamentous fungi. MIDAS was used to efficiently assemble a 25.2 kb plasmid from 21 different modules (seven genes, each composed of three basic parts). By using a parts library-based system for construction of complex assemblies, and a unique set of vectors, MIDAS can provide a flexible route to assembling tailored combinations of genes and other genetic elements, thereby supporting synthetic biology applications in a wide range of expression hosts.
Self-assembled bionanostructures: proteins following the lead of DNA nanostructures

PubMed Central

2014-01-01

Natural polymers are able to self-assemble into versatile nanostructures based on the information encoded into their primary structure. The structural richness of biopolymer-based nanostructures depends on the information content of building blocks and the available biological machinery to assemble and decode polymers with a defined sequence. Natural polypeptides comprise 20 amino acids with very different properties in comparison to only 4 structurally similar nucleotides, building elements of nucleic acids. Nevertheless the ease of synthesizing polynucleotides with selected sequence and the ability to encode the nanostructural assembly based on the two specific nucleotide pairs underlay the development of techniques to self-assemble almost any selected three-dimensional nanostructure from polynucleotides. Despite more complex design rules, peptides were successfully used to assemble symmetric nanostructures, such as fibrils and spheres. While earlier designed protein-based nanostructures used linked natural oligomerizing domains, recent design of new oligomerizing interaction surfaces and introduction of the platform for topologically designed protein fold may enable polypeptide-based design to follow the track of DNA nanostructures. The advantages of protein-based nanostructures, such as the functional versatility and cost effective and sustainable production methods provide strong incentive for further development in this direction. PMID:24491139
DNA-programmable multiplexing for scalable, renewable redox protein bio-nanoelectronics.

PubMed

Withey, Gary D; Kim, Jin Ho; Xu, Jimmy

2008-11-01

A universal, site-addressable DNA linking strategy is deployed for the programmable assembly of multifunctional, long-lasting redox protein nanoelectronic devices. This addressable linker, the first incorporated into a redox enzyme-nanoelectronic system, promotes versatility and renewability by allowing the reconfiguration and replacement of enzymes at will. The linker is transferable to all redox proteins due to the simple conjugation chemistry involved. The efficacy of this linking strategy is assessed using two model enzymes, glucose oxidase (GOx) and alcohol dehydrogenase (ADH), self-assembled onto separate nanoelectrode regions comprised of a highly ordered carbon nanotube (CNT) array. The sequence-specificity of DNA hybridization provides the means of encoding spatial address to the self-assembling process that conjugates enzymes tagged with single-stranded DNA (ssDNA) to the tips of designated CNTs functionalized with the complementary strands. In this study, we demonstrate the feasibility of multiplexed, scalable, reconfigurable and renewable transduction of redox protein signals by virtue of DNA addressing.
Rapid self-assembly of DNA on a microfluidic chip

PubMed Central

Zheng, Yao; Footz, Tim; Manage, Dammika P; Backhouse, Christopher James

2005-01-01

Background: DNA self-assembly methods have played a major role in enabling methods for acquiring genetic information without having to resort to sequencing, a relatively slow and costly procedure. However, even self-assembly processes tend to be very slow when they rely upon diffusion on a large scale. Miniaturisation and integration therefore hold the promise of greatly increasing this speed of operation. Results: We have developed a rapid method for implementing the self-assembly of DNA within a microfluidic system by electrically extracting the DNA from an environment containing an uncharged denaturant. By controlling the parameters of the electrophoretic extraction and subsequent analysis of the DNA we are able to control when the hybridisation occurs as well as the degree of hybridisation. By avoiding off-chip processing or long thermal treatments we are able to perform this hybridisation rapidly and can perform hybridisation, sizing, heteroduplex analysis and single-stranded conformation analysis within a matter of minutes. The rapidity of this analysis allows the sampling of transient effects that may improve the sensitivity of mutation detection. Conclusions: We believe that this method will aid the integration of self-assembly methods upon microfluidic chips. The speed of this analysis also appears to provide information upon the dynamics of the self-assembly process. PMID:15717935
Blueprints for green biotech: development and application of standards for plant synthetic biology.

PubMed

Patron, Nicola J

2016-06-15

Synthetic biology aims to apply engineering principles to the design and modification of biological systems and to the construction of biological parts and devices. The ability to programme cells by providing new instructions written in DNA is a foundational technology of the field. Large-scale de novo DNA synthesis has accelerated synthetic biology by offering custom-made molecules at ever decreasing costs. However, for large fragments and for experiments in which libraries of DNA sequences are assembled in different combinations, assembly in the laboratory is still desirable. Biological assembly standards allow DNA parts, even those from multiple laboratories and experiments, to be assembled together using the same reagents and protocols. The adoption of such standards for plant synthetic biology has been cohesive for the plant science community, facilitating the application of genome editing technologies to plant systems and streamlining progress in large-scale, multi-laboratory bioengineering projects. © 2016 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Mechanical Response of DNA–Nanoparticle Crystals to Controlled Deformation

DOE PAGES

Lequieu, Joshua; Córdoba, Andrés; Hinckley, Daniel; ...

2016-08-17

The self-assembly of DNA-conjugated nanoparticles represents a promising avenue toward the design of engineered hierarchical materials. By using DNA to encode nanoscale interactions, macroscale crystals can be formed with mechanical properties that can, at least in principle, be tuned. Here we present in silico evidence that the mechanical response of these assemblies can indeed be controlled, and that subtle modifications of the linking DNA sequences can change the Young’s modulus from 97 kPa to 2.1 MPa. We rely on a detailed molecular model to quantify the energetics of DNA–nanoparticle assembly and demonstrate that the mechanical response is governed by entropic, rather than enthalpic, contributions and that the response of the entire network can be estimated from the elastic properties of an individual nanoparticle. The results here provide a first step toward the mechanical characterization of DNA–nanoparticle assemblies, and suggest the possibility of mechanical metamaterials constructed using DNA.
Mechanical Response of DNA–Nanoparticle Crystals to Controlled Deformation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lequieu, Joshua; Córdoba, Andrés; Hinckley, Daniel

The self-assembly of DNA-conjugated nanoparticles represents a promising avenue toward the design of engineered hierarchical materials. By using DNA to encode nanoscale interactions, macroscale crystals can be formed with mechanical properties that can, at least in principle, be tuned. Here we present in silico evidence that the mechanical response of these assemblies can indeed be controlled, and that subtle modifications of the linking DNA sequences can change the Young’s modulus from 97 kPa to 2.1 MPa. We rely on a detailed molecular model to quantify the energetics of DNA–nanoparticle assembly and demonstrate that the mechanical response is governed by entropic, rather than enthalpic, contributions and that the response of the entire network can be estimated from the elastic properties of an individual nanoparticle. The results here provide a first step toward the mechanical characterization of DNA–nanoparticle assemblies, and suggest the possibility of mechanical metamaterials constructed using DNA.
Facilitated sequence counting and assembly by template mutagenesis

PubMed Central

Levy, Dan; Wigler, Michael

2014-01-01

Presently, inferring the long-range structure of the DNA templates is limited by short read lengths. Accurate template counts suffer from distortions occurring during PCR amplification. We explore the utility of introducing random mutations in identical or nearly identical templates to create distinguishable patterns that are inherited during subsequent copying. We simulate the applications of this process under assumptions of error-free sequencing and perfect mapping, using cytosine deamination as a model for mutation. The simulations demonstrate that within readily achievable conditions of nucleotide conversion and sequence coverage, we can accurately count the number of otherwise identical molecules as well as connect variants separated by long spans of identical sequence. We discuss many potential applications, such as transcript profiling, isoform assembly, haplotype phasing, and de novo genome assembly. PMID:25313059
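The counting idea in this abstract can be illustrated with a minimal simulation sketch. It assumes, as the abstract does, error-free sequencing, and models deamination as an independent C-to-T conversion at each cytosine. The function names, the toy template, and the conversion rate are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def deaminate(template, rate, rng):
    """Convert each C to T independently with probability `rate`."""
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in template
    )


def count_distinct_patterns(template, n_molecules, rate, seed=0):
    """Mutate n identical templates and count the distinguishable patterns.

    With error-free reads, the number of distinct mutation patterns is a
    lower bound on the number of original molecules.
    """
    rng = random.Random(seed)
    mutated = [deaminate(template, rate, rng) for _ in range(n_molecules)]
    return len(set(mutated))


if __name__ == "__main__":
    # Hypothetical toy template containing 50 cytosines.
    template = "ACGTCCGATCCA" * 10
    for rate in (0.0, 0.05, 0.5):
        print(rate, count_distinct_patterns(template, 50, rate))
```

Without mutation every copy is identical and the count collapses to one. As the conversion rate rises, the count approaches the true number of molecules, which is the effect the abstract relies on.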
Gigadalton-scale shape-programmable DNA assemblies

NASA Astrophysics Data System (ADS)

Wagenbauer, Klaus F.; Sigl, Christian; Dietz, Hendrik

2017-12-01

Natural biomolecular assemblies such as molecular motors, enzymes, viruses and subcellular structures often form by self-limiting hierarchical oligomerization of multiple subunits. Large structures can also assemble efficiently from a few components by combining hierarchical assembly and symmetry, a strategy exemplified by viral capsids. De novo protein design and RNA and DNA nanotechnology aim to mimic these capabilities, but the bottom-up construction of artificial structures with the dimensions and complexity of viruses and other subcellular components remains challenging. Here we show that natural assembly principles can be combined with the methods of DNA origami to produce gigadalton-scale structures with controlled sizes. DNA sequence information is used to encode the shapes of individual DNA origami building blocks, and the geometry and details of the interactions between these building blocks then control their copy numbers, positions and orientations within higher-order assemblies. We illustrate this strategy by creating planar rings of up to 350 nanometres in diameter and with atomic masses of up to 330 megadaltons, micrometre-long, thick tubes commensurate in size to some bacilli, and three-dimensional polyhedral assemblies with sizes of up to 1.2 gigadaltons and 450 nanometres in diameter. We achieve efficient assembly, with yields of up to 90 per cent, by using building blocks with validated structure and sufficient rigidity, and an accurate design with interaction motifs that ensure that hierarchical assembly is self-limiting and able to proceed in equilibrium to allow for error correction. We expect that our method, which enables the self-assembly of structures with sizes approaching that of viruses and cellular organelles, can readily be used to create a range of other complex structures with well defined sizes, by exploiting the modularity and high degree of addressability of the DNA origami building blocks used.
Gigadalton-scale shape-programmable DNA assemblies.

PubMed

Wagenbauer, Klaus F; Sigl, Christian; Dietz, Hendrik

2017-12-06

Natural biomolecular assemblies such as molecular motors, enzymes, viruses and subcellular structures often form by self-limiting hierarchical oligomerization of multiple subunits. Large structures can also assemble efficiently from a few components by combining hierarchical assembly and symmetry, a strategy exemplified by viral capsids. De novo protein design and RNA and DNA nanotechnology aim to mimic these capabilities, but the bottom-up construction of artificial structures with the dimensions and complexity of viruses and other subcellular components remains challenging. Here we show that natural assembly principles can be combined with the methods of DNA origami to produce gigadalton-scale structures with controlled sizes. DNA sequence information is used to encode the shapes of individual DNA origami building blocks, and the geometry and details of the interactions between these building blocks then control their copy numbers, positions and orientations within higher-order assemblies. We illustrate this strategy by creating planar rings of up to 350 nanometres in diameter and with atomic masses of up to 330 megadaltons, micrometre-long, thick tubes commensurate in size to some bacilli, and three-dimensional polyhedral assemblies with sizes of up to 1.2 gigadaltons and 450 nanometres in diameter. We achieve efficient assembly, with yields of up to 90 per cent, by using building blocks with validated structure and sufficient rigidity, and an accurate design with interaction motifs that ensure that hierarchical assembly is self-limiting and able to proceed in equilibrium to allow for error correction. We expect that our method, which enables the self-assembly of structures with sizes approaching that of viruses and cellular organelles, can readily be used to create a range of other complex structures with well defined sizes, by exploiting the modularity and high degree of addressability of the DNA origami building blocks used.
What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual

USDA-ARS's Scientific Manuscript database

BACKGROUND: Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain u...
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Enhanced sequencing coverage with digital droplet multiple displacement amplification

PubMed Central

Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

2016-01-01

Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
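The quoted mass of a single-cell genome can be sanity-checked with a short calculation. The sketch below is a rough estimate only: the genome size of about 4.6 Mbp and the average of about 650 g/mol per base pair are assumptions supplied here, not values from the abstract, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def genome_mass_fg(n_base_pairs, g_per_mol_per_bp=650.0):
    """Approximate mass in femtograms of one double-stranded DNA molecule."""
    grams = n_base_pairs * g_per_mol_per_bp / AVOGADRO
    return grams * 1e15  # 1 g = 1e15 fg


if __name__ == "__main__":
    # Assumed E. coli genome size of about 4.6 million base pairs.
    print(f"{genome_mass_fg(4.6e6):.2f} fg")
```

Under these assumptions the estimate comes out at roughly 5 fg, the same order as the 4.7 fg quoted in the abstract; the exact figure depends on the per-base-pair mass used.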

DNA tetrominoes: the construction of DNA nanostructures using self-organised heterogeneous deoxyribonucleic acids shapes.

PubMed

Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan

2015-01-01

The unique programmability of nucleic acids offers an alternative for constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the rich number of sequence combinations available from the nucleic acid alphabets, thus allowing for diversity to be applied in designing various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structure. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be exerted indefinitely on the candidate sequences until these sequences comply with all four fitness criteria. Generated candidates from the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes, thus supporting the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.
BioBrick assembly standards and techniques and associated software tools.

PubMed

Røkke, Gunvor; Korvald, Eirin; Pahr, Jarle; Oyås, Ove; Lale, Rahmi

2014-01-01

The BioBrick idea was developed to introduce the engineering principles of abstraction and standardization into synthetic biology. BioBricks are DNA sequences that serve a defined biological function and can be readily assembled with any other BioBrick parts to create new BioBricks with novel properties. In order to achieve this, several assembly standards can be used. Which assembly standards a BioBrick is compatible with depends on the prefix and suffix sequences surrounding the part. In this chapter, five of the most common assembly standards will be described, as well as some of the most used assembly techniques, cloning procedures, and a presentation of the available software tools that can be used for deciding on the best method for assembling different BioBricks, and searching for BioBrick parts in the Registry of Standard Biological Parts database.
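Compatibility with an assembly standard is, in practice, partly a matter of which restriction sites a part contains. The sketch below assumes the widely used RFC 10 standard, which the abstract does not name, and checks a part for the EcoRI, XbaI, SpeI, PstI and NotI recognition sites that RFC 10 reserves for the prefix and suffix. The function name and the example sequence are hypothetical.

```python
# Recognition sites reserved by BioBrick RFC 10 (assumed standard).
# All five are palindromic, so scanning one strand covers both.
RFC10_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
}


def find_illegal_sites(part):
    """Return (enzyme, 0-based position) for each reserved site in `part`."""
    seq = part.upper()
    hits = []
    for enzyme, site in RFC10_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)  # allow overlapping matches
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical part with an internal EcoRI site and an internal PstI site.
    print(find_illegal_sites("ATGGAATTCAAACTGCAGTAA"))
```

A part that returns an empty list is free of these sites; any hit would normally have to be removed by mutation before the part could be used under this standard.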
Nanomaterials Based on DNA

PubMed Central

Seeman, Nadrian C.

2012-01-01

The combination of synthetic stable branched DNA and sticky ended cohesion has led to the development of structural DNA nanotechnology over the past 30 years. The basis of this enterprise is that it is possible to construct novel DNA-based materials by combining these features in a self-assembly protocol. Thus, simple branched molecules lead directly to the construction of polyhedra whose edges consist of double helical DNA, and whose vertices correspond to the branch points. Stiffer branched motifs can be used to produce self-assembled two-dimensional and three-dimensional periodic lattices of DNA (crystals). DNA has also been used to make a variety of nanomechanical devices, including molecules that change their shapes, and molecules that can walk along a DNA sidewalk. Devices have been incorporated into two-dimensional DNA arrangements; sequence-dependent devices are driven by increases in nucleotide pairing at each step in their machine cycles. PMID:20222824
Detection of Low-Copy-Number Genomic DNA Sequences in Individual Bacterial Cells by Using Peptide Nucleic Acid-Assisted Rolling-Circle Amplification and Fluorescence In Situ Hybridization

PubMed Central

Smolina, Irina; Lee, Charles; Frank-Kamenetskii, Maxim

2007-01-01

An approach is proposed for in situ detection of short signature DNA sequences present in single copies per bacterial genome. The site is locally opened by peptide nucleic acids, and a circular oligonucleotide is assembled. The amplicon generated by rolling circle amplification is detected by hybridization with fluorescently labeled decorator probes. PMID:17293504
An end-to-end workflow for engineering of biological networks from high-level specifications.

PubMed

Beal, Jacob; Weiss, Ron; Densmore, Douglas; Adler, Aaron; Appleton, Evan; Babb, Jonathan; Bhatia, Swapnil; Davidsohn, Noah; Haddock, Traci; Loyall, Joseph; Schantz, Richard; Vasilev, Viktor; Yaman, Fusun

2012-08-17

We present a workflow for the design and production of biological networks from high-level program specifications. The workflow is based on a sequence of intermediate models that incrementally translate high-level specifications into DNA samples that implement them. We identify algorithms for translating between adjacent models and implement them as a set of software tools, organized into a four-stage toolchain: Specification, Compilation, Part Assignment, and Assembly. The specification stage begins with a Boolean logic computation specified in the Proto programming language. The compilation stage uses a library of network motifs and cellular platforms, also specified in Proto, to transform the program into an optimized Abstract Genetic Regulatory Network (AGRN) that implements the programmed behavior. The part assignment stage assigns DNA parts to the AGRN, drawing the parts from a database for the target cellular platform, to create a DNA sequence implementing the AGRN. Finally, the assembly stage computes an optimized assembly plan to create the DNA sequence from available part samples, yielding a protocol for producing a sample of engineered plasmids with robotics assistance. Our workflow is the first to automate the production of biological networks from a high-level program specification. Furthermore, the workflow's modular design allows the same program to be realized on different cellular platforms simply by swapping workflow configurations. We validated our workflow by specifying a small-molecule sensor-reporter program and verifying the resulting plasmids in both HEK 293 mammalian cells and in E. coli bacterial cells.
Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum.

PubMed

VanBuren, Robert; Bryant, Doug; Edger, Patrick P; Tang, Haibao; Burgess, Diane; Challabathula, Dinakar; Spittle, Kristi; Hall, Richard; Gu, Jenny; Lyons, Eric; Freeling, Michael; Bartels, Dorothea; Ten Hallers, Boudewijn; Hastie, Alex; Michael, Todd P; Mockler, Todd C

2015-11-26

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
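The N50 statistic quoted in this abstract can be computed with a few lines of code. The sketch below uses the common definition: the length L such that contigs of length at least L together hold at least half of the assembled bases. The contig lengths are a hypothetical toy example, not data from the Oropetium assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths
    print(n50(toy_contigs))  # prints 70
```

In the toy example the total is 300 bases, and the two longest contigs (80 + 70) already reach half of it, so the N50 is 70.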
DNA mimic proteins: functions, structures, and bioinformatic analysis.

PubMed

Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

2014-05-13

DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.
Dual-colored graphene quantum dots-labeled nanoprobes/graphene oxide: functional carbon materials for respective and simultaneous detection of DNA and thrombin

NASA Astrophysics Data System (ADS)

Qian, Zhao Sheng; Shan, Xiao Yue; Chai, Lu Jing; Chen, Jian Rong; Feng, Hui

2014-10-01

Convenient and simultaneous detection of multiple biomarkers such as DNA and proteins with biocompatible materials and good analytical performance still remains a challenge. Herein, we report the respective and simultaneous detection of DNA and bovine α-thrombin (thrombin) entirely based on biocompatible carbon materials through a specially designed fluorescence on-off-on process. Colorful fluorescence, high emission efficiency, good photostability and excellent compatibility make graphene quantum dots (GQDs) the best choice of fluorophore in bioprobes, and thus two-colored GQDs as labeling fluorophores were chemically bonded with a specific oligonucleotide sequence and an aptamer to prepare two probes targeting the DNA and thrombin, respectively. Each probe can be assembled on the graphene oxide (GO) platform spontaneously by π-π stacking and electrostatic attraction; as a result, fast electron transfer in the assembly efficiently quenches the fluorescence of the probe. The presence of DNA or thrombin can trigger the self-recognition between the capturing nucleotide sequence and its target DNA or between thrombin and its aptamer due to their specific hybridization and duplex DNA structures or the formation of an aptamer-substrate complex, which is taken advantage of in order to achieve a separate quantitative analysis of DNA and thrombin. A dual-functional biosensor for simultaneous detection of DNA and thrombin was also constructed by self-assembly of two probes with distinct colors and the GO platform, and was further evaluated in the presence of various concentrations of DNA and thrombin. Both biosensors, serving as a general detection model for multiple species, exhibit outstanding analytical performance, and are expected to be applied in vivo because of the excellent biocompatibility of the materials used.
Potential benefits from using a new reference map in genomic prediction

USDA-ARS's Scientific Manuscript database

Many genomic studies in cattle have used the 2009 reference assembly from the University of Maryland (UMD3.1). A new USDA Agricultural Research Service-University of California, Davis (ARS-UCD) assembly based on longer DNA reads from the same cow (Dominette) should improve sequence alignment, imputa...
An innovative platform for quick and flexible joining of assorted DNA fragments

DOE PAGES

De Paoli, Henrique Cestari; Tuskan, Gerald A.; Yang, Xiaohan

2016-01-13

Successful synthetic biology efforts rely on conceptual and experimental designs in combination with testing of multi-gene constructs. Despite recent progress, several limitations still hinder the ability to flexibly assemble and collectively share different types of DNA segments. We describe an advanced system for joining DNA fragments from a universal library that automatically maintains open reading frames (ORFs) and does not require linkers, adaptors, sequence homology, amplification or mutation (domestication) of fragments in order to work properly. Moreover, we find that this system, which is enhanced by a unique buffer formulation, provides unforeseen capabilities for testing, and sharing, complex multi-gene circuitry assembled from different DNA fragments.
The Genome Sequence of a Widespread Apex Predator, the Golden Eagle (Aquila chrysaetos)

PubMed Central

Doyle, Jacqueline M.; Katzner, Todd E.; Bloom, Peter H.; Ji, Yanzhu; Wijayawardena, Bhagya K.; DeWoody, J. Andrew

2014-01-01

Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male golden eagle (Aquila chrysaetos) captured in western North America. We constructed genomic libraries that were sequenced using Illumina technology and assembled the high-quality data to a depth of ∼40x coverage. The genome assembly includes 2,552 scaffolds >10 Kb and 415 scaffolds >1.2 Mb. We annotated 16,571 genes that are involved in myriad biological processes, including such disparate traits as beak formation and color vision. We also identified repetitive regions spanning 92 Mb (∼6% of the assembly), including LINES, SINES, LTR-RTs and DNA transposons. The mitochondrial genome encompasses 17,332 bp and is ∼91% identical to the Mountain Hawk-Eagle (Nisaetus nipalensis). Finally, the data reveal that several anonymous microsatellites commonly used for population studies are embedded within protein-coding genes and thus may not have evolved in a neutral fashion. Because the genome sequence includes ∼800,000 novel polymorphisms, markers can now be chosen based on their proximity to functional genes involved in migration, carnivory, and other biological processes. PMID:24759626
Sequencing and comparing whole mitochondrial genomes of animals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

2005-04-22

Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.
Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale.

PubMed

Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun

2015-01-01

Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.
A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences

NASA Technical Reports Server (NTRS)

Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.

1986-01-01

The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
A novel nonenzymatic cascade amplification for ultrasensitive photoelectrochemical DNA sensing based on target driven to initiate cyclic assembly of hairpins.

PubMed

Wen, Guangming; Dong, Wenxia; Liu, Bin; Li, Zhongping; Fan, Lifang

2018-05-29

A novel cascade photoelectrochemical (PEC) signal amplification biosensing strategy was developed for DNA detection based on a target-driven DNA association that induces cyclic hairpin assembly. The cyclic system contains two ssDNA strands (A and B) and two hairpins (C and D). The hybridization of these ssDNA strands led to the formation of an A-target-B structure. The close proximity of their toehold and branch-migration regions was able to induce the cyclic hairpin assembly. The assembly product then caused the separation of a double-stranded probe DNA (Q:F) via toehold-mediated strand displacement, switching the PEC signal. As such, the signal strand DNA-CdS QDs (F), which serves as the signal tag, was released in the presence of the target DNA. The released DNA-CdS QDs were then coated onto an F-doped tin oxide (FTO) electrode, producing a "signal-on" PEC response. The designed biosensing strategy showed a low detection limit of 21.3 pM for target DNA and a broad linear range from 50 pM to 100 nM. This signal amplification PEC sensing method could potentially be applied to detect protein molecules, RNA or metal ions by changing the recognition sequences of A and B. Copyright © 2018 Elsevier B.V. All rights reserved.
Comparison of next generation sequencing technologies for transcriptome characterization

PubMed Central

2009-01-01

Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. 
In terms of sequence coverage alone, the NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. PMID:19646272
Programmed self-assembly of DNA/RNA for biomedical applications

NASA Astrophysics Data System (ADS)

Wang, Pengfei

Three self-assembly strategies were utilized for assembly of novel functional DNA/RNA nanostructures. An RNA-DNA hybrid origami method was developed to fabricate nano-objects (ribbon, rectangle, and triangle) with precisely controlled geometry. Unlike conventional DNA origami, which uses a long DNA single strand as the scaffold, a long RNA single strand was used instead, which was folded by short DNA single strands (staples) into prescribed objects through sequence-specific hybridization between RNA and DNA. Single-stranded tiles (SST) and RNA-DNA hybrid origami were utilized to fabricate a variety of barcode-like nanostructures with unique patterns by expanding a plain rectangle via introducing spacers (10-bp dsDNA segments) between parallel duplexes. Finally, complex 2D arrays and 3D polyhedra with multiple patterns within one structure were assembled from simple DNA motifs. Two demonstrations of biomedical applications of DNA nanotechnology were presented. Firstly, lambda-DNA was used as a template to direct the fabrication of multi-component magnetic nanoparticle chains. Nuclear magnetic relaxation (NMR) characterization showed superb magnetic relaxivity of the nanoparticle chains, which have large potential to be utilized as MRI contrast agents. Secondly, DNA nanotechnology was introduced into the conformational study of a routinely used catalytic DNAzyme, the RNA-cleaving 10-23 DNAzyme. The relative angle between the two flanking duplexes of the catalytic core was determined (94.8°), which may provide a clue to further understanding of the cleaving mechanism of this DNAzyme from a conformational perspective.
BioPartsBuilder: a synthetic biology tool for combinatorial assembly of biological parts.

PubMed

Yang, Kun; Stracquadanio, Giovanni; Luo, Jingchuan; Boeke, Jef D; Bader, Joel S

2016-03-15

Combinatorial assembly of DNA elements is an efficient method for building large-scale synthetic pathways from standardized, reusable components. These methods are particularly useful because they enable assembly of multiple DNA fragments in one reaction, at the cost of requiring that each fragment satisfies design constraints. We developed BioPartsBuilder as a biologist-friendly web tool to design biological parts that are compatible with DNA combinatorial assembly methods, such as Golden Gate and related methods. It retrieves biological sequences, enforces compliance with assembly design standards and provides a fabrication plan for each fragment. BioPartsBuilder is accessible at http://public.biopartsbuilder.org and an Amazon Web Services image is available from the AWS Market Place (AMI ID: ami-508acf38). Source code is released under the MIT license, and available for download at https://github.com/baderzone/biopartsbuilder. Contact: joel.bader@jhu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
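The kind of design constraint mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that compliance means a part contains no internal copy of a Type IIS recognition site on either strand; the BsaI site GGTCTC is used as the example, and the function names are hypothetical and are not part of BioPartsBuilder.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_internal_sites(seq, site="GGTCTC"):
    """Return sorted 0-based positions of the site on either strand.

    The default site (BsaI, GGTCTC) is an assumption made for this sketch;
    a real design standard may forbid additional sites.
    """
    seq = seq.upper()
    hits = []
    # A set avoids double counting when the site is its own reverse complement.
    for motif in {site.upper(), reverse_complement(site)}:
        start = seq.find(motif)
        while start != -1:
            hits.append(start)
            start = seq.find(motif, start + 1)
    return sorted(hits)


def is_compatible(seq, site="GGTCTC"):
    """A part passes this toy check if it carries no internal site."""
    return not find_internal_sites(seq, site)
```

A part that fails the check would need the offending site removed, for example by a synonymous codon change, before it could be used in a one-pot reaction with that enzyme.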
Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the “Finished” C. elegans Genome

PubMed Central

Li, Runsheng; Hsieh, Chia-Ling; Young, Amanda; Zhang, Zhihong; Ren, Xiaoliang; Zhao, Zhongying

2015-01-01

Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominantly around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects using the isogenic, gap-free C. elegans genome. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp of missing genomic sequence is recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24× reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithms for the long reads. PMID:26039588
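For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of the usual definition: the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The function name is hypothetical, and the record itself does not say which tool computed its value.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total
        # defines the N50.
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")
```

Because N50 reflects contiguity only, a high value can coexist with mis-assemblies, which is the caveat the abstract raises.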
A memory-efficient data structure representing exact-match overlap graphs with application for next-generation DNA assembly.

PubMed

Dinh, Hieu; Rajasekaran, Sanguthevar

2011-07-15

Exact-match overlap graphs have been broadly used in the context of DNA assembly and the shortest superstring problem where the number of strings n ranges from thousands to billions. The length ℓ of the strings is from 25 to 1000, depending on the DNA sequencing technologies. However, many DNA assemblers using overlap graphs suffer from the need for too much time and space in constructing the graphs. It is nearly impossible for these DNA assemblers to handle the huge amount of data produced by the next-generation sequencing technologies where the number n of strings could be several billions. If the overlap graph is explicitly stored, it would require Ω(n²) memory, which could be prohibitive in practice when n is greater than a hundred million. In this article, we propose a novel data structure using which the overlap graph can be compactly stored. This data structure requires only linear time to construct and linear memory to store. For a given set of input strings (also called reads), we can informally define an exact-match overlap graph as follows. Each read is represented as a node in the graph and there is an edge between two nodes if the corresponding reads overlap sufficiently. A formal description follows. The maximal exact-match overlap of two strings x and y, denoted by ov_max(x, y), is the longest string which is a suffix of x and a prefix of y. The exact-match overlap graph of n given strings of length ℓ is an edge-weighted graph in which each vertex is associated with a string and there is an edge (x, y) of weight ω = ℓ - |ov_max(x, y)| if and only if ω ≤ λ, where |ov_max(x, y)| is the length of ov_max(x, y) and λ is a given threshold. In this article, we show that the exact-match overlap graphs can be represented by a compact data structure that can be stored using at most (2λ-1)(2⌈log n⌉+⌈log λ⌉)n bits with a guarantee that the basic operation of accessing an edge takes O(log λ) time. 
We also propose two algorithms for constructing the data structure for the exact-match overlap graph. The first algorithm runs in O(λℓn log n) worst-case time and requires O(λ) extra memory. The second one runs in O(λℓn) time and requires O(n) extra memory. Our experimental results on a huge amount of simulated data from sequence assembly show that the data structure can be constructed efficiently in time and memory. Our DNA sequence assembler that incorporates the data structure is freely available on the web at http://www.engr.uconn.edu/~htd06001/assembler/leap.zip
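The formal definition in this abstract can be made concrete with a deliberately naive sketch. It stores the graph explicitly and compares every ordered pair of reads, so it is the quadratic baseline the authors set out to avoid, not their compact data structure. The function names are hypothetical, and skipping self-pairs is an assumption made here for simplicity.

```python
def max_overlap(x, y):
    """Length of the longest string that is a suffix of x and a prefix of y."""
    for k in range(min(len(x), len(y)), 0, -1):
        if x.endswith(y[:k]):
            return k
    return 0


def overlap_graph(reads, lam):
    """Return {(x, y): weight} for all edges with weight <= lam.

    All reads must share one length ell; the weight of edge (x, y) is
    ell - |ov_max(x, y)|, as in the definition above.
    """
    ell = len(reads[0])
    if any(len(read) != ell for read in reads):
        raise ValueError("all reads must have the same length")
    edges = {}
    for i, x in enumerate(reads):
        for j, y in enumerate(reads):
            if i == j:
                continue  # assumption: a read is not linked to itself
            weight = ell - max_overlap(x, y)
            if weight <= lam:
                edges[(x, y)] = weight
    return edges
```

With reads `["ACGTA", "GTACC", "TACCG"]` and λ = 2, only the two strongest overlaps survive as edges, with weights 2 and 1. The nested loop over all pairs is what makes this approach impractical at the scale of billions of reads.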

Pydna: a simulation and documentation tool for DNA assembly strategies using python.

PubMed

Pereira, Filipa; Azevedo, Flávio; Carvalho, Ângela; Ribeiro, Gabriela F; Budde, Mark W; Johansson, Björn

2015-05-02

Recent advances in synthetic biology have provided tools to efficiently construct complex DNA molecules which are an important part of many molecular biology and biotechnology projects. The planning of such constructs has traditionally been done manually using a DNA sequence editor which becomes error-prone as scale and complexity of the construction increase. A human-readable formal description of cloning and assembly strategies, which also allows for automatic computer simulation and verification, would therefore be a valuable tool. We have developed pydna, an extensible, free and open source Python library for simulating basic molecular biology DNA unit operations such as restriction digestion, ligation, PCR, primer design, Gibson assembly and homologous recombination. A cloning strategy expressed as a pydna script provides a description that is complete, unambiguous and stable. Execution of the script automatically yields the sequence of the final molecule(s) and that of any intermediate constructs. Pydna has been designed to be understandable for biologists with limited programming skills by providing interfaces that are semantically similar to the description of molecular biology unit operations found in literature. Pydna simplifies both the planning and sharing of cloning strategies and is especially useful for complex or combinatorial DNA molecule construction. An important difference compared to existing tools with similar goals is the use of Python instead of a specifically constructed language, providing a simulation environment that is more flexible and extensible by the user.
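The idea of expressing an assembly strategy as an executable script can be sketched in plain Python. The example below does not use pydna or reproduce its API; it is a hypothetical, single-strand simplification of homology-based joining in which consecutive fragments are merged wherever the end of one matches the start of the next, with an assumed minimum overlap.

```python
def join_by_homology(fragments, min_overlap=4):
    """Join ordered linear fragments through shared terminal sequence.

    Each fragment must begin with at least min_overlap bases that match
    the end of the growing product; the shared region is kept once.
    """
    product = fragments[0]
    for nxt in fragments[1:]:
        longest = min(len(product), len(nxt))
        for k in range(longest, min_overlap - 1, -1):
            if product.endswith(nxt[:k]):
                product += nxt[k:]
                break
        else:
            # No sufficiently long shared end: the plan cannot be assembled.
            raise ValueError(f"no overlap of at least {min_overlap} bases before {nxt!r}")
    return product
```

Running the function on a planned fragment order either yields the expected product sequence or fails loudly, which is the verification benefit the abstract describes for scripted strategies.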
Lanthanum induced B-to-Z transition in self-assembled Y-shaped branched DNA structure

PubMed Central

Nayak, Ashok K.; Mishra, Aseem; Jena, Bhabani S.; Mishra, Barada K.; Subudhi, Umakanta

2016-01-01

Controlled conversion of right-handed B-DNA to left-handed Z-DNA is one of the greatest conformational transitions in biology. Recently, the B-Z transition has been explored from nanotechnological points of view and used as the driving machinery of many nanomechanical devices. Using a combination of CD spectroscopy, fluorescence spectroscopy, and PAGE, we demonstrate that low concentration of lanthanum chloride can mediate B-to-Z transition in self-assembled Y-shaped branched DNA (bDNA) structure. The transition is sensitive to the sequence and structure of the bDNA. Thermal melting and competitive dye binding experiments suggest that La3+ ions are loaded to the major and minor grooves of DNA and stabilize the Z-conformation. Our studies also show that EDTA and EtBr play an active role in reversing the transition from Z-to-B DNA. PMID:27241949
Lanthanum induced B-to-Z transition in self-assembled Y-shaped branched DNA structure

NASA Astrophysics Data System (ADS)

Nayak, Ashok K.; Mishra, Aseem; Jena, Bhabani S.; Mishra, Barada K.; Subudhi, Umakanta

2016-05-01

Controlled conversion of right-handed B-DNA to left-handed Z-DNA is one of the greatest conformational transitions in biology. Recently, the B-Z transition has been explored from nanotechnological points of view and used as the driving machinery of many nanomechanical devices. Using a combination of CD spectroscopy, fluorescence spectroscopy, and PAGE, we demonstrate that low concentration of lanthanum chloride can mediate B-to-Z transition in self-assembled Y-shaped branched DNA (bDNA) structure. The transition is sensitive to the sequence and structure of the bDNA. Thermal melting and competitive dye binding experiments suggest that La3+ ions are loaded to the major and minor grooves of DNA and stabilize the Z-conformation. Our studies also show that EDTA and EtBr play an active role in reversing the transition from Z-to-B DNA.
DNA Trojan Horses: Self-Assembled Floxuridine-Containing DNA Polyhedra for Cancer Therapy.

PubMed

Mou, Quanbing; Ma, Yuan; Pan, Gaifang; Xue, Bai; Yan, Deyue; Zhang, Chuan; Zhu, Xinyuan

2017-10-02

Based on their structural similarity to natural nucleobases, nucleoside analogue therapeutics were integrated into DNA strands through conventional solid-phase synthesis. By elaborately designing their sequences, floxuridine-integrated DNA strands were synthesized and self-assembled into well-defined DNA polyhedra with definite drug-loading ratios as well as tunable size and morphology. As a novel drug delivery system, these drug-containing DNA polyhedra could ideally mimic the Trojan Horse to deliver chemotherapeutics into tumor cells and fight against cancer. Both in vitro and in vivo results demonstrate that the DNA Trojan horse with buckyball architecture exhibits superior anticancer capability over the free drug and other formulations. With precise control over the drug-loading ratio and structure of the nanocarriers, the DNA Trojan horse may play an important role in anticancer treatment and exhibit great potential in translational nanomedicine. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
DNA-guided nanoparticle assemblies

DOEpatents

Gang, Oleg; Nykypanchuk, Dmytro; Maye, Mathew; van der Lelie, Daniel

2013-07-16

In some embodiments, DNA-capped nanoparticles are used to define a degree of crystalline order in assemblies thereof. In some embodiments, thermodynamically reversible and stable body-centered cubic (bcc) structures, with particles occupying <~10% of the unit cell, are formed. Designs and pathways amenable to the crystallization of particle assemblies are identified. In some embodiments, a plasmonic crystal is provided. In some aspects, a method for controlling the properties of particle assemblages is provided. In some embodiments a catalyst is formed from nanoparticles linked by nucleic acid sequences and forming an open crystal structure with catalytically active agents attached to the crystal on its surface or in interstices.
Optical mapping and its potential for large-scale sequencing projects.

PubMed

Aston, C; Mishra, B; Schwartz, D C

1999-07-01

Physical mapping has been rediscovered as an important component of large-scale sequencing projects. Restriction maps provide landmark sequences at defined intervals, and high-resolution restriction maps can be assembled from ensembles of single molecules by optical means. Such optical maps can be constructed from both large-insert clones and genomic DNA, and are used as a scaffold for accurately aligning sequence contigs generated by shotgun sequencing.
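How a restriction map acts as a set of landmarks for sequence contigs can be sketched by computing the ordered fragment sizes a contig would yield in silico, which could then be compared with an optical map. The function name is hypothetical, and cutting at the first base of each recognition site is a simplifying assumption; real enzymes cut at a fixed offset within or near the site.

```python
def restriction_fragments(seq, site):
    """Return ordered fragment lengths after cutting seq at every site.

    Simplification: each cut is placed at the first base of the site.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a site sits at position 0.
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]
```

The ordered list of sizes is the contig's fingerprint; a contig is placed on the map where its fingerprint matches the measured fragment sizes within experimental error.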
Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution

PubMed Central

Modahl, Cassandra M.; Mackessy, Stephen P.

2016-01-01

Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides access to cDNA sequences in the absence of living specimens, even from commercial venom sources, to evaluate important regional differences in venom composition and to study snake venom protein evolution. PMID:27280639
Serendipitous discovery of Wolbachia genomes in multiple Drosophila species.

PubMed

Salzberg, Steven L; Dunning Hotopp, Julie C; Delcher, Arthur L; Pop, Mihai; Smith, Douglas R; Eisen, Michael B; Nelson, William C

2005-01-01

The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism. By searching the publicly available repository of DNA sequencing trace data, we discovered three new species of the bacterial endosymbiont Wolbachia pipientis in three different species of fruit fly: Drosophila ananassae, D. simulans, and D. mojavensis. We extracted all sequences with partial matches to a previously sequenced Wolbachia strain and assembled those sequences using customized software. For one of the three new species, the data recovered were sufficient to produce an assembly that covers more than 95% of the genome; for a second species the data produce the equivalent of a 'light shotgun' sampling of the genome, covering an estimated 75-80% of the genome; and for the third species the data cover approximately 6-7% of the genome. The results of this study reveal an unexpected benefit of depositing raw data in a central genome sequence repository: new species can be discovered within this data. The differences between these three new Wolbachia genomes and the previously sequenced strain revealed numerous rearrangements and insertions within each lineage and hundreds of novel genes. The three new genomes, with annotation, have been deposited in GenBank.
Specific and reversible DNA-directed self-assembly of oil-in-water emulsion droplets

PubMed Central

Hadorn, Maik; Boenzli, Eva; Sørensen, Kristian T.; Fellermann, Harold; Eggenberger Hotz, Peter; Hanczyc, Martin M.

2012-01-01

Higher-order structures that originate from the specific and reversible DNA-directed self-assembly of microscopic building blocks hold great promise for future technologies. Here, we functionalized biotinylated soft colloid oil-in-water emulsion droplets with biotinylated single-stranded DNA oligonucleotides using streptavidin as an intermediary linker. We show the components of this modular linking system to be stable and to induce sequence-specific aggregation of binary mixtures of emulsion droplets. Three length scales were thereby involved: nanoscale DNA base pairing linking microscopic building blocks resulted in macroscopic aggregates visible to the naked eye. The aggregation process was reversible by changing the temperature and electrolyte concentration and by the addition of competing oligonucleotides. The system was reset and reused by subsequent refunctionalization of the emulsion droplets. DNA-directed self-assembly of oil-in-water emulsion droplets, therefore, offers a solid basis for programmable and recyclable soft materials that undergo structural rearrangements on demand and that range in application from information technology to medicine. PMID:23175791
Previously unknown and highly divergent ssDNA viruses populate the oceans.

PubMed

Labonté, Jessica M; Suttle, Curtis A

2013-11-01

Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.

PubMed

Baldock, Brandi L; Hutchison, James E

2016-12-20

DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.
DNA Nanostructures as Smart Drug-Delivery Vehicles and Molecular Devices.

PubMed

Linko, Veikko; Ora, Ari; Kostiainen, Mauri A

2015-10-01

DNA molecules can be assembled into custom predesigned shapes via hybridization of sequence-complementary domains. The folded structures have high spatial addressability and a tremendous potential to serve as platforms and active components in a plethora of bionanotechnological applications. DNA is a truly programmable material, and its nanoscale engineering thus opens up numerous attractive possibilities to develop novel methods for therapeutics. The tailored molecular devices could be used in targeting cells and triggering the cellular actions in the biological environment. In this review we focus on the DNA-based assemblies - primarily DNA origami nanostructures - that could perform complex tasks in cells and serve as smart drug-delivery vehicles in, for example, cancer therapy, prodrug medication, and enzyme replacement therapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Understanding the Elementary Steps in DNA Tile-Based Self-Assembly.

PubMed

Jiang, Shuoxing; Hong, Fan; Hu, Huiyu; Yan, Hao; Liu, Yan

2017-09-26

Although many models have been developed to guide the design and implementation of DNA tile-based self-assembly systems with increasing complexity, the fundamental assumptions of the models have not been thoroughly tested. To expand the quantitative understanding of DNA tile-based self-assembly and to test the fundamental assumptions of self-assembly models, we investigated DNA tile attachment to preformed "multi-tile" arrays in real time and obtained the thermodynamic and kinetic parameters of single tile attachment in various sticky end association scenarios. With more sticky ends, tile attachment becomes more thermostable with an approximately linear decrease in the free energy change (more negative). The total binding free energy of sticky ends is partially compromised by a sequence-independent energy penalty when tile attachment forms a constrained configuration: "loop". The minimal loop is a 2 × 2 tetramer (Loop4). The energy penalty of loops of 4, 6, and 8 tiles was analyzed with the independent loop model assuming no interloop tension, which is generalizable to arbitrary tile configurations. More sticky ends also contribute to a faster on-rate under isothermal conditions when nucleation is the rate-limiting step. An incorrect sticky end contributes to neither the thermostability nor the kinetics. The thermodynamic and kinetic parameters of DNA tile attachment elucidated here will contribute to the future improvement and optimization of tile assembly modeling, precise control of experimental conditions, and structural design for error-free self-assembly.
Impact of Lateral Transfers on the Genomes of Lepidoptera

PubMed Central

Drezen, Jean-Michel; Josse, Thibaut; Bézier, Annie; Gauthier, Jérémy; Huguet, Elisabeth

2017-01-01

Transfer of DNA sequences between species regardless of their evolutionary distance is very common in bacteria, but evidence that horizontal gene transfer (HGT) also occurs in multicellular organisms has been accumulating in the past few years. The actual extent of this phenomenon is underestimated due to frequent sequence filtering of “alien” DNA before genome assembly. However, recent studies based on genome sequencing have revealed, and experimentally verified, the presence of foreign DNA sequences in the genetic material of several species of Lepidoptera. Large DNA viruses, such as baculoviruses and the symbiotic viruses of parasitic wasps (bracoviruses), have the potential to mediate these transfers in Lepidoptera. In particular, using ultra-deep sequencing, newly integrated transposons have been identified within baculovirus genomes. Bacterial genes have also been acquired by genomes of Lepidoptera, as in other insects and nematodes. In addition, insertions of bracovirus sequences were present in the genomes of certain moth and butterfly lineages, which likely correspond to rearrangements of ancient integrations. The viral genes present in these sequences, sometimes of hymenopteran origin, have been co-opted by lepidopteran species to confer some protection against pathogens. PMID:29120392
Seamless Insert-Plasmid Assembly at High Efficiency and Low Cost

PubMed Central

Benoit, Roger M.; Ostermeier, Christian; Geiser, Martin; Li, Julia Su Zhou; Widmer, Hans; Auer, Manfred

2016-01-01

Seamless cloning methods, such as co-transformation cloning, sequence- and ligation-independent cloning (SLIC) or the Gibson assembly, are essential tools for the precise construction of plasmids. The efficiency of co-transformation cloning is however low and the Gibson assembly reagents are expensive. With the aim to improve the robustness of seamless cloning experiments while keeping costs low, we examined the importance of complementary single-stranded DNA ends for co-transformation cloning and the influence of single-stranded gaps in circular plasmids on SLIC cloning efficiency. Most importantly, our data show that single-stranded gaps in double-stranded plasmids, which occur in typical SLIC protocols, can drastically decrease the efficiency at which the DNA transforms competent E. coli bacteria. Accordingly, filling-in of single-stranded gaps using DNA polymerase resulted in increased transformation efficiency. Ligation of the remaining nicks did not lead to a further increase in transformation efficiency. These findings demonstrate that highly efficient insert-plasmid assembly can be achieved by using only T5 exonuclease and Phusion DNA polymerase, without Taq DNA ligase from the original Gibson protocol, which significantly reduces the cost of the reactions. We successfully used this modified Gibson assembly protocol with two short insert-plasmid overlap regions, each counting only 15 nucleotides. PMID:27073895
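The 15-nucleotide insert-plasmid overlaps mentioned in the record above can be checked in silico before ordering primers. The following Python sketch is only an illustration under stated assumptions: the function name `shared_overlap` and the toy sequences are hypothetical and do not come from the paper, and the check requires an exact match between the 3' end of one fragment and the 5' end of the next.

```python
def shared_overlap(left: str, right: str, min_len: int = 15) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    `min_len` defaults to 15, the overlap length quoted in the abstract.
    """
    left, right = left.upper(), right.upper()
    longest_possible = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the maximum overlap.
    for n in range(longest_possible, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


# Hypothetical toy fragments sharing a 15-nt junction.
junction = "GATTACAGGCTTCAA"
vector_end = "CCCC" + junction
insert_start = junction + "TTTT"
overlap_length = shared_overlap(vector_end, insert_start)
```

A real design check would also look at the second junction (insert end to vector start) and at melting temperature, neither of which is modelled here.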
Integrating De Novo Transcriptome Assembly and Cloning to Obtain Chicken Ovocleidin-17 Full-Length cDNA

PubMed Central

Ning, ZhongHua; Hincke, Maxwell T.; Yang, Ning; Hou, ZhuoCheng

2014-01-01

Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not ‘finished’. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences. PMID:24676480
Integrating de novo transcriptome assembly and cloning to obtain chicken Ovocleidin-17 full-length cDNA.

PubMed

Zhang, Quan; Liu, Long; Zhu, Feng; Ning, ZhongHua; Hincke, Maxwell T; Yang, Ning; Hou, ZhuoCheng

2014-01-01

Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not 'finished'. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences.
Sequence analysis of cultivated strawberry (Fragaria × ananassa Duch.) using microdissected single somatic chromosomes.

PubMed

Yanagi, Tomohiro; Shirasawa, Kenta; Terachi, Mayuko; Isobe, Sachiko

2017-01-01

Cultivated strawberry (Fragaria × ananassa Duch.) has homoeologous chromosomes because of allo-octoploidy. For example, two homoeologous chromosomes that belong to different sub-genomes of allopolyploids have similar base sequences. Thus, when conducting de novo assembly of DNA sequences, it is difficult to determine whether these sequences are derived from the same chromosome. To avoid the difficulties associated with homoeologous chromosomes and demonstrate the possibility of sequencing allopolyploids using single chromosomes, we conducted sequence analysis using microdissected single somatic chromosomes of cultivated strawberry. Three hundred and ten somatic chromosomes of the Japanese octoploid strawberry 'Reiko' were individually selected under a light microscope using a microdissection system. DNA from 288 of the dissected chromosomes was successfully amplified using a DNA amplification kit. Using next-generation sequencing, we decoded the base sequences of the amplified DNA segments, and on the basis of mapping, we identified DNA sequences from 144 samples that were best matched to the reference genomes of the octoploid strawberry, F. × ananassa, and the diploid strawberry, F. vesca. The 144 samples were classified into seven pseudo-molecules of F. vesca. The coverage rates of the DNA sequences from the single chromosome onto all pseudo-molecular sequences varied from 3 to 29.9%. We demonstrated an efficient method for sequence analysis of allopolyploid plants using microdissected single chromosomes. On the basis of our results, we believe that whole-genome analysis of allopolyploid plants can be enhanced using methodology that employs microdissected single chromosomes.
DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases

PubMed Central

Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

2015-01-01

Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827
Gene-enriched draft genome of the cattle tick Rhipicephalus microplus: Assembly by the hybrid Pacific Biosciences/Illumina approach enabled analysis of the highly repetitive genome

USDA-ARS's Scientific Manuscript database

The genome of the cattle tick R. microplus, an ectoparasite with global distribution, is estimated to be 7.1 Gbp and consists of ~70% repetitive DNA. We report the first assembly of a tick genome that utilized a hybrid sequencing and assembly approach to capture the repetitive fractions of the genom...

Partial bisulfite conversion for unique template sequencing

PubMed Central

Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael

2018-01-01

We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. PMID:29161423
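The idea of grouping reads by their conversion pattern, as described in the record above, can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the authors' pipeline: it assumes full-length reads already aligned to the reference without indels, only the C-to-T strand, and no sequencing errors. The function names `conversion_signature` and `cluster_by_signature` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def conversion_signature(reference: str, read: str) -> tuple:
    """Positions (0-based) where a reference C is observed as T in the read."""
    return tuple(
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    )


def cluster_by_signature(reference: str, reads: list) -> dict:
    """Group reads that carry exactly the same set of C-to-T conversions."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[conversion_signature(reference, read)].append(read)
    return dict(clusters)


# Toy data: two copies of one template and a single copy of another.
reference = "ACCGTCAC"
reads = ["ATCGTTAC", "ACTGTCAT", "ATCGTTAC"]
clusters = cluster_by_signature(reference, reads)
template_count = len(clusters)
```

With exact matching, each distinct signature counts as one initial template; a practical implementation would need to tolerate sequencing errors and reads that cover only part of the template.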
Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

DOE Office of Scientific and Technical Information (OSTI.GOV)

VanBuren, Robert; Bryant, Doug; Edger, Patrick P.

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. As a result, the Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

DOE PAGES

VanBuren, Robert; Bryant, Doug; Edger, Patrick P.; ...

2015-11-11

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. As a result, the Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
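The N50 statistic quoted in the record above is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal Python sketch of that standard definition follows; the function name `n50` and the toy contig lengths are illustrative only and are not taken from the Oropetium assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the total bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("n50() requires at least one contig length")
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Toy assembly: total 54, half 27; 10 + 9 + 8 reaches 27, so N50 is 8.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends only on the length distribution, it says nothing about correctness; a misjoined assembly can have a high N50.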
Genome Improvement at JGI-HAGSC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Self-assembly of multiferroic core-shell particulate nanocomposites through DNA-DNA hybridization and magnetic field directed assembly of superstructures

NASA Astrophysics Data System (ADS)

Sreenivasulu, Gollapudi; Lochbiler, Thomas A.; Panda, Manashi; Srinivasan, Gopalan; Chavez, Ferman A.

2016-04-01

Multiferroic composites of ferromagnetic and ferroelectric phases are of importance for studies on mechanical strain mediated coupling between the magnetic and electric subsystems. This work is on DNA-assisted self-assembly of superstructures of such composites with nanometer periodicity. The synthesis involved oligomeric DNA-functionalized ferroelectric and ferromagnetic nanoparticles, 600 nm BaTiO3 (BTO) and 200 nm NiFe2O4 (NFO), respectively. Mixing BTO and NFO particles, possessing complementary DNA sequences, resulted in the formation of ordered core-shell heteronanocomposites held together by DNA hybridization. The composites were imaged by scanning electron microscopy and scanning microwave microscopy. The presence of heteroassemblies along with core-shell architecture is clearly observed. The reversible nature of the DNA hybridization allows for restructuring the composites into mm-long linear chains and 2D-arrays in the presence of a static magnetic field and ring-like structures in a rotating-magnetic field. Strong magneto-electric (ME) coupling in as-assembled composites is evident from static magnetic field H induced polarization and low-frequency magnetoelectric voltage coefficient measurements. Upon annealing the nanocomposites at high temperatures, evidence for the formation of bulk composites with excellent cross-coupling between the electric and magnetic subsystems is obtained by H-induced polarization and low-frequency ME voltage coefficient. The ME coupling strength in the self-assembled composites is measured to be much stronger than in bulk composites with randomly distributed NFO and BTO prepared by direct mixing and sintering.
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices

NASA Astrophysics Data System (ADS)

Yan, Hao; Labean, Thomas H.; Feng, Liping; Reif, John H.

2003-07-01

The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping.
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices.

PubMed

Yan, Hao; LaBean, Thomas H; Feng, Liping; Reif, John H

2003-07-08

The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping.
A Nonconventional Approach to Patterned Nanoarrays of DNA Strands for Template-Assisted Assembly of Polyfluorene Nanowires.

PubMed

Bae, Dong Geun; Jeong, Ji-Eun; Kang, Seok Hee; Byun, Myunghwan; Han, Dong-Wook; Lin, Zhiqun; Woo, Han Young; Hong, Suck Won

2016-08-01

DNA molecules have been widely recognized as promising building blocks for constructing functional nanostructures with two main features, that is, self-assembly and rich chemical functionality. The intrinsic feature size of DNA makes it attractive for creating versatile nanostructures. Moreover, the ease of access to tune the surface of DNA by chemical functionalization offers numerous opportunities for many applications. Herein, a simple yet robust strategy is developed to yield the self-assembly of DNA by exploiting controlled evaporative assembly of DNA solution in a unique confined geometry. Intriguingly, depending on the concentration of DNA solution, highly aligned nanostructured fibrillar-like arrays and well-positioned concentric ring-like superstructures composed of DNAs are formed. Subsequently, the ring-like negatively charged DNA superstructures are employed as template to produce conductive organic nanowires on a silicon substrate by complexing with a positively charged conjugated polyelectrolyte poly[9,9-bis(6'-N,N,N-trimethylammoniumhexyl)fluorene dibromide] (PF2) through the strong electrostatic interaction. Finally, a monolithic integration of aligned arrays of DNA-templated PF2 nanowires to yield two DNA/PF2-based devices is demonstrated. It is envisioned that this strategy can be readily extended to pattern other biomolecules and may render a broad range of potential applications from the nucleotide sequence and hybridization as recognition events to transducing elements in chemical sensors. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The long reads ahead: de novo genome assembly using the MinION

PubMed Central

de Lannoy, Carlos; de Ridder, Dick; Risse, Judith

2017-01-01

Nanopore technology provides a novel approach to DNA sequencing that yields long, label-free reads of constant quality. The first commercial implementation of this approach, the MinION, has shown promise in various sequencing applications. This review gives an up-to-date overview of the MinION's utility as a de novo sequencing device. It is argued that the MinION may allow for portable and affordable de novo sequencing of even complex genomes in the near future, despite the currently error-prone nature of its reads. Through continuous updates to the MinION hardware and the development of new assembly pipelines, both sequencing accuracy and assembly quality have already risen rapidly. However, this fast pace of development has also led to a lack of overview of the expanding landscape of analysis tools, as performance evaluations are outdated quickly. As the MinION is approaching a state of maturity, its user community would benefit from a thorough comparative benchmarking effort of de novo assembly pipelines in the near future. An earlier version of this article can be found on bioRxiv. PMID:29375809
DNA-Assembled Advanced Plasmonic Architectures.

PubMed

Liu, Na; Liedl, Tim

2018-03-28

The interaction between light and matter can be controlled efficiently by structuring materials at a length scale shorter than the wavelength of interest. With the goal to build optical devices that operate at the nanoscale, plasmonics has established itself as a discipline, where near-field effects of electromagnetic waves created in the vicinity of metallic surfaces can give rise to a variety of novel phenomena and fascinating applications. As research on plasmonics has emerged from the optics and solid-state communities, most laboratories employ top-down lithography to implement their nanophotonic designs. In this review, we discuss the recent, successful efforts of employing self-assembled DNA nanostructures as scaffolds for creating advanced plasmonic architectures. DNA self-assembly exploits the base-pairing specificity of nucleic acid sequences and allows for the nanometer-precise organization of organic molecules but also for the arrangement of inorganic particles in space. Bottom-up self-assembly thus bypasses many of the limitations of conventional fabrication methods. As a consequence, powerful tools such as DNA origami have pushed the boundaries of nanophotonics and new ways of thinking about plasmonic designs are on the rise.
Controlled Assembly of Ag Nanoparticles and Carbon Nanotube Hybrid Structures for Biosensing

DTIC Science & Technology

2010-01-01

to ∼190 kΩ. The same device was again washed with DI water and treated with the thiolated ssDNA in high salt buffer. After a 2 h treatment, the device...after the cleaning only thiolated DNA should be present on the device, whereas the nonspecifically bound DNA as well as the buffer salts should be...ssDNA molecules for 2 h. Specific immobilization of thiolated ssDNA (sequence: 5′thiol_TCATAC AGCTAGATA ACC AAAGA) was carried out in high salt
DNA–DNA kissing complexes as a new tool for the assembly of DNA nanostructures

PubMed Central

Barth, Anna; Kobbe, Daniela; Focke, Manfred

2016-01-01

Kissing-loop annealing of nucleic acids occurs in nature in several viruses and in prokaryotic replication, among other circumstances. Nucleobases of two nucleic acid strands (loops) interact with each other, although the two strands cannot wrap around each other completely because of the adjacent double-stranded regions (stems). In this study, we exploited DNA kissing-loop interaction for nanotechnological application. We functionalized the vertices of DNA tetrahedrons with DNA stem-loop sequences. The complementary loop sequence design allowed the hybridization of different tetrahedrons via kissing-loop interaction, which might be further exploited for nanotechnology applications like cargo transport and logical elements. Importantly, we were able to manipulate the stability of those kissing-loop complexes based on the choice and concentration of cations, the temperature and the number of complementary loops per tetrahedron either at the same or at different vertices. Moreover, variations in loop sequences allowed the characterization of necessary sequences within the loop as well as additional stability control of the kissing complexes. Therefore, the properties of the presented nanostructures make them an important tool for DNA nanotechnology. PMID:26773051
DNA Extraction Protocols for Whole-Genome Sequencing in Marine Organisms.

PubMed

Panova, Marina; Aronsson, Henrik; Cameron, R Andrew; Dahl, Peter; Godhe, Anna; Lind, Ulrika; Ortega-Martinez, Olga; Pereyra, Ricardo; Tesson, Sylvie V M; Wrange, Anna-Lisa; Blomberg, Anders; Johannesson, Kerstin

2016-01-01

The marine environment harbors a large proportion of the total biodiversity on this planet, including the majority of the Earth's different phyla and classes. Studying the genomes of marine organisms can bring interesting insights into genome evolution. Today, almost all marine organismal groups are understudied with respect to their genomes. One potential reason is that extraction of high-quality DNA in sufficient amounts is challenging for many marine species. This is due to high polysaccharide content, polyphenols and other secondary metabolites that will inhibit downstream DNA library preparations. Consequently, protocols developed for vertebrates and plants do not always perform well for invertebrates and algae. In addition, many marine species have large population sizes and, as a consequence, highly variable genomes. Thus, to facilitate the sequence read assembly process during genome sequencing, it is desirable to obtain enough DNA from a single individual, which is a challenge in many species of invertebrates and algae. Here, we present DNA extraction protocols for seven marine species (four invertebrates, two algae, and a marine yeast), optimized to provide sufficient DNA quality and yield for de novo genome sequencing projects.
A simple method for semi-random DNA amplicon fragmentation using the methylation-dependent restriction enzyme MspJI.

PubMed

Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W

2015-04-11

Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments could be controlled by optimising the 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to that of the Covaris physical shearing procedure, the method's lower cost and labour requirements may make it more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
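The CNNR recognition motif described in the abstract above can be illustrated with a minimal Python sketch. The function name `find_cnnr_sites` and the example sequence are hypothetical and not taken from the paper. The sketch only locates candidate motifs on one strand; it does not model which cytosines actually carry the methyl group, nor where MspJI cuts relative to the motif.

```python
import re


def find_cnnr_sites(seq: str) -> list[int]:
    """Return 0-based start positions of every CNNR motif (R = A or G).

    Only the given strand is scanned. A site becomes a real MspJI target
    only if its first C is methylated, which this sketch does not model.
    """
    seq = seq.upper()
    # A zero-width lookahead lets overlapping motifs be reported as well.
    return [m.start() for m in re.finditer(r"(?=C[ACGT]{2}[AG])", seq)]


if __name__ == "__main__":
    example = "ACGTACCAGGTT"  # hypothetical amplicon fragment
    print(find_cnnr_sites(example))
```

Because the motif is short and degenerate, candidate sites are frequent in most sequences, which is consistent with the abstract's description of relatively sequence-independent fragmentation.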
Enzymatic Synthesis of Self-assembled Dicer Substrate RNA Nanostructures for Programmable Gene Silencing.

PubMed

Jang, Bora; Kim, Boyoung; Kim, Hyunsook; Kwon, Hyokyoung; Kim, Minjeong; Seo, Yunmi; Colas, Marion; Jeong, Hansaem; Jeong, Eun Hye; Lee, Kyuri; Lee, Hyukjin

2018-06-08

Enzymatic synthesis of RNA nanostructures is achieved by isothermal rolling circle transcription (RCT). Each arm of RNA nanostructures provides a functional role of Dicer substrate RNA inducing sequence specific RNA interference (RNAi). Three different RNAi sequences (GFP, RFP, and BFP) are incorporated within the three-arm junction RNA nanostructures (Y-RNA). The template and helper DNA strands are designed for the large-scale in vitro synthesis of RNA strands to prepare self-assembled Y-RNA. Interestingly, Dicer processing of Y-RNA is highly influenced by its physical structure and different gene silencing activity is achieved depending on its arm length and overhang. In addition, enzymatic synthesis allows the preparation of various Y-RNA structures using a single DNA template offering on demand regulation of multiple target genes.
Improving de novo sequence assembly using machine learning and comparative genomics for overlap correction.

PubMed

Palmer, Lance E; Dejori, Mathaeus; Bolanos, Randall; Fasulo, Daniel

2010-01-15

With the rapid expansion of DNA sequencing databases, it is now feasible to identify relevant information from prior sequencing projects and completed genomes and apply it to de novo sequencing of new organisms. As an example, this paper demonstrates how such extra information can be used to improve de novo assemblies by augmenting the overlapping step. Finding all pairs of overlapping reads is a key task in many genome assemblers, and to this end, highly efficient algorithms have been developed to find alignments in large collections of sequences. It is well known that due to repeated sequences, many aligned pairs of reads nevertheless do not overlap. But no overlapping algorithm to date takes a rigorous approach to separating aligned but non-overlapping read pairs from true overlaps. We present an approach that extends the Minimus assembler by a data driven step to classify overlaps as true or false prior to contig construction. We trained several different classification models within the Weka framework using various statistics derived from overlaps of reads available from prior sequencing projects. These statistics included percent mismatch and k-mer frequencies within the overlaps as well as a comparative genomics score derived from mapping reads to multiple reference genomes. We show that in real whole-genome sequencing data from the E. coli and S. aureus genomes, by providing a curated set of overlaps to the contigging phase of the assembler, we nearly doubled the median contig length (N50) without sacrificing coverage of the genome or increasing the number of mis-assemblies. Machine learning methods that use comparative and non-comparative features to classify overlaps as true or false can be used to improve the quality of a sequence assembly.
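The N50 statistic cited in the abstract above can be computed with a short Python sketch. It uses the common definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembled length. This makes N50 a length-weighted median rather than a simple median of contig lengths. The function name `n50` and the example lengths are illustrative assumptions, not values from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative sum first reaches half the total length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in base pairs.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Merging contigs raises N50 even when the total assembled length is unchanged, which is why removing false overlaps before contig construction can improve it.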
MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for the Construction of Sequence Data Warehouses.

PubMed

Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio

2017-06-01

The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Preparation and biomedical applications of programmable and multifunctional DNA nanoflowers

PubMed Central

Lv, Yifan; Hu, Rong; Zhu, Guizhi; Zhang, Xiaobing; Mei, Lei; Liu, Qiaoling; Qiu, Liping; Wu, Cuichen; Tan, Weihong

2016-01-01

We describe a comprehensive protocol for the preparation of multifunctional DNA nanostructures termed nanoflowers (NFs), which are self-assembled from long DNA building blocks generated via rolling-circle replication (RCR) of a designed template. NF assembly is driven by liquid crystallization and dense packaging of building blocks, which eliminates the need for conventional Watson-Crick base pairing. As a result of dense DNA packaging, NFs are resistant to nuclease degradation, denaturation or dissociation at extremely low concentrations. By manually changing the template sequence, many different functional moieties including aptamers, bioimaging agents and drug-loading sites could be easily integrated into NF particles, making NFs ideal candidates for a variety of applications in biomedicine. In this protocol, the preparation of multifunctional DNA NFs with highly tunable sizes is described for applications in cell targeting, intracellular imaging and drug delivery. Preparation and characterization of functional DNA NFs takes ~5 d; the following biomedical applications take ~10 d. PMID:26357007
From cheek swabs to consensus sequences: an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

PubMed Central

2014-01-01

Background Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
In Silico Identification of Protein Disulfide Isomerase Gene Families in the De Novo Assembled Transcriptomes of Four Different Species of the Genus Conus.

PubMed

Figueroa-Montiel, Andrea; Ramos, Marco A; Mares, Rosa E; Dueñas, Salvador; Pimienta, Genaro; Ortiz, Ernesto; Possani, Lourival D; Licea-Navarro, Alexei F

2016-01-01

Small peptides isolated from the venom of marine snails belonging to the genus Conus have been extensively studied because of their therapeutic value. These peptides can be classified into two groups. The larger one is composed of peptides rich in disulfide bonds, referred to as conotoxins. Despite the importance of conotoxins given their pharmacological value, little is known about the protein disulfide isomerase (PDI) enzymes that are required to catalyze their correct folding. To discover the PDIs that may participate in the folding and structural maturation of conotoxins, the transcriptomes of the venom duct of four different species of Conus from the peninsula of Baja California (Mexico) were assembled. Complementary DNA (cDNA) libraries were constructed for each species and sequenced using a Genome Analyzer Illumina platform. The raw RNA-seq data was converted into transcript sequences using Trinity, a de novo assembler that allows the grouping of reads into contigs without a reference genome. An N50 value of 605 was established as a reference for future assemblies of Conus transcriptomes using this software. Transdecoder was used to extract likely coding sequences from Trinity transcripts, and the PDI-specific sequence motif "APWCGHCK" was used to capture potential PDIs. An in silico analysis was performed to characterize the group of PDI protein sequences encoded by the duct-transcriptome of each species. The computational approach entailed a structural homology characterization, based on the presence of functional Thioredoxin-like domains. Four different PDI families were characterized, which are constituted by a total of 41 different gene sequences. The sequences had an average of 65% identity with other PDIs. Homology-based three-dimensional structure prediction of a subset of the reported sequences using MODELLER 9.14 showed the expected thioredoxin fold, which was confirmed by a "simulated annealing" method.
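The motif-based capture step mentioned in the abstract above can be sketched in Python as an exact substring scan over predicted protein sequences. Only the motif string "APWCGHCK" comes from the abstract. The function name `find_motif_hits`, the toy sequences and the choice of exact matching are assumptions made for illustration; the authors' actual search procedure is not described in the abstract.

```python
def find_motif_hits(proteins: dict[str, str], motif: str = "APWCGHCK") -> dict[str, list[int]]:
    """Map each protein name to the 0-based positions where the motif occurs.

    Proteins without any occurrence are left out of the result.
    """
    hits: dict[str, list[int]] = {}
    for name, seq in proteins.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            # Resume one residue later so overlapping hits are not missed.
            start = seq.find(motif, start + 1)
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    # Toy sequences, not real Conus transcripts.
    candidates = {
        "orf1": "MKAPWCGHCKLL",
        "orf2": "MKTTTLLV",
        "orf3": "APWCGHCKAPWCGHCK",
    }
    print(find_motif_hits(candidates))
```

An exact match is the strictest possible filter; a real screen might tolerate substitutions or use a profile-based search, at the cost of more false positives.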

Base-resolution detection of N4-methylcytosine in genomic DNA using 4mC-Tet-assisted-bisulfite-sequencing

DOE PAGES

Yu, Miao; Ji, Lexiang; Neumann, Drexel A.; ...

2015-07-15

Restriction-modification (R-M) systems pose a major barrier to DNA transformation and genetic engineering of bacterial species. Systematic identification of DNA methylation in R-M systems, including N6-methyladenine (6mA), 5-methylcytosine (5mC) and N4-methylcytosine (4mC), will enable strategies to make these species genetically tractable. Although single-molecule, real time (SMRT) sequencing technology is capable of detecting 4mC directly for any bacterial species regardless of whether an assembled genome exists or not, it does not scale to profiling hundreds to thousands of samples as well as the commonly used next-generation sequencing technologies. Here, we present 4mC-Tet-assisted bisulfite-sequencing (4mC-TAB-seq), a next-generation sequencing method that rapidly and cost-efficiently reveals the genome-wide locations of 4mC for bacterial species with an available assembled reference genome. In 4mC-TAB-seq, both cytosines and 5mCs are read out as thymines, whereas only 4mCs are read out as cytosines, revealing their specific positions throughout the genome. We applied 4mC-TAB-seq to study the methylation of a member of the hyperthermophilic genus Caldicellulosiruptor, in which 4mC-related restriction is a major barrier to DNA transformation from other species. Lastly, in combination with MethylC-seq, both 4mC- and 5mC-containing motifs are identified which can assist in rapid and efficient genetic engineering of these bacteria in the future.
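The read-out rule stated in the abstract above (cytosine and 5mC read as thymine, 4mC read as cytosine) can be illustrated with an idealized Python sketch. It assumes complete conversion, no sequencing errors and a single strand, none of which holds exactly in practice. The names `READOUT`, `simulate_read` and `call_4mc`, as well as the toy template, are hypothetical and do not come from the paper.

```python
# Idealized read-out of each template base after 4mC-TAB-seq treatment.
READOUT = {
    "A": "A",
    "G": "G",
    "T": "T",
    "C": "T",    # unmodified cytosine is read as thymine
    "5mC": "T",  # 5-methylcytosine is also read as thymine
    "4mC": "C",  # only N4-methylcytosine is still read as cytosine
}


def simulate_read(template):
    """Return the sequenced string for a template given as a list of base symbols."""
    return "".join(READOUT[base] for base in template)


def call_4mc(reference: str, read: str) -> list[int]:
    """Return 0-based positions where a reference C is still read as C."""
    return [i for i, (ref, obs) in enumerate(zip(reference, read))
            if ref == "C" and obs == "C"]


if __name__ == "__main__":
    template = ["A", "C", "5mC", "G", "4mC", "T"]  # toy methylation states
    reference = "ACCGCT"                           # same locus without marks
    read = simulate_read(template)
    print(read, call_4mc(reference, read))
```

The comparison against a reference is what ties the method to species with an assembled genome, as the abstract notes.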
Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies

PubMed Central

Schatz, Michael C.; Phillippy, Adam M.; Sommer, Daniel D.; Delcher, Arthur L.; Puiu, Daniela; Narzisi, Giuseppe; Salzberg, Steven L.; Pop, Mihai

2013-01-01

Since its launch in 2004, the open-source AMOS project has released several innovative DNA sequence analysis applications including: Hawkeye, a visual analytics tool for inspecting the structure of genome assemblies; the Assembly Forensics and FRCurve pipelines for systematically evaluating the quality of a genome assembly; and AMOScmp, the first comparative genome assembler. These applications have been used to assemble and analyze dozens of genomes ranging in complexity from simple microbial species through mammalian genomes. Recent efforts have been focused on enhancing support for new data characteristics brought on by second- and now third-generation sequencing. This review describes the major components of AMOS in light of these challenges, with an emphasis on methods for assessing assembly quality and the visual analytics capabilities of Hawkeye. These interactive graphical aspects are essential for navigating and understanding the complexities of a genome assembly, from the overall genome structure down to individual bases. Hawkeye and AMOS are available open source at http://amos.sourceforge.net. PMID:22199379
Single-molecule FRET studies of the cooperative and non-cooperative binding kinetics of the bacteriophage T4 single-stranded DNA binding protein (gp32) to ssDNA lattices at replication fork junctions

PubMed Central

Lee, Wonbae; Gillies, John P.; Jose, Davis; Israels, Brett A.; von Hippel, Peter H.; Marcus, Andrew H.

2016-01-01

Gene 32 protein (gp32) is the single-stranded (ss) DNA binding protein of the bacteriophage T4. It binds transiently and cooperatively to ssDNA sequences exposed during the DNA replication process and regulates the interactions of the other sub-assemblies of the replication complex during the replication cycle. We here use single-molecule FRET techniques to build on previous thermodynamic studies of gp32 binding to initiate studies of the dynamics of the isolated and cooperative binding of gp32 molecules within the replication complex. DNA primer/template (p/t) constructs are used as models to determine the effects of ssDNA lattice length, gp32 concentration, salt concentration, binding cooperativity and binding polarity at p/t junctions. Hidden Markov models (HMMs) and transition density plots (TDPs) are used to characterize the dynamics of the multi-step assembly pathway of gp32 at p/t junctions of differing polarity, and show that isolated gp32 molecules bind to their ssDNA targets weakly and dissociate quickly, while cooperatively bound dimeric or trimeric clusters of gp32 bind much more tightly, can ‘slide’ on ssDNA sequences, and exhibit binding dynamics that depend on p/t junction polarities. The potential relationships of these binding dynamics to interactions with other components of the T4 DNA replication complex are discussed. PMID:27694621
Are commercial providers a viable option for clinical bacterial sequencing?

PubMed

Raven, Kathy; Blane, Beth; Churcher, Carol; Parkhill, Julian; Peacock, Sharon J

2018-04-05

Bacterial whole-genome sequencing in the clinical setting has the potential to bring major improvements to infection control and clinical practice. Sequencing instruments are not currently available in the majority of routine microbiology laboratories worldwide, but an alternative is to use external sequencing providers. To foster discussion around this we investigated whether send-out services were a viable option. Four providers offering MiSeq sequencing were selected based on cost and evaluated based on the service provided and sequence data quality. DNA was prepared from five methicillin-resistant Staphylococcus aureus (MRSA) isolates, four of which were investigated during a previously published outbreak in the UK together with a reference MRSA isolate (ST22 HO 5096 0412). Cost of sequencing per isolate ranged from £155 to £342 and turnaround times from DNA postage to arrival of sequence data ranged from 12 to 63 days. Comparison of commercially generated genomes against the original sequence data demonstrated very high concordance, with no more than one single nucleotide polymorphism (SNP) difference on core genome mapping between the original sequences and the new sequence for all four providers. Multilocus sequence type could not be assigned based on assembly for the two cheapest sequence providers due to fragmented assemblies probably caused by a lower output of sequence data per isolate. Our results indicate that external providers returned highly accurate genome data, but that improvements are required in turnaround time to make this a viable option for use in clinical practice.
Combining Chemoselective Ligation with Polyhistidine-Driven Self-Assembly for the Modular Display of Biomolecules on Quantum Dots

PubMed Central

Prasuhn, Duane E.; Blanco-Canosa, Juan B.; Vora, Gary J.; Delehanty, James B.; Susumu, Kimihiro; Mei, Bing C.; Dawson, Philip E.; Medintz, Igor L.

2015-01-01

One of the principal hurdles to wider incorporation of semiconductor quantum dots (QDs) in biology is the lack of facile linkage chemistries to create different types of functional QD-bioconjugates. A two-step modular strategy for the presentation of biomolecules on CdSe/ZnS core/shell QDs is described here which utilizes a chemoselective, aniline-catalyzed hydrazone coupling chemistry to append hexahistidine sequences onto peptides and DNA. This specifically gives them the ability to ratiometrically self-assemble to hydrophilic QDs. The versatility of this labeling approach was highlighted by ligating proteolytic substrate peptides, an oligoarginine cell-penetrating peptide, or a DNA-probe to cognate hexahistidine peptidyl sequences. The modularity allowed subsequently self-assembled QD constructs to engage in different types of targeted bioassays. The self-assembly and photophysical properties of individual QD conjugates were first confirmed by gel electrophoresis and Förster resonance energy transfer analysis. QD-dye-labeled peptide conjugates were then used as biosensors to quantitatively monitor the proteolytic activity of caspase-3 or elastase enzymes from different species. These sensors allowed the determination of the corresponding kinetic parameters, including the Michaelis constant (KM) and the maximum proteolytic activity (Vmax). QDs decorated with cell-penetrating peptides were shown to be successfully internalized by HEK 293T/17 cells, while nanocrystals displaying peptide-DNA conjugates were utilized as fluorescent probes in hybridization microarray assays. This modular approach for displaying peptides or DNA on QDs may be extended to other more complex biomolecules such as proteins or utilized with different types of nanoparticle materials. PMID:20099912
Theory and simulation of DNA-coated colloids: a guide for rational design.

PubMed

Angioletti-Uberti, Stefano; Mognetti, Bortolo M; Frenkel, Daan

2016-03-07

By exploiting the exquisite selectivity of DNA hybridization, DNA-coated colloids (DNACCs) can be made to self-assemble in a wide variety of structures. The beauty of this system stems largely from its exceptional versatility and from the fact that a proper choice of the grafted DNA sequences yields fine control over the colloidal interactions. Theory and simulations have an important role to play in the optimal design of self-assembling DNACCs. At present, the powerful model-based design tools are not widely used, because the theoretical literature is fragmented and the connection between different theories is often not evident. In this Perspective, we aim to discuss the similarities and differences between the different models that have been described in the literature, their underlying assumptions, their strengths and their weaknesses. Using the tools described in the present Review, it should be possible to move towards a more rational design of novel self-assembling structures of DNACCs and, more generally, of systems where ligand-receptor binding is used to control interactions.
Rhipicephalus microplus strain Deutsch, whole genome shotgun sequencing project Version 2

USDA-ARS's Scientific Manuscript database

The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. Cot filtration/selection techniques were used ...
Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.

PubMed

Lin, X; Kaul, S; Rounsley, S; Shea, T P; Benito, M I; Town, C D; Fujii, C Y; Mason, T; Bowman, C L; Barnstead, M; Feldblyum, T V; Buell, C R; Ketchum, K A; Lee, J; Ronning, C M; Koo, H L; Moffat, K S; Cronin, L A; Shen, M; Pai, G; Van Aken, S; Umayam, L; Tallon, L J; Gill, J E; Adams, M D; Carrera, A J; Creasy, T H; Goodman, H M; Somerville, C R; Copenhaver, G P; Preuss, D; Nierman, W C; White, O; Eisen, J A; Salzberg, S L; Fraser, C M; Venter, J C

1999-12-16

Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130-140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.
Uncovering the self-assembly of DNA nanostructures by thermodynamics and kinetics.

PubMed

Wei, Xixi; Nangreave, Jeanette; Liu, Yan

2014-06-17

CONSPECTUS: DNA nanotechnology is one of the most flourishing interdisciplinary research fields. DNA nanostructures can be designed to self-assemble into a variety of periodic or aperiodic patterns of different shapes and length scales. They can be used as scaffolds for organizing other nanoparticles, proteins, and chemical groups, leveraging their functions for creating complex bioinspired materials that may serve as smart drug delivery systems, in vitro or in vivo biomolecular computing platforms, and diagnostic devices. Achieving optimal structural features, efficient assembly protocols, and precise functional group positioning and modification requires a thorough understanding of the thermodynamics and kinetics of the DNA nanostructure self-assembly process. The most common real-time measurement strategies include monitoring changes in UV absorbance based on the hyperchromic effect of DNA, and the emission signal changes of DNA intercalating dyes or covalently conjugated fluorescent dyes/pairs that accompany temperature dependent structural changes. Thermodynamic studies of a variety of DNA nanostructures have been performed, from simple double stranded DNA formation to more complex origami assembly. The key parameters that have been evaluated in terms of stability and cooperativity include the overall dimensions, the folding path of the scaffold, crossover and nick point arrangement, length and sequence of single strands, and salt and ion concentrations. DNA tile-tile interactions through sticky end hybridization have also been analyzed, and the steric inhibition and rigidity of tiles turn out to be important factors. Many kinetic studies have also been reported, and most are based on double stranded DNA formation. A two-state assumption and the hypothesis of several intermediate states have been applied to determine the rate constant and activation energy of the DNA hybridization process. 
A few simulated models were proposed to represent the structural, mechanical, and kinetic properties of DNA hybridization. The kinetics of strand displacement reactions has also been studied as a special case of DNA hybridization. The thermodynamic and kinetic characteristics of DNA nanostructures have been exploited to develop rapid and isothermal annealing protocols. It is conceivable that a more thorough understanding of the DNA assembly process could be used to guide the structural design process and optimize the conditions for assembly, manipulation, and functionalization, thus benefiting both upstream design and downstream applications.
De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

PubMed Central

2011-01-01

Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations, Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence

PubMed Central

2017-01-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Standards for plant synthetic biology: a common syntax for exchange of DNA parts.

PubMed

Patron, Nicola J; Orzaez, Diego; Marillonnet, Sylvestre; Warzecha, Heribert; Matthewman, Colette; Youles, Mark; Raitskin, Oleg; Leveau, Aymeric; Farré, Gemma; Rogers, Christian; Smith, Alison; Hibberd, Julian; Webb, Alex A R; Locke, James; Schornack, Sebastian; Ajioka, Jim; Baulcombe, David C; Zipfel, Cyril; Kamoun, Sophien; Jones, Jonathan D G; Kuhn, Hannah; Robatzek, Silke; Van Esse, H Peter; Sanders, Dale; Oldroyd, Giles; Martin, Cathie; Field, Rob; O'Connor, Sarah; Fox, Samantha; Wulff, Brande; Miller, Ben; Breakspear, Andy; Radhakrishnan, Guru; Delaux, Pierre-Marc; Loqué, Dominique; Granell, Antonio; Tissier, Alain; Shih, Patrick; Brutnell, Thomas P; Quick, W Paul; Rischer, Heiko; Fraser, Paul D; Aharoni, Asaph; Raines, Christine; South, Paul F; Ané, Jean-Michel; Hamberger, Björn R; Langdale, Jane; Stougaard, Jens; Bouwmeester, Harro; Udvardi, Michael; Murray, James A H; Ntoukakis, Vardis; Schäfer, Patrick; Denby, Katherine; Edwards, Keith J; Osbourn, Anne; Haseloff, Jim

2015-10-01

Inventors in the field of mechanical and electronic engineering can access multitudes of components and, thanks to standardization, parts from different manufacturers can be used in combination with each other. The introduction of BioBrick standards for the assembly of characterized DNA sequences was a landmark in microbial engineering, shaping the field of synthetic biology. Here, we describe a standard for Type IIS restriction endonuclease-mediated assembly, defining a common syntax of 12 fusion sites to enable the facile assembly of eukaryotic transcriptional units. This standard has been developed and agreed by representatives and leaders of the international plant science and synthetic biology communities, including inventors, developers and adopters of Type IIS cloning methods. Our vision is of an extensive catalogue of standardized, characterized DNA parts that will accelerate plant bioengineering. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Disentangling the many layers of eukaryotic transcriptional regulation.

PubMed

Lelli, Katherine M; Slattery, Matthew; Mann, Richard S

2012-01-01

Regulation of gene expression in eukaryotes is an extremely complex process. In this review, we break down several critical steps, emphasizing new data and techniques that have expanded current gene regulatory models. We begin at the level of DNA sequence where cis-regulatory modules (CRMs) provide important regulatory information in the form of transcription factor (TF) binding sites. In this respect, CRMs function as instructional platforms for the assembly of gene regulatory complexes. We discuss multiple mechanisms controlling complex assembly, including cooperative DNA binding, combinatorial codes, and CRM architecture. The second section of this review places CRM assembly in the context of nucleosomes and condensed chromatin. We discuss how DNA accessibility and histone modifications contribute to TF function. Lastly, new advances in chromosomal mapping techniques have provided increased understanding of intra- and interchromosomal interactions. We discuss how these topological maps influence gene regulatory models.
Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique

PubMed Central

Li, Chaozheng; Weng, Shaoping; Chen, Yonggui; Yu, Xiaoqiang; Lü, Ling; Zhang, Haiqing; He, Jianguo; Xu, Xiaopeng

2012-01-01

Background: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The now available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. Methodology/Principal Findings: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using the BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8,171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. Conclusions/Significance: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei. PMID:23071809
Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices

PubMed Central

Yan, Hao; LaBean, Thomas H.; Feng, Liping; Reif, John H.

2003-01-01

The programmed self-assembly of patterned aperiodic molecular structures is a major challenge in nanotechnology and has numerous potential applications for nanofabrication of complex structures and useful devices. Here we report the construction of an aperiodic patterned DNA lattice (barcode lattice) by a self-assembly process of directed nucleation of DNA tiles around a scaffold DNA strand. The input DNA scaffold strand, constructed by ligation of shorter synthetic oligonucleotides, provides layers of the DNA lattice with barcode patterning information represented by the presence or absence of DNA hairpin loops protruding out of the lattice plane. Self-assembly of multiple DNA tiles around the scaffold strand was shown to result in a patterned lattice containing barcode information of 01101. We have also demonstrated the reprogramming of the system to another patterning. An inverted barcode pattern of 10010 was achieved by modifying the scaffold strands and one of the strands composing each tile. A ribbon lattice, consisting of repetitions of the barcode pattern with expected periodicity, was also constructed by the addition of sticky ends. The patterning of both classes of lattices was clearly observable via atomic force microscopy. These results represent a step toward implementation of a visual readout system capable of converting information encoded on a 1D DNA strand into a 2D form readable by advanced microscopic techniques. A functioning visual output method would not only increase the readout speed of DNA-based computers, but may also find use in other sequence identification techniques such as mutation or allele mapping. PMID:12821776
Self-assembly of multiferroic core-shell particulate nanocomposites through DNA-DNA hybridization and magnetic field directed assembly of superstructures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sreenivasulu, Gollapudi; Srinivasan, Gopalan; Lochbiler, Thomas A.

Multiferroic composites of ferromagnetic and ferroelectric phases are of importance for studies on mechanical strain mediated coupling between the magnetic and electric subsystems. This work is on DNA-assisted self-assembly of superstructures of such composites with nanometer periodicity. The synthesis involved oligomeric DNA-functionalized ferroelectric and ferromagnetic nanoparticles, 600 nm BaTiO₃ (BTO) and 200 nm NiFe₂O₄ (NFO), respectively. Mixing BTO and NFO particles, possessing complementary DNA sequences, resulted in the formation of ordered core-shell heteronanocomposites held together by DNA hybridization. The composites were imaged by scanning electron microscopy and scanning microwave microscopy. The presence of heteroassemblies along with core-shell architecture is clearly observed. The reversible nature of the DNA hybridization allows for restructuring the composites into mm-long linear chains and 2D-arrays in the presence of a static magnetic field and ring-like structures in a rotating-magnetic field. Strong magneto-electric (ME) coupling in as-assembled composites is evident from static magnetic field H induced polarization and low-frequency magnetoelectric voltage coefficient measurements. Upon annealing the nanocomposites at high temperatures, evidence for the formation of bulk composites with excellent cross-coupling between the electric and magnetic subsystems is obtained by H-induced polarization and low-frequency ME voltage coefficient. The ME coupling strength in the self-assembled composites is measured to be much stronger than in bulk composites with randomly distributed NFO and BTO prepared by direct mixing and sintering.
Selection and Screening of DNA Aptamers for Inorganic Nanomaterials.

PubMed

Zhou, Yibo; Huang, Zhicheng; Yang, Ronghua; Liu, Juewen

2018-02-21

Searching for DNA sequences that can strongly and selectively bind to inorganic surfaces is a long-standing topic in bionanotechnology, analytical chemistry and biointerface research. This can be achieved either by aptamer selection starting with a very large library of ≈10¹⁴ random DNA sequences, or by careful screening of a much smaller library (usually from a few to a few hundred) with rationally designed sequences. Unlike typical molecular targets, inorganic surfaces often have quite strong DNA adsorption affinities due to polyvalent binding and even chemical interactions. This leads to a very high background binding making aptamer selection difficult. Screening, on the other hand, can be designed to compare relative binding affinities of different DNA sequences and could be more appropriate for inorganic surfaces. The resulting sequences have been used for DNA-directed assembly, sorting of carbon nanotubes, and DNA-controlled growth of inorganic nanomaterials. It was recently discovered that poly-cytosine (C) DNA can strongly bind to a diverse range of nanomaterials including nanocarbons (graphene oxide and carbon nanotubes), various metal oxides and transition-metal dichalcogenides. In this Concept article, we articulate the need for screening and potential artifacts associated with traditional aptamer selection methods for inorganic surfaces. Representative examples of application are discussed, and a few future research opportunities are proposed towards the end of this article. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mind the gap; seven reasons to close fragmented genome assemblies.

PubMed

Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi

2016-05-01

Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.
Self-Assembled DNA Tetrahedral Scaffolds for the Construction of Electrochemiluminescence Biosensor with Programmable DNA Cyclic Amplification.

PubMed

Feng, Qiu-Mei; Guo, Yue-Hua; Xu, Jing-Juan; Chen, Hong-Yuan

2017-05-24

A novel DNA tetrahedron-structured electrochemiluminescence (ECL) platform for bioanalysis with programmable DNA cyclic amplification was developed. In this work, glucose oxidase (GOD) was labeled to a DNA sequence (S) as functional conjugation (GOD-S), which could hybridize with other DNA sequences (L and P) to form GOD-S:L:P probe. In the presence of target DNA and a helper DNA (A), the programmable DNA cyclic amplification was activated and released GOD-S via toehold-mediated strand displacement. Then, the obtained GOD-S was further immobilized on the DNA tetrahedral scaffolds with a pendant capture DNA and Ru(bpy)₃²⁺-conjugated silica nanoparticles (RuSi NPs) decorated on the electrode surface. Thus, the amount of GOD-S assembled on the electrode surface depended on the concentration of target DNA and GOD could catalyze glucose to generate H₂O₂ in situ. The ECL signal of Ru(bpy)₃²⁺-TPrA system was quenched by the presence of H₂O₂. By integrating the programmable DNA cyclic amplification and in situ generating H₂O₂ as Ru(bpy)₃²⁺ ECL quencher, a sensitive DNA tetrahedron-structured ECL sensing platform was proposed for DNA detection. Under optimized conditions, this biosensor showed a wide linear range from 100 aM to 10 pM with a detection limit of 40 aM, indicating a promising application in DNA analysis. Furthermore, by labeling GOD to different recognition elements, the proposed strategy could be used for the detection of various targets. Thus, this programmable cascade amplification strategy not only retains the high selectivity and good capturing efficiency of tetrahedral-decorated electrode surface but also provides potential applications in the construction of ECL biosensor.
Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

PubMed Central

Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.

2013-01-01

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983

Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.

2005-08-26

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
The Autographa californica Multiple Nucleopolyhedrovirus ac83 Gene Contains a cis-Acting Element That Is Essential for Nucleocapsid Assembly.

PubMed

Huang, Zhihong; Pan, Mengjia; Zhu, Silei; Zhang, Hao; Wu, Wenbi; Yuan, Meijin; Yang, Kai

2017-03-01

Baculoviridae is a family of insect-specific viruses that have a circular double-stranded DNA genome packaged within a rod-shaped capsid. The mechanism of baculovirus nucleocapsid assembly remains unclear. Previous studies have shown that deletion of the ac83 gene of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) blocks viral nucleocapsid assembly. Interestingly, the ac83-encoded protein Ac83 is not a component of the nucleocapsid, implying a particular role for ac83 in nucleocapsid assembly that may be independent of its protein product. To examine this possibility, Ac83 synthesis was disrupted by insertion of a chloramphenicol resistance gene into its coding sequence or by deleting its promoter and translation start codon. Both mutants produced progeny viruses normally, indicating that the Ac83 protein is not required for nucleocapsid assembly. Subsequently, complementation assays showed that the production of progeny viruses required the presence of ac83 in the AcMNPV genome instead of its presence in trans. Therefore, we reasoned that ac83 is involved in nucleocapsid assembly via an internal cis-acting element, which we named the nucleocapsid assembly-essential element (NAE). The NAE was identified to lie within nucleotides 1651 to 1850 of ac83 and had 8 conserved A/T-rich regions. Sequences homologous to the NAE were found only in alphabaculoviruses and have a conserved positional relationship with another essential cis-acting element that was recently identified. The identification of the NAE may help to connect the data of viral cis-acting elements and related proteins in the baculovirus nucleocapsid assembly, which is important for elucidating DNA-protein interaction events during this process.
IMPORTANCE Virus nucleocapsid assembly usually requires specific cis-acting elements in the viral genome for various processes, such as the selection of the viral genome from the cellular nucleic acids, the cleavage of concatemeric viral genome replication intermediates, and the encapsidation of the viral genome into procapsids. In linear DNA viruses, such elements generally locate at the ends of the viral genome; however, most of these elements remain unidentified in circular DNA viruses (including baculovirus) due to their circular genomic conformation. Here, we identified a nucleocapsid assembly-essential element in the AcMNPV (the archetype of baculovirus) genome. This finding provides an important reference for studies of nucleocapsid assembly-related elements in baculoviruses and other circular DNA viruses. Moreover, as most of the previous studies of baculovirus nucleocapsid assembly have been focused on viral proteins, our study provides a novel entry point to investigate this mechanism via cis-acting elements in the viral genome. Copyright © 2017 American Society for Microbiology.
Electrochemical detection of sequence-specific DNA based on formation of G-quadruplex-hemin through continuous hybridization chain reaction.

PubMed

Sun, Xiaofan; Chen, Haohan; Wang, Shuling; Zhang, Yiping; Tian, Yaping; Zhou, Nandi

2018-08-27

A high-sensitive detection of sequence-specific DNA was established based on the formation of G-quadruplex-hemin complex through continuous hybridization chain reaction (HCR). Taking HIV DNA sequence as an example, a capture probe complementary to part of HIV DNA was firstly self-assembled onto the surface of Au electrode. Then a specially designed assistant probe with both terminals complementary to the target DNA and a G-quadruplex-forming sequence in the center was introduced into the detection solution. In the presence of both the target DNA and the assistant probe, the target DNA can be captured on the electrode surface and then a continuous HCR can be conducted due to the mutual recognition of the target DNA and the assistant probe, leading to the formation of a large number of G-quadruplex on the electrode surface. With the help of hemin, a pronounced electrochemical signal can be observed in differential pulse voltammetry (DPV), due to the formation of G-quadruplex-hemin complex. The peak current is linearly related with the logarithm of the concentration of the target DNA in the range from 10 fM to 10 pM. The electrochemical sensor has high selectivity to clearly discriminate single-base mismatched and three-base mismatched sequences from the original HIV DNA sequence. Moreover, the established DNA sensor was challenged by detection of HIV DNA in human serum samples, which showed the low detection limit of 6.3 fM. Thus it has great application prospect in the field of clinical diagnosis and environmental monitoring. Copyright © 2018 Elsevier B.V. All rights reserved.
Analysis of sequence variability in the macronuclear DNA of Paramecium tetraurelia: A somatic view of the germline

PubMed Central

Duret, Laurent; Cohen, Jean; Jubin, Claire; Dessen, Philippe; Goût, Jean-François; Mousset, Sylvain; Aury, Jean-Marc; Jaillon, Olivier; Noël, Benjamin; Arnaiz, Olivier; Bétermier, Mireille; Wincker, Patrick; Meyer, Eric; Sperling, Linda

2008-01-01

Ciliates are the only unicellular eukaryotes known to separate germinal and somatic functions. Diploid but silent micronuclei transmit the genetic information to the next sexual generation. Polyploid macronuclei express the genetic information from a streamlined version of the genome but are replaced at each sexual generation. The macronuclear genome of Paramecium tetraurelia was recently sequenced by a shotgun approach, providing access to the gene repertoire. The 72-Mb assembly represents a consensus sequence for the somatic DNA, which is produced after sexual events by reproducible rearrangements of the zygotic genome involving elimination of repeated sequences, precise excision of unique-copy internal eliminated sequences (IES), and amplification of the cellular genes to high copy number. We report use of the shotgun sequencing data (>10⁶ reads representing 13× coverage of a completely homozygous clone) to evaluate variability in the somatic DNA produced by these developmental genome rearrangements. Although DNA amplification appears uniform, both of the DNA elimination processes produce sequence heterogeneity. The variability that arises from IES excision allowed identification of hundreds of putative new IESs, compared to 42 that were previously known, and revealed cases of erroneous excision of segments of coding sequences. We demonstrate that IESs in coding regions are under selective pressure to introduce premature termination of translation in case of excision failure. PMID:18256234
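The coverage figure in this abstract can be related to read count and read length with the usual approximation, coverage = reads × mean read length / genome size. The sketch below is a back-of-the-envelope check only; it assumes the 13× figure is measured against the 72-Mb assembly and reads the read count as roughly one million (10⁶), and the function names are invented for illustration.

```python
def mean_coverage(n_reads, mean_read_len, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def implied_read_length(coverage, genome_size, n_reads):
    """Mean read length needed to reach a given coverage."""
    return coverage * genome_size / n_reads


# Assumed inputs: 13x coverage, 72-Mb assembly, exactly 10**6 reads.
# Because the abstract reports more than 10**6 reads, this value is an
# upper bound on the implied mean read length.
print(implied_read_length(13, 72_000_000, 1_000_000))
print(mean_coverage(1_000_000, 936, 72_000_000))
```

Under these assumptions the implied mean read length is at most a little under 1 kb, which is in the range expected for shotgun reads of that era.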
Partial bisulfite conversion for unique template sequencing.

PubMed

Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael; Levy, Dan

2018-01-25

We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
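The clustering step described above can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: it assumes reads are already aligned to the same reference window, considers only C-to-T conversions on one strand, and groups reads by exact signature match, ignoring sequencing error. The function names are hypothetical.

```python
from collections import defaultdict


def conversion_signature(reference, read):
    """Positions where the reference has C and the read shows T."""
    return tuple(i for i, (r, b) in enumerate(zip(reference, read))
                 if r == "C" and b == "T")


def cluster_by_signature(reference, reads):
    """Group reads sharing an identical conversion pattern.

    Each group is taken to derive from one initial template molecule
    (an assumption that holds only for error-free, fully overlapping reads).
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[conversion_signature(reference, read)].append(read)
    return dict(clusters)


reference = "ACCGTCAC"
reads = ["ATCGTCAC", "ATCGTCAC", "ACCGTTAT"]
clusters = cluster_by_signature(reference, reads)
print(len(clusters), "inferred templates")
```

In real data, reads overlap only partially and carry errors, so signatures would need to be compared with some tolerance and chained across overlaps to achieve long-range assembly.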
Next-generation sequencing of mixed genomic DNA allows efficient assembly of rearranged mitochondrial genomes in Amolops chunganensis and Quasipaa boulengeri

PubMed Central

Yuan, Siqi; Zheng, Yuchi; Zeng, Xiaomao

2016-01-01

Recent improvements in next-generation sequencing (NGS) technologies can facilitate the obtainment of mitochondrial genomes. However, it is not clear whether NGS could be effectively used to reconstruct the mitogenome with high gene rearrangement. These high rearrangements would cause amplification failure, and/or assembly and alignment errors. Here, we choose two frogs with rearranged gene order, Amolops chunganensis and Quasipaa boulengeri, to test whether gene rearrangements affect the mitogenome assembly and alignment by using NGS. The mitogenomes with gene rearrangements are sequenced through Illumina MiSeq genomic sequencing and assembled effectively by Trinity v2.1.0 and SOAPdenovo2. Gene order and contents in the mitogenome of A. chunganensis and Q. boulengeri are typical neobatrachian pattern except for rearrangements at the position of “WANCY” tRNA genes cluster. Further, the mitogenome of Q. boulengeri is characterized with a tandem duplication of trnM. Moreover, we utilize 13 protein-coding genes of A. chunganensis, Q. boulengeri and other neobatrachians to reconstruct the phylogenetic tree for evaluating mitochondrial sequence authenticity of A. chunganensis and Q. boulengeri. In this work, we provide nearly complete mitochondrial genomes of A. chunganensis and Q. boulengeri. PMID:27994980
Population structure of pigs determined by single nucleotide polymorphisms observed in assembled expressed sequence tags.

PubMed

Matsumoto, Toshimi; Okumura, Naohiko; Uenishi, Hirohide; Hayashi, Takeshi; Hamasima, Noriyuki; Awata, Takashi

2012-01-01

We have collected more than 190000 porcine expressed sequence tags (ESTs) from full-length complementary DNA (cDNA) libraries and identified more than 2800 single nucleotide polymorphisms (SNPs). In this study, we tentatively chose 222 SNPs observed in assembled ESTs to study pigs of different breeds; 104 were selected by comparing the cDNA sequences of a Meishan pig and samples of three-way cross pigs (Landrace, Large White, and Duroc: LWD), and 118 were selected from LWD samples. To evaluate the genetic variation between the chosen SNPs from pig breeds, we determined the genotypes for 192 pig samples (11 pig groups) from our DNA reference panel with matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Of the 222 reference SNPs, 186 were successfully genotyped. A neighbor-joining tree showed that the pig groups were classified into two large clusters, namely, Euro-American and East Asian pig populations. F-statistics and the analysis of molecular variance of Euro-American pig groups revealed that approximately 25% of the genetic variations occurred because of intergroup differences. As the F(IS) values were less than the F(ST) values, the clustering, based on the Bayesian inference, implied that there was strong genetic differentiation among pig groups and less divergence within the groups in our samples. © 2011 The Authors. Animal Science Journal © 2011 Japanese Society of Animal Science.
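To make the F-statistic reading concrete, the sketch below computes a simple F(ST) for one biallelic SNP as (H_T − H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S the mean expected heterozygosity within groups. This is a textbook-style estimator with equal group weights, not necessarily the one used in the study, and the allele frequencies are made up for illustration.

```python
def fst_biallelic(freqs):
    """F_ST for one biallelic SNP from per-group allele frequencies.

    Assumes equal group sizes; freqs are hypothetical inputs.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)                      # pooled heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # within-group mean
    if h_t == 0:
        return 0.0  # monomorphic site: no variation to partition
    return (h_t - h_s) / h_t


print(fst_biallelic([0.25, 0.75]))
print(fst_biallelic([0.5, 0.5]))
```

An F(ST) of 0.25 means a quarter of the expected heterozygosity is attributable to differences between groups, which is the sense in which the abstract reports about 25% of variation arising from intergroup differences.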
A simple, rapid, high-fidelity and cost-effective PCR-based two-step DNA synthesis method for long gene sequences.

PubMed

Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi

2004-07-07

Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
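The first step of the method, covering a roughly 500 bp fragment with 60mer oligonucleotides that overlap by 20 bp, can be sketched as a simple tiling. The code below is an illustrative assumption about how such oligos might be laid out (alternating sense and antisense strands, as is common for PCR-based assembly); it is not the authors' design procedure and does not check melting temperature or secondary structure.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def tile_oligos(seq, oligo_len=60, overlap=20):
    """Split seq into overlapping oligos on alternating strands.

    Windows advance by (oligo_len - overlap); the last window may be
    shorter if the sequence length does not fit the tiling exactly.
    """
    if overlap >= oligo_len:
        raise ValueError("overlap must be smaller than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        window = seq[start:start + oligo_len]
        # Even-numbered oligos are sense strand, odd-numbered antisense.
        if len(oligos) % 2 == 0:
            oligos.append(window)
        else:
            oligos.append(reverse_complement(window))
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


fragment = ("ACGTTGCA" * 63)[:500]  # hypothetical 500 bp fragment
oligos = tile_oligos(fragment)
print(len(oligos), "oligos of length", len(oligos[0]))
```

With these parameters, twelve 60mers span 12 × 60 − 11 × 20 = 500 bp and ten span 420 bp, consistent with the "ten to twelve" oligos and approximately 500 bp fragments described in the abstract.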
Extraction of High Molecular Weight DNA from Fungal Rust Spores for Long Read Sequencing.

PubMed

Schwessinger, Benjamin; Rathjen, John P

2017-01-01

Wheat rust fungi are complex organisms with a complete life cycle that involves two different host plants and five different spore types. During the asexual infection cycle on wheat, rusts produce massive amounts of dikaryotic urediniospores. These spores are dikaryotic (two nuclei) with each nucleus containing one haploid genome. This dikaryotic state is likely to contribute to their evolutionary success, making them some of the major wheat pathogens globally. Despite this, most published wheat rust genomes are highly fragmented and contain very little haplotype-specific sequence information. Current long-read sequencing technologies hold great promise to provide more contiguous and haplotype-phased genome assemblies. Long reads are able to span repetitive regions and phase structural differences between the haplomes. This increased genome resolution enables the identification of complex loci and the study of genome evolution beyond simple nucleotide polymorphisms. Long-read technologies require pure high molecular weight DNA as an input for sequencing. Here, we describe a DNA extraction protocol for rust spores that yields pure double-stranded DNA molecules with molecular weight of >50 kilo-base pairs (kbp). The isolated DNA is of sufficient purity for PacBio long-read sequencing, but may require additional purification for other sequencing technologies such as Nanopore and 10× Genomics.
The genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos)

Treesearch

Jacqueline M. Doyle; Todd E. Katzner; Peter H. Bloom; Yanzhu Ji; Bhagya K. Wijayawardena; J. Andrew DeWoody; Ludovic Orlando

2014-01-01

Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male...
Biotechnological mass production of DNA origami

NASA Astrophysics Data System (ADS)

Praetorius, Florian; Kick, Benjamin; Behler, Karl L.; Honemann, Maximilian N.; Weuster-Botz, Dirk; Dietz, Hendrik

2017-12-01

DNA nanotechnology, in particular DNA origami, enables the bottom-up self-assembly of micrometre-scale, three-dimensional structures with nanometre-precise features. These structures are customizable in that they can be site-specifically functionalized or constructed to exhibit machine-like or logic-gating behaviour. Their use has been limited to applications that require only small amounts of material (of the order of micrograms), owing to the limitations of current production methods. But many proposed applications, for example as therapeutic agents or in complex materials, could be realized if more material could be used. In DNA origami, a nanostructure is assembled from a very long single-stranded scaffold molecule held in place by many short single-stranded staple oligonucleotides. Only the bacteriophage-derived scaffold molecules are amenable to scalable and efficient mass production; the shorter staple strands are obtained through costly solid-phase synthesis or enzymatic processes. Here we show that single strands of DNA of virtually arbitrary length and with virtually arbitrary sequences can be produced in a scalable and cost-efficient manner by using bacteriophages to generate single-stranded precursor DNA that contains target strand sequences interleaved with self-excising ‘cassettes’, with each cassette comprising two Zn2+-dependent DNA-cleaving DNA enzymes. We produce all of the necessary single strands of DNA for several DNA origami using shaker-flask cultures, and demonstrate end-to-end production of macroscopic amounts of a DNA origami nanorod in a litre-scale stirred-tank bioreactor. Our method is compatible with existing DNA origami design frameworks and retains the modularity and addressability of DNA origami objects that are necessary for implementing custom modifications using functional groups. 
With all of the production and purification steps amenable to scaling, we expect that our method will expand the scope of DNA nanotechnology in many areas of science and technology.
Biotechnological mass production of DNA origami.

PubMed

Praetorius, Florian; Kick, Benjamin; Behler, Karl L; Honemann, Maximilian N; Weuster-Botz, Dirk; Dietz, Hendrik

2017-12-06

DNA nanotechnology, in particular DNA origami, enables the bottom-up self-assembly of micrometre-scale, three-dimensional structures with nanometre-precise features. These structures are customizable in that they can be site-specifically functionalized or constructed to exhibit machine-like or logic-gating behaviour. Their use has been limited to applications that require only small amounts of material (of the order of micrograms), owing to the limitations of current production methods. But many proposed applications, for example as therapeutic agents or in complex materials, could be realized if more material could be used. In DNA origami, a nanostructure is assembled from a very long single-stranded scaffold molecule held in place by many short single-stranded staple oligonucleotides. Only the bacteriophage-derived scaffold molecules are amenable to scalable and efficient mass production; the shorter staple strands are obtained through costly solid-phase synthesis or enzymatic processes. Here we show that single strands of DNA of virtually arbitrary length and with virtually arbitrary sequences can be produced in a scalable and cost-efficient manner by using bacteriophages to generate single-stranded precursor DNA that contains target strand sequences interleaved with self-excising 'cassettes', with each cassette comprising two Zn2+-dependent DNA-cleaving DNA enzymes. We produce all of the necessary single strands of DNA for several DNA origami using shaker-flask cultures, and demonstrate end-to-end production of macroscopic amounts of a DNA origami nanorod in a litre-scale stirred-tank bioreactor. Our method is compatible with existing DNA origami design frameworks and retains the modularity and addressability of DNA origami objects that are necessary for implementing custom modifications using functional groups.
With all of the production and purification steps amenable to scaling, we expect that our method will expand the scope of DNA nanotechnology in many areas of science and technology.
Self-assembly of proglycinin and hybrid proglycinin synthesized in vitro from cDNA

PubMed Central

Dickinson, Craig D.; Floener, Liliane A.; Lilley, Glenn G.; Nielsen, Niels C.

1987-01-01

An in vitro system was developed that results in the self-assembly of subunit precursors into complexes that resemble those found naturally in the endoplasmic reticulum. Subunits of glycinin, the predominant seed protein of soybeans, were synthesized from modified cDNAs using a combination of the SP6 transcription and the rabbit reticulocyte translation systems. Subunits produced from plasmid constructions that encoded either Gy4 or Gy5 gene products, but modified such that their signal sequences were absent, self-assembled into trimers equivalent in size to those precursors found in the endoplasmic reticulum. In contrast, proteins synthesized in vitro from Gy4 constructs failed to self-assemble when the signal sequence was left intact (e.g., preproglycinin) or when the coding sequence was modified to remove 27 amino acids from an internal hydrophobic region, which is highly conserved among the glycinin subunits. Various hybrid subunits were also produced by trading portions of Gy4 and Gy5 cDNAs and all self-assembled in our system. The in vitro assembly system provides an opportunity to study the self-assembly of precursors and to probe for regions important for assembly. It will also be helpful in attempts to engineer beneficial nutritional changes into this important food protein. PMID:16593868
A pipeline for the de novo assembly of the Themira biloba (Sepsidae: Diptera) transcriptome using a multiple k-mer length approach.

PubMed

Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H

2014-03-12

The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
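As a rough illustration of the first step in a multiple k-mer length approach, the sketch below builds a k-mer spectrum from the same reads at several values of k. It is a toy under stated assumptions, not the authors' pipeline: the reads, the chosen k values, and the function names (`kmers`, `kmer_spectrum`, `multi_k_spectra`) are all hypothetical, and the later meta-assembly step is not modelled.

```python
from collections import Counter

def kmers(seq, k):
    """Yield every substring of length k in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def kmer_spectrum(reads, k):
    """Count how often each k-mer occurs across all reads for one k."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts

def multi_k_spectra(reads, ks):
    """Build one k-mer spectrum per k; an assembler would be run once per k."""
    return {k: kmer_spectrum(reads, k) for k in ks}

# Hypothetical toy reads, not data from the study.
reads = ["ACGTACGTGA", "CGTGACCTAG", "GACCTAGGTT"]
spectra = multi_k_spectra(reads, ks=(3, 5, 7))

for k, counts in spectra.items():
    shared = sum(1 for c in counts.values() if c > 1)
    print(f"k={k}: {len(counts)} distinct k-mers, {shared} seen more than once")
```

Short k-mers tend to connect more reads but are more often repeated, while long k-mers are more specific but need more overlap; that trade-off is the usual motivation for assembling at several k values and merging the results.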
Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome

PubMed Central

Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László

2014-01-01

As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread in both the genome and the transcriptome and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. The FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of the Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes.

PubMed

Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E

2008-01-15

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
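The read-rate comparison quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (about 650 bases in about 70 min for capillary systems, 600 bases in 6.5 min on the microchip); the function name `bases_per_minute` is introduced here for illustration.

```python
def bases_per_minute(bases, minutes):
    """Average sequencing rate for a single read, in bases per minute."""
    return bases / minutes

# Figures as quoted in the abstract above.
capillary = bases_per_minute(650, 70)    # capillary electrophoresis
microchip = bases_per_minute(600, 6.5)   # microchip electrophoresis

speedup = microchip / capillary
print(f"capillary: {capillary:.1f} bases/min")
print(f"microchip: {microchip:.1f} bases/min")
print(f"per-read speedup: {speedup:.1f}x")
```

On these numbers the microchip delivers bases roughly ten times faster per read than the capillary figure quoted, at a slightly shorter read length.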
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes

PubMed Central

Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.

2008-01-01

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

PubMed

Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B

2013-01-01

A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.
Re-entrant DNA gels

PubMed Central

Bomboi, Francesca; Romano, Flavio; Leo, Manuela; Fernandez-Castanon, Javier; Cerbino, Roberto; Bellini, Tommaso; Bordi, Federico; Filetici, Patrizia; Sciortino, Francesco

2016-01-01

DNA is acquiring a primary role in material development, self-assembling by design into complex supramolecular aggregates, the building block of a new-materials world. Using DNA nanoconstructs to translate sophisticated theoretical intuitions into experimental realizations by closely matching idealized models of colloidal particles is a much less explored avenue. Here we experimentally show that an appropriate selection of competing interactions enciphered in multiple DNA sequences results in the successful design of a one-pot DNA hydrogel that melts both on heating and on cooling. The relaxation time, measured by light scattering, slows down dramatically in a limited window of temperatures. The phase diagram displays a peculiar re-entrant shape, the hallmark of the competition between different bonding patterns. Our study shows that it is possible to rationally design biocompatible bulk materials with unconventional phase diagrams and tuneable properties by encoding into DNA sequences both the particle shape and the physics of the collective response. PMID:27767029
Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

PubMed

Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

2016-11-01

Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5% of the A. thaliana genome. IGS are rapidly evolving sequences, and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which makes it possible to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.

Automatic Tool for Local Assembly Structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Whole community shotgun sequencing of total DNA (i.e. metagenomics) and total RNA (i.e. metatranscriptomics) has provided a wealth of information on microbial community structure, predicted functions, and metabolic networks, and can even reconstruct complete genomes directly. Here we present ATLAS (Automatic Tool for Local Assembly Structures), a comprehensive pipeline for assembly, annotation, and genomic binning of metagenomic and metatranscriptomic data with an integrated framework for Multi-Omics. This will provide an open source tool for the Multi-Omic community at large.
Binding branched and linear DNA structures: From isolated clusters to fully bonded gels

NASA Astrophysics Data System (ADS)

Fernandez-Castanon, J.; Bomboi, F.; Sciortino, F.

2018-01-01

The proper design of DNA sequences allows for the formation of well-defined supramolecular units with controlled interactions via a consecution of self-assembling processes. Here, we benefit from the controlled DNA self-assembly to experimentally realize particles with well-defined valence, namely, tetravalent nanostars (A) and bivalent chains (B). We specifically focus on the case in which A particles can only bind to B particles, via appropriately designed sticky-end sequences. Hence AA and BB bonds are not allowed. Such a binary mixture system reproduces with DNA-based particles the physics of poly-functional condensation, with an exquisite control over the bonding process, tuned by the ratio, r, between B and A units and by the temperature, T. We report dynamic light scattering experiments in a window of Ts ranging from 10 °C to 55 °C and an interval of r around the percolation transition to quantify the decay of the density correlation for the different cases. At low T, when all possible bonds are formed, the system behaves as a fully bonded network, as a percolating gel, and as a cluster fluid depending on the selected r.
Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

PubMed Central

Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

2013-01-01

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
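The idea behind deconvolution can be sketched with a deliberately simple pooling design. In the toy below, each clone is placed in a distinct pair of pools (its signature), and a read is assigned to every clone whose signature is contained in the set of pools where the read was observed. This is an assumption-laden illustration, not the design or software used in the study: the pair-of-pools scheme, the clone names, and the functions `assign_signatures` and `deconvolve` are all hypothetical.

```python
from itertools import combinations

def assign_signatures(clones, n_pools, pools_per_clone=2):
    """Give each clone a distinct set of pools (its signature)."""
    signatures = combinations(range(n_pools), pools_per_clone)
    assignment = {clone: frozenset(sig) for clone, sig in zip(clones, signatures)}
    if len(assignment) < len(clones):
        raise ValueError("not enough pools to give every clone a unique signature")
    return assignment

def deconvolve(observed_pools, assignment):
    """Return the clones whose signature lies inside the observed pool set."""
    observed = frozenset(observed_pools)
    return [clone for clone, sig in assignment.items() if sig <= observed]

# Hypothetical clone names; 4 pools give 6 distinct pairs.
clones = ["BAC_A", "BAC_B", "BAC_C", "BAC_D", "BAC_E", "BAC_F"]
assignment = assign_signatures(clones, n_pools=4)

# A read seen only in pools 0 and 2 matches a single signature.
print(deconvolve({0, 2}, assignment))

# A read seen in pools 0, 1 and 2 matches several signatures, so it is ambiguous.
print(deconvolve({0, 1, 2}, assignment))
```

Six clones are sequenced here using four pooled libraries instead of six barcoded ones; the price is that reads shared between overlapping clones can match more than one signature, which is why a practical design has to control how signatures overlap.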
Combinatorial pooling enables selective sequencing of the barley gene space.

PubMed

Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

2013-04-01

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure

PubMed Central

Coyne, Robert S; Thiagarajan, Mathangi; Jones, Kristie M; Wortman, Jennifer R; Tallon, Luke J; Haas, Brian J; Cassidy-Hanley, Donna M; Wiley, Emily A; Smith, Joshua J; Collins, Kathleen; Lee, Suzanne R; Couvillion, Mary T; Liu, Yifan; Garg, Jyoti; Pearlman, Ronald E; Hamilton, Eileen P; Orias, Eduardo; Eisen, Jonathan A; Methé, Barbara A

2008-01-01

Background: Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing. Results: We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified. Conclusion: We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes. PMID:19036158
DNA biosensor for detection of Salmonella typhi from blood sample of typhoid fever patient using gold electrode modified by self-assembled monolayers of thiols

NASA Astrophysics Data System (ADS)

Suryapratiwi, Windha Novita; Paat, Vlagia Indira; Gaffar, Shabarni; Hartati, Yeni Wahyuni

2017-05-01

Electrochemical biosensors are currently being developed to address various clinical problems in diagnosing infectious diseases caused by pathogenic bacteria or viruses. In this research, a voltammetric DNA biosensor using a gold electrode modified with self-assembled monolayers of thiols was developed to detect a specific sequence of Salmonella typhi DNA in a blood sample from a typhoid fever patient. Thiol groups of cysteamine (Cys) and aldehyde groups of glutaraldehyde (Glu) were used as linkers to increase the performance of the gold electrode in detecting the guanine oxidation signal of hybridized S. typhi DNA and the ssDNA probe. A standard calibration method was used to determine analytical parameters from the measurements. The results showed that S. typhi DNA in a blood sample from a typhoid fever patient can be detected by voltammetry using a gold electrode modified with self-assembled monolayers of thiols. A characteristic oxidation potential of guanine using Au/Cys/Glu was obtained between +0.17 and +0.20 V. The limit of detection and limit of quantification from these measurements were 1.91 μg mL-1 and 6.35 μg mL-1, respectively. The concentration of complementary DNA in the sample was 6.96 μg mL-1.
Blood from a turnip: tissue origin of low-coverage shotgun sequencing libraries affects recovery of mitogenome sequences

USGS Publications Warehouse

Barker, F. Keith; Oyler-McCance, Sara; Tomback, Diana F.

2015-01-01

Next generation sequencing methods allow rapid, economical accumulation of data that have many applications, even at relatively low levels of genome coverage. However, the utility of shotgun sequencing data sets for specific goals may vary depending on the biological nature of the samples sequenced. We show that the ability to assemble mitogenomes from three avian samples of two different tissue types varies widely. In particular, data with coverage typical of microsatellite development efforts (∼1×) from DNA extracted from avian blood failed to cover even 50% of the mitogenome, relative to at least 500-fold coverage from muscle-derived data. Researchers should consider possible applications of their data and select the tissue source for their work accordingly. Practitioners analyzing low-coverage shotgun sequencing data (including for microsatellite locus development) should consider the potential benefits of mitogenome assembly, including internal barcode verification of species identity, mitochondrial primer development, and phylogenetics.
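The coverage figures in this abstract follow the usual depth-of-coverage relation, C = N × L / G (reads times read length over target length). The sketch below applies it to a mitogenome, treating the fraction of reads that are mitochondrial as an input. All numbers are hypothetical placeholders chosen for illustration, not values from the study, and the function names are introduced here.

```python
def mean_coverage(n_reads, read_length, target_length):
    """Mean depth of coverage: total sequenced bases divided by target length."""
    return n_reads * read_length / target_length

def mitogenome_coverage(n_reads, read_length, mito_fraction, mito_length):
    """Expected mitogenome depth when only a fraction of reads are mitochondrial."""
    return mean_coverage(n_reads * mito_fraction, read_length, mito_length)

# Hypothetical run: one million 100-bp reads, 17 kb mitogenome.
n_reads, read_length, mito_length = 1_000_000, 100, 17_000

# Hypothetical mitochondrial read fractions for two library types.
for label, fraction in [("low-mito library", 0.0001), ("high-mito library", 0.1)]:
    depth = mitogenome_coverage(n_reads, read_length, fraction, mito_length)
    print(f"{label}: about {depth:.1f}x mitogenome coverage")
```

Because depth scales linearly with the mitochondrial read fraction, two libraries with the same total read count can differ by orders of magnitude in mitogenome coverage, which is the kind of contrast the abstract reports between tissue types.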
Genome sequencing in microfabricated high-density picolitre reactors.

PubMed

Margulies, Marcel; Egholm, Michael; Altman, William E; Attiya, Said; Bader, Joel S; Bemben, Lisa A; Berka, Jan; Braverman, Michael S; Chen, Yi-Ju; Chen, Zhoutao; Dewell, Scott B; Du, Lei; Fierro, Joseph M; Gomes, Xavier V; Godwin, Brian C; He, Wen; Helgesen, Scott; Ho, Chun Heen; Ho, Chun He; Irzyk, Gerard P; Jando, Szilveszter C; Alenquer, Maria L I; Jarvie, Thomas P; Jirage, Kshama B; Kim, Jong-Bum; Knight, James R; Lanza, Janna R; Leamon, John H; Lefkowitz, Steven M; Lei, Ming; Li, Jing; Lohman, Kenton L; Lu, Hong; Makhijani, Vinod B; McDade, Keith E; McKenna, Michael P; Myers, Eugene W; Nickerson, Elizabeth; Nobile, John R; Plant, Ramona; Puc, Bernard P; Ronan, Michael T; Roth, George T; Sarkis, Gary J; Simons, Jan Fredrik; Simpson, John W; Srinivasan, Maithreyan; Tartaro, Karrie R; Tomasz, Alexander; Vogt, Kari A; Volkmer, Greg A; Wang, Shally H; Wang, Yong; Weiner, Michael P; Yu, Pengguang; Begley, Richard F; Rothberg, Jonathan M

2005-09-15

The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.
Materials and methods for stabilizing nanoparticles in salt solutions

DOEpatents

Robinson, David Bruce; Zuckermann, Ronald; Buffleben, George M.

2013-06-11

Sequence-specific polymers are proving to be a powerful approach to assembly and manipulation of matter on the nanometer scale. Ligands that are peptoids, or sequence-specific N-functional glycine oligomers, allow precise and flexible control over the arrangement of binding groups, steric spacers, charge, and other functionality. We have synthesized short peptoids that can prevent the aggregation of gold nanoparticles in high-salt environments including divalent salt, and allow co-adsorption of a single DNA molecule. This degree of precision and versatility is likely to prove essential in bottom-up assembly of nanostructures and in biomedical applications of nanomaterials.
Dynamic DNA-controlled "stop-and-go" assembly of well-defined protein domains on RNA-scaffolded TMV-like nanotubes.

PubMed

Schneider, Angela; Eber, Fabian J; Wenz, Nana L; Altintoprak, Klara; Jeske, Holger; Eiben, Sabine; Wege, Christina

2016-12-01

A DNA-based approach allows external control over the self-assembly process of tobacco mosaic virus (TMV)-like ribonucleoprotein nanotubes: their growth from viral coat protein (CP) subunits on five distinct RNA scaffolds containing the TMV origin of assembly (OAs) could be temporarily blocked by a stopper DNA oligomer hybridized downstream (3') of the OAs. At two upstream (5') sites tested, simple hybridization was not sufficient for stable stalling, which correlates with previous findings on a non-symmetric assembly of TMV. The growth of DNA-arrested particles could be restarted efficiently by displacement of the stopper via its toehold by using a release DNA oligomer, even after storage for twelve days. This novel strategy for growing proteinaceous tubes under tight kinetic and spatial control combines RNA guidance and its site-specific but reversible interruption by DNA blocking elements. As three of the RNA scaffolds contained long heterologous non-TMV sequence portions that included the stopping sites, this method is applicable to all RNAs amenable to TMV CP encapsidation, albeit with variable efficiency most likely depending on the scaffolds' secondary structures. The use of two distinct, selectively addressable CP variants during the serial assembly stages finally enabled an externally configured fabrication of nanotubes with highly defined subdomains. The "stop-and-go" strategy thus might pave the way towards production routines of TMV-like particles with variable aspect ratios from a single RNA scaffold, and of nanotubes with two or even more adjacent protein domains of tightly pre-defined lengths.
High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development.

PubMed

Daccord, Nicolas; Celton, Jean-Marc; Linsmith, Gareth; Becker, Claude; Choisne, Nathalie; Schijlen, Elio; van de Geest, Henri; Bianco, Luca; Micheletti, Diego; Velasco, Riccardo; Di Pierro, Erica Adele; Gouzy, Jérôme; Rees, D Jasper G; Guérif, Philippe; Muranty, Hélène; Durel, Charles-Eric; Laurens, François; Lespinasse, Yves; Gaillard, Sylvain; Aubourg, Sébastien; Quesneville, Hadi; Weigel, Detlef; van de Weg, Eric; Troggio, Michela; Bucher, Etienne

2017-07-01

Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which is thought to be the center of the location where the apple originated, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development.
The genome sequence of sweet cherry (Prunus avium) for use in genomics-assisted breeding.

PubMed

Shirasawa, Kenta; Isuzugawa, Kanji; Ikenaga, Mitsunobu; Saito, Yutaro; Yamamoto, Toshiya; Hirakawa, Hideki; Isobe, Sachiko

2017-10-01

We determined the genome sequence of sweet cherry (Prunus avium) using next-generation sequencing technology. The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the 352.9 Mb sweet cherry genome, as estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes. We predicted 43,349 complete and partial protein-encoding genes. A high-density consensus map with 2,382 loci was constructed using double-digest restriction site-associated DNA sequencing. Comparing the genetic maps of sweet cherry and peach revealed high synteny between the two genomes; thus the scaffolds were integrated into pseudomolecules using map- and synteny-based strategies. Whole-genome resequencing of six modern cultivars found 1,016,866 SNPs and 162,402 insertions/deletions, out of which 0.7% were deleterious. The sequence variants, as well as simple sequence repeats, can be used as DNA markers. The genomic information helps us to identify agronomically important genes and will accelerate genetic studies and breeding programs for sweet cherries. Further information on the genomic sequences and DNA markers is available in DBcherry (http://cherry.kazusa.or.jp (8 May 2017, date last accessed)). © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
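The scaffold N50 reported in this record is a standard contiguity statistic: the length of the scaffold at which the running total, taken over scaffolds sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the cherry assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (hypothetical values, arbitrary units).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

In practice the lengths would be read from the assembly FASTA file; the statistic itself depends only on the list of lengths.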
IHF-independent assembly of the Tn10 strand transfer transpososome: implications for inhibition of disintegration.

PubMed

Stewart, Barry J; Wardle, Simon J; Haniford, David B

2002-08-15

The frequency of DNA transposition in transposition systems that employ a strand transfer step may be significantly affected by the occurrence of a disintegration reaction, a reaction that reverses the strand transfer event. We have asked whether disintegration occurs in the Tn10 transposition system. We show that disintegration substrates (substrates constituting one half of the strand transfer product) are assembled into a transpososome that mimics the strand transfer intermediate. This strand transfer transpososome (STT) does appear to support an intermolecular disintegration reaction, but only at a very low level. Strikingly, assembly of the STT is not dependent on IHF, a host protein that is required for de novo assembly of all previously characterized Tn10 transpososomes. We suggest that disintegration substrates are able to form both transposon end and target type contacts with transposase because of their enhanced conformational flexibility. This probably allows the conformation of DNA within the complex that prevents the destructive disintegration reaction, and is responsible for relaxing the DNA sequence requirements for STT formation relative to other Tn10 transpososomes.
IHF-independent assembly of the Tn10 strand transfer transpososome: implications for inhibition of disintegration

PubMed Central

Stewart, Barry J.; Wardle, Simon J.; Haniford, David B.

2002-01-01

The frequency of DNA transposition in transposition systems that employ a strand transfer step may be significantly affected by the occurrence of a disintegration reaction, a reaction that reverses the strand transfer event. We have asked whether disintegration occurs in the Tn10 transposition system. We show that disintegration substrates (substrates constituting one half of the strand transfer product) are assembled into a transpososome that mimics the strand transfer intermediate. This strand transfer transpososome (STT) does appear to support an intermolecular disintegration reaction, but only at a very low level. Strikingly, assembly of the STT is not dependent on IHF, a host protein that is required for de novo assembly of all previously characterized Tn10 transpososomes. We suggest that disintegration substrates are able to form both transposon end and target type contacts with transposase because of their enhanced conformational flexibility. This probably allows the conformation of DNA within the complex that prevents the destructive disintegration reaction, and is responsible for relaxing the DNA sequence requirements for STT formation relative to other Tn10 transpososomes. PMID:12169640
Genome-wide comparison of medieval and modern Mycobacterium leprae.

PubMed

Schuenemann, Verena J; Singh, Pushpendra; Mendum, Thomas A; Krause-Kyora, Ben; Jäger, Günter; Bos, Kirsten I; Herbig, Alexander; Economou, Christos; Benjak, Andrej; Busso, Philippe; Nebel, Almut; Boldsen, Jesper L; Kjellström, Anna; Wu, Huihai; Stewart, Graham R; Taylor, G Michael; Bauer, Peter; Lee, Oona Y-C; Wu, Houdini H T; Minnikin, David E; Besra, Gurdyal S; Tucker, Katie; Roffey, Simon; Sow, Samba O; Cole, Stewart T; Nieselt, Kay; Krause, Johannes

2013-07-12

Leprosy was endemic in Europe until the Middle Ages. Using DNA array capture, we have obtained genome sequences of Mycobacterium leprae from skeletons of five medieval leprosy cases from the United Kingdom, Sweden, and Denmark. In one case, the DNA was so well preserved that full de novo assembly of the ancient bacterial genome could be achieved through shotgun sequencing alone. The ancient M. leprae sequences were compared with those of 11 modern strains, representing diverse genotypes and geographic origins. The comparisons revealed remarkable genomic conservation during the past 1000 years, a European origin for leprosy in the Americas, and the presence of an M. leprae genotype in medieval Europe now commonly associated with the Middle East. The exceptional preservation of M. leprae biomarkers, both DNA and mycolic acids, in ancient skeletons has major implications for palaeomicrobiology and human pathogen evolution.
Protocols for self-assembly and imaging of DNA nanostructures.

PubMed

Sobey, Thomas L; Simmel, Friedrich C

2011-01-01

Programed molecular structures allow us to research and make use of physical, chemical, and biological effects at the nanoscale. They are an example of the "bottom-up" approach to nanotechnology, with structures forming through self-assembly. DNA is a particularly useful molecule for this purpose, and some of its advantages include parallel (as opposed to serial) assembly, naturally occurring "tools," such as enzymes and proteins for making modifications and attachments, and structural dependence on base sequence. This allows us to develop one, two, and three dimensional structures that are interesting for their fundamental physical and chemical behavior, and for potential applications such as biosensors, medical diagnostics, molecular electronics, and efficient light-harvesting systems. We describe five techniques that allow one to assemble and image such structures: concentration measurement by ultraviolet absorption, titration gel electrophoresis, thermal annealing, fluorescence microscopy, and atomic force microscopy in fluids.
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.

PubMed

Maheshwari, Shamoni; Ishii, Takayoshi; Brown, C Titus; Houben, Andreas; Comai, Luca

2017-03-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.
Making sense of deep sequencing

PubMed Central

Goldman, D.; Domschke, K.

2016-01-01

This review, the first of an occasional series, tries to make sense of the concepts and uses of deep sequencing of polynucleic acids (DNA and RNA). Deep sequencing, synonymous with next-generation sequencing, high-throughput sequencing and massively parallel sequencing, includes whole genome sequencing but is more often and diversely applied to specific parts of the genome captured in different ways, for example the highly expressed portion of the genome known as the exome and portions of the genome that are epigenetically marked either by DNA methylation, the binding of proteins including histones, or that are in different configurations and thus more or less accessible to enzymes that cleave DNA. Deep sequencing of RNA (RNASeq) reverse-transcribed to complementary DNA is invaluable for measuring RNA expression and detecting changes in RNA structure. Important concepts in deep sequencing include the length and depth of sequence reads, mapping and assembly of reads, sequencing error, haplotypes, and the propensity of deep sequencing, as with other types of ‘big data’, to generate large numbers of errors, requiring monitoring for methodologic biases and strategies for replication and validation. Deep sequencing yields a unique genetic fingerprint that can be used to identify a person, and a trove of predictors of genetic medical diseases. Deep sequencing to identify epigenetic events including changes in DNA methylation and RNA expression can reveal the history and impact of environmental exposures. Because of the power of sequencing to identify and deliver biomedically significant information about a person and their blood relatives, it creates ethical dilemmas and practical challenges in research and clinical care, for example the decision and procedures to report incidental findings that will increasingly and frequently be discovered. PMID:24925306
Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.

PubMed

Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang

2016-01-15

Heat shock proteins (Hsps) are involved in a variety of critical biological functions, including protein folding, degradation, translocation, and macromolecule assembly, and act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.
bcgTree: automatized phylogenetic tree building from bacterial core genomes.

PubMed

Ankenbrand, Markus J; Keller, Alexander

2016-10-01

The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis. Here, we describe the workflow of bcgTree and, as a proof-of-concept, its usefulness in resolving the phylogeny of 293 publicly available bacterial strains of the genus Lactobacillus. We also evaluate its performance in both low- and high-level taxonomy test sets. The tool is freely available at GitHub ( https://github.com/iimog/bcgTree ) and our institutional homepage ( http://www.dna-analytics.biozentrum.uni-wuerzburg.de ).

Characterization and chromosomal mapping of the human TFG gene involved in thyroid carcinoma

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mencinger, M.; Panagopoulos, I.; Andreasson, P.

1997-05-01

Homology searches in the Expressed Sequence Tag Database were performed using SPYGQ-rich regions as query sequences to find genes encoding protein regions similar to the N-terminal parts of the sarcoma-associated EWS and FUS proteins. Clone 22911 (T74973), encoding a SPYGQ-rich region in its 5' end, and several other clones that overlapped 22911 were selected. The combined data made it possible to assemble a full-length cDNA sequence. This cDNA sequence is 1677 bp, containing an initiation codon ATG, an open reading frame of 400 amino acids, a poly(A) signal, and a poly(A) tail. We found 100% identity between the 5' part of the consensus sequence and the 598-bp-long sequence named TFG. The TFG sequence is fused to the 3' end of NTRK1, generating the TRK-T3 fusion transcript found in papillary thyroid carcinoma. The cDNA therefore represents the full-length transcript of the TFG gene. TFG was localized to 3q11-q12 by fluorescence in situ hybridization. The 3' and the 5' ends of the TFG cDNA probe hybridized to a 2.2-kb band on Northern blot filters in all tissues examined. 28 refs., 5 figs., 1 tab.
Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

PubMed Central

Vembar, Shruthi Sridhar; Seetin, Matthew; Lambert, Christine; Nattestad, Maria; Schatz, Michael C.; Baybayan, Primo; Scherf, Artur; Smith, Melissa Laird

2016-01-01

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. PMID:27345719
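The (A + T) percentages quoted in this record are base-composition fractions. The sketch below shows one way such a fraction could be computed for a whole sequence and for fixed windows along it, which is how unusually AT-rich stretches such as centromeres might be located. The function names, the window scheme, and the toy sequence are assumptions for illustration and are not the authors' method.

```python
def at_fraction(seq):
    """Fraction of A/T among unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


def windowed_at_fraction(seq, size):
    """Yield (start, AT fraction) for consecutive non-overlapping windows.

    A trailing window shorter than `size` is skipped (a simplifying
    assumption of this sketch).
    """
    for start in range(0, len(seq) - size + 1, size):
        yield start, at_fraction(seq[start:start + size])


toy = "AAAAGGGGATAT"  # hypothetical sequence, not P. falciparum data
print(at_fraction(toy))
print(list(windowed_at_fraction(toy, 4)))
```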
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs.

PubMed

Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M

2017-06-01

The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange events and exploring cellular heterogeneity. Strand-seq is a single-cell sequencing technology that resolves the individual homologs within a cell by restricting sequence analysis to the DNA template strands used during DNA replication. This protocol, which takes up to 4 d to complete, relies on the directionality of DNA, in which each single strand of a DNA molecule is distinguished based on its 5'-3' orientation. Culturing cells in a thymidine analog for one round of cell division labels nascent DNA strands, allowing for their selective removal during genomic library construction. To preserve directionality of template strands, genomic preamplification is bypassed and labeled nascent strands are nicked and not amplified during library preparation. Each single-cell library is multiplexed for pooling and sequencing, and the resulting sequence data are aligned, mapping to either the minus or plus strand of the reference genome, to assign template strand states for each chromosome in the cell. The major adaptations to conventional single-cell sequencing protocols include harvesting of daughter cells after a single round of BrdU incorporation, bypassing of whole-genome amplification, and removal of the BrdU+ strand during Strand-seq library preparation. By sequencing just template strands, the structure and identity of each homolog are preserved.
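The final analysis step described in this record, assigning a template strand state to each chromosome from the proportion of reads aligning to the minus or plus strand, can be sketched as a simple classifier. The labels used here (W for minus-strand reads, C for plus-strand reads), the 0.8 purity threshold, and the minimum read count are assumptions chosen for illustration; the protocol's actual calling criteria are not given in the abstract.

```python
def strand_state(plus_reads, minus_reads, purity=0.8, min_reads=20):
    """Classify one chromosome in one cell from strand-specific read counts.

    Returns "WW" (both homologs inherited as minus-strand templates),
    "CC" (both plus-strand), "WC" (one of each), or "uncalled" when
    there are too few reads. Thresholds are hypothetical.
    """
    total = plus_reads + minus_reads
    if total < min_reads:
        return "uncalled"
    minus_fraction = minus_reads / total
    if minus_fraction >= purity:
        return "WW"
    if minus_fraction <= 1 - purity:
        return "CC"
    return "WC"


# Hypothetical (plus, minus) read counts per chromosome for one cell.
counts = {"chr1": (5, 95), "chr2": (48, 52), "chr3": (90, 10)}
for chrom, (plus, minus) in counts.items():
    print(chrom, strand_state(plus, minus))
```

A real pipeline would take the counts from the aligned reads, for example from the reverse-strand flag of each alignment, and would usually work in bins along each chromosome so that changes of state within a chromosome can be detected.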
Guiding the folding pathway of DNA origami

NASA Astrophysics Data System (ADS)

Dunn, Katherine E.; Dannenberg, Frits; Ouldridge, Thomas E.; Kwiatkowska, Marta; Turberfield, Andrew J.; Bath, Jonathan

2015-09-01

DNA origami is a robust assembly technique that folds a single-stranded DNA template into a target structure by annealing it with hundreds of short 'staple' strands. Its guiding design principle is that the target structure is the single most stable configuration. The folding transition is cooperative and, as in the case of proteins, is governed by information encoded in the polymer sequence. A typical origami folds primarily into the desired shape, but misfolded structures can kinetically trap the system and reduce the yield. Although adjusting assembly conditions or following empirical design rules can improve yield, well-folded origami often need to be separated from misfolded structures. The problem could in principle be avoided if assembly pathway and kinetics were fully understood and then rationally optimized. To this end, here we present a DNA origami system with the unusual property of being able to form a small set of distinguishable and well-folded shapes that represent discrete and approximately degenerate energy minima in a vast folding landscape, thus allowing us to probe the assembly process. The obtained high yield of well-folded origami structures confirms the existence of efficient folding pathways, while the shape distribution provides information about individual trajectories through the folding landscape. We find that, similarly to protein folding, the assembly of DNA origami is highly cooperative; that reversible bond formation is important in recovering from transient misfoldings; and that the early formation of long-range connections can very effectively enforce particular folds. We use these insights to inform the design of the system so as to steer assembly towards desired structures. Expanding the rational design process to include the assembly pathway should thus enable more reproducible synthesis, particularly when targeting more complex structures. We anticipate that this expansion will be essential if DNA origami is to continue its rapid development and become a reliable manufacturing technology.
Guiding the folding pathway of DNA origami.

PubMed

Dunn, Katherine E; Dannenberg, Frits; Ouldridge, Thomas E; Kwiatkowska, Marta; Turberfield, Andrew J; Bath, Jonathan

2015-09-03

DNA origami is a robust assembly technique that folds a single-stranded DNA template into a target structure by annealing it with hundreds of short 'staple' strands. Its guiding design principle is that the target structure is the single most stable configuration. The folding transition is cooperative and, as in the case of proteins, is governed by information encoded in the polymer sequence. A typical origami folds primarily into the desired shape, but misfolded structures can kinetically trap the system and reduce the yield. Although adjusting assembly conditions or following empirical design rules can improve yield, well-folded origami often need to be separated from misfolded structures. The problem could in principle be avoided if assembly pathway and kinetics were fully understood and then rationally optimized. To this end, here we present a DNA origami system with the unusual property of being able to form a small set of distinguishable and well-folded shapes that represent discrete and approximately degenerate energy minima in a vast folding landscape, thus allowing us to probe the assembly process. The obtained high yield of well-folded origami structures confirms the existence of efficient folding pathways, while the shape distribution provides information about individual trajectories through the folding landscape. We find that, similarly to protein folding, the assembly of DNA origami is highly cooperative; that reversible bond formation is important in recovering from transient misfoldings; and that the early formation of long-range connections can very effectively enforce particular folds. We use these insights to inform the design of the system so as to steer assembly towards desired structures. Expanding the rational design process to include the assembly pathway should thus enable more reproducible synthesis, particularly when targeting more complex structures. We anticipate that this expansion will be essential if DNA origami is to continue its rapid development and become a reliable manufacturing technology.
Structure and assembly of the essential RNA ring component of a viral DNA packaging motor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ding, Fang; Lu, Changrui; Zhao, Wei

2011-07-25

Prohead RNA (pRNA) is an essential component in the assembly and operation of the powerful bacteriophage φ29 DNA packaging motor. The pRNA forms a multimeric ring via intermolecular base-pairing interactions between protomers that serves to guide the assembly of the ring ATPase that drives DNA packaging. Here we report the quaternary structure of this rare multimeric RNA at 3.5 Å resolution, crystallized as tetrameric rings. Strong quaternary interactions and the inherent flexibility helped rationalize how free pRNA is able to adopt multiple oligomerization states in solution. These characteristics also allowed excellent fitting of the crystallographic pRNA protomers into previous prohead/pRNA cryo-EM reconstructions, supporting the presence of a pentameric, but not hexameric, pRNA ring in the context of the DNA packaging motor. The pentameric pRNA ring anchors itself directly to the phage prohead by interacting specifically with the fivefold symmetric capsid structures that surround the head-tail connector portal. From these contacts, five RNA superhelices project from the pRNA ring, where they serve as scaffolds for binding and assembly of the ring ATPase, and possibly mediate communication between motor components. Structure-based designer pRNAs with little sequence similarity to the wild-type pRNA were shown to fully support the packaging of φ29 DNA.
The Genome of the Beluga Whale (Delphinapterus leucas)

PubMed Central

Taylor, Gregory A.; Chan, Simon; Warren, René L.; Hammond, S. Austin; Bilobram, Steven; Mordecai, Gideon; Miller, Kristina M.; Schulze, Angela; Chan, Amy M.; Jones, Samantha J.; Tse, Kane; Li, Irene; Cheung, Dorothy; Mungall, Karen L.; Choo, Caleb; Ally, Adrian; Dhalla, Noreen; Tam, Angela K. Y.; Troussard, Armelle; Kirk, Heather; Pandoh, Pawan; Paulino, Daniel; Coope, Robin J. N.; Moore, Richard; Zhao, Yongjun; Birol, Inanc; Ma, Yussanne; Marra, Marco; Haulena, Martin

2017-01-01

The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data was also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A. PMID:29232881
The Genome of the Beluga Whale (Delphinapterus leucas).

PubMed

Jones, Steven J M; Taylor, Gregory A; Chan, Simon; Warren, René L; Hammond, S Austin; Bilobram, Steven; Mordecai, Gideon; Suttle, Curtis A; Miller, Kristina M; Schulze, Angela; Chan, Amy M; Jones, Samantha J; Tse, Kane; Li, Irene; Cheung, Dorothy; Mungall, Karen L; Choo, Caleb; Ally, Adrian; Dhalla, Noreen; Tam, Angela K Y; Troussard, Armelle; Kirk, Heather; Pandoh, Pawan; Paulino, Daniel; Coope, Robin J N; Mungall, Andrew J; Moore, Richard; Zhao, Yongjun; Birol, Inanc; Ma, Yussanne; Marra, Marco; Haulena, Martin

2017-12-11

The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data was also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A.
A fully decompressed synthetic bacteriophage øX174 genome assembled and archived in yeast.

PubMed

Jaschke, Paul R; Lieberman, Erica K; Rodriguez, Jon; Sierra, Adrian; Endy, Drew

2012-12-20

The 5386 nucleotide bacteriophage øX174 genome has a complicated architecture that encodes 11 gene products via overlapping protein coding sequences spanning multiple reading frames. We designed a 6302 nucleotide synthetic surrogate, øX174.1, that fully separates all primary phage protein coding sequences along with cognate translation control elements. To specify øX174.1f, a decompressed genome the same length as wild type, we truncated the gene F coding sequence. We synthesized DNA encoding fragments of øX174.1f and used a combination of in vitro- and yeast-based assembly to produce yeast vectors encoding natural or designer bacteriophage genomes. We isolated clonal preparations of yeast plasmid DNA and transfected E. coli C strains. We recovered viable øX174 particles containing the øX174.1f genome from E. coli C strains that independently express full-length gene F. We expect that yeast can serve as a genomic 'drydock' within which to maintain and manipulate clonal lineages of other obligate lytic phage. Copyright © 2012 Elsevier Inc. All rights reserved.
The DNA-encoded nucleosome organization of a eukaryotic genome.

PubMed

Kaplan, Noam; Moore, Irene K; Fondufe-Mittendorf, Yvonne; Gossett, Andrea J; Tillo, Desiree; Field, Yair; LeProust, Emily M; Hughes, Timothy R; Lieb, Jason D; Widom, Jonathan; Segal, Eran

2009-03-19

Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for approximately 40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.
Carbon nanotube-DNA nanoarchitectures and electronic functionality.

PubMed

Wang, Xu; Liu, Fei; Andavan, G T Senthil; Jing, Xiaoye; Singh, Krishna; Yazdanpanah, Vahid R; Bruque, Nicolas; Pandey, Rajeev R; Lake, Roger; Ozkan, Mihrimah; Wang, Kang L; Ozkan, Cengiz S

2006-11-01

Biological molecules such as deoxyribonucleic acid (DNA) possess inherent recognition and self-assembly capabilities, and are attractive templates for constructing functional hierarchical material structures as building blocks for nanoelectronics. Here we report the assembly and electronic functionality of nanoarchitectures based on conjugates of single-walled carbon nanotubes (SWNTs) functionalized with carboxylic groups and single-stranded DNA (ssDNA) sequences possessing terminal amino groups on both ends, hybridized together through amide linkages by adopting a straightforward synthetic route. Morphological and chemical-functional characterization of the nanoarchitectures are investigated using scanning electron microscopy, transmission electron microscopy, atomic force microscopy, energy-dispersive X-ray spectroscopy, Raman spectroscopy, and Fourier-transform infrared spectroscopy. Electrical measurements (I-V characterization) of the nanoarchitectures demonstrate negative differential resistance in the presence of SWNT/ssDNA interfaces, which indicates a biomimetic route to fabricating resonant tunneling diodes. I-V characterization on platinum-metallized SWNT-ssDNA nanoarchitectures via salt reduction indicates modulation of their electrical properties, with effects ranging from those of a resonant tunneling diode to a resistor, depending on the amount of metallization. Electron transport through the nanoarchitectures has been analyzed by density functional theory calculations. Our studies illustrate the great promise of biomimetic assembly of functional nanosystems based on biotemplated materials and present new avenues toward exciting future opportunities in nanoelectronics and nanobiotechnology.
Biomimetic nanochannels based biosensor for ultrasensitive and label-free detection of nucleic acids.

PubMed

Sun, Zhongyue; Liao, Tangbin; Zhang, Yulin; Shu, Jing; Zhang, Hong; Zhang, Guo-Jun

2016-12-15

A very simple sensing device based on biomimetic nanochannels has been developed for label-free, ultrasensitive and highly sequence-specific detection of DNA. Probe DNA was modified on the inner wall of the nanochannel surface by layer-by-layer (LBL) assembly. After probe DNA immobilization, DNA detection was realized by monitoring the rectified ion current when hybridization occurred. Owing to the three-dimensional (3D) nanoscale environment of the nanochannel, this special geometry dramatically increased the surface area available for immobilization of probe molecules on the inner surface and enlarged the contact area between probes and target molecules. Thus, the unique sensor reached a reliable detection limit of 10 fM for target DNA. In addition, this DNA sensor could discriminate complementary DNA (c-DNA) from non-complementary DNA (nc-DNA), two-base mismatched DNA (2bm-DNA) and one-base mismatched DNA (1bm-DNA) with high specificity. Moreover, the nanochannel-based biosensor was also able to detect target DNA even in an interfering environment and serum samples. This approach will provide a novel biosensing platform for detection and discrimination of disease-related molecular targets and unknown sequence DNA. Copyright © 2016 Elsevier B.V. All rights reserved.
Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes.

PubMed

Papudeshi, Bhavya; Haggerty, J Matthew; Doane, Michael; Morris, Megan M; Walsh, Kevin; Beattie, Douglas T; Pande, Dnyanada; Zaeri, Parisa; Silva, Genivaldo G Z; Thompson, Fabiano; Edwards, Robert A; Dinsdale, Elizabeth A

2017-11-28

Microbiome/host interactions describe characteristics that affect the host's health. Shotgun metagenomics includes sequencing a random subset of the microbiome to analyze its taxonomic and metabolic potential. Reconstruction of DNA fragments into genomes from metagenomes (called metagenome-assembled genomes) assigns unknown fragments to taxa/function and facilitates discovery of novel organisms. Genome reconstruction incorporates sequence assembly and sorting of assembled sequences into bins, characteristic of a genome. However, the microbial community composition, including taxonomic and phylogenetic diversity, may influence genome reconstruction. We determine the optimal reconstruction method for four microbiome projects that had variable sequencing platforms (IonTorrent and Illumina), diversity (high or low), and environment (coral reefs and kelp forests), using a set of parameters to select for optimal assembly and binning tools. We tested the effects of the assembly and binning processes on population genome reconstruction using 105 marine metagenomes from 4 projects. Reconstructed genomes were obtained from each project using 3 assemblers (IDBA, MetaVelvet, and SPAdes) and 2 binning tools (GroopM and MetaBat). We assessed the efficiency of assemblers using statistics including contig continuity and contig chimerism, and the effectiveness of binning tools using genome completeness and taxonomic identification. We concluded that SPAdes assembled more contigs (143,718 ± 124 contigs) of longer length (N50 = 1632 ± 108 bp), and incorporated the most sequences (sequences-assembled = 19.65%). The microbial richness and evenness were maintained across the assembly, suggesting low contig chimeras. SPAdes assembly was responsive to the biological and technological variations within the project, compared with other assemblers.
Among binning tools, we conclude that MetaBat produced bins with less variation in GC content (average standard deviation: 1.49), low species richness (4.91 ± 0.66), and higher genome completeness (40.92 ± 1.75) across all projects. MetaBat extracted 115 bins from the 4 projects, of which 66 bins were identified as reconstructed metagenome-assembled genomes with sequences belonging to a specific genus. We identified 13 novel genomes, some of which were 100% complete but show low similarity to genomes within databases. In conclusion, we present a set of biologically relevant parameters for evaluation to select for optimal assembly and binning tools. For the tools we tested, the SPAdes assembler and the MetaBat binning tool reconstructed quality metagenome-assembled genomes for the four projects. We also conclude that metagenomes from microbial communities that have high coverage of phylogenetically distinct taxa and low taxonomic diversity yield the highest-quality metagenome-assembled genomes.
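The N50 statistic quoted in this and several neighbouring records can be illustrated with a minimal sketch. It uses the common definition (the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size); the function name and the toy lengths are hypothetical and are not taken from any of the cited studies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths in bp, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about misassemblies, which is presumably why the record above pairs it with a chimerism measure.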
Phylogenetic Position of a Copper Age Sheep (Ovis aries) Mitochondrial DNA

PubMed Central

Olivieri, Cristina; Ermini, Luca; Rizzi, Ermanno; Corti, Giorgio; Luciani, Stefania; Marota, Isolina; De Bellis, Gianluca; Rollo, Franco

2012-01-01

Background: Sheep (Ovis aries) were domesticated in the Fertile Crescent region about 9,000-8,000 years ago. Currently, few mitochondrial (mt) DNA studies are available on archaeological sheep. In particular, no data on archaeological European sheep are available. Methodology/Principal Findings: Here we describe the first portion of mtDNA sequence of a Copper Age European sheep. DNA was extracted from hair shafts which were part of the clothes of the so-called Tyrolean Iceman or Ötzi (5,350 - 5,100 years before present). Mitochondrial DNA (a total of 2,429 base pairs, encompassing a portion of the control region, tRNAPhe, a portion of the 12S rRNA gene, and the whole cytochrome B gene) was sequenced using a mixed sequencing procedure based on PCR amplification and 454 sequencing of pooled amplification products. We have compared the sequence with the corresponding sequence of 334 extant lineages. Conclusions/Significance: A phylogenetic network based on a new cladistic notation for the mitochondrial diversity of domestic sheep shows that Ötzi's sheep falls within haplogroup B, thus demonstrating that sheep belonging to this haplogroup were already present in the Alps more than 5,000 years ago. On the other hand, the lineage of Ötzi's sheep is defined by two transitions (16147 and 16440) which, taken together, define a motif that has not yet been identified in modern sheep populations. PMID:22457789
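The abstract characterizes the lineage by two transitions, that is, purine-to-purine or pyrimidine-to-pyrimidine substitutions. The sketch below shows how substitutions between two aligned sequences might be listed and classified; the sequences, coordinates, and function names are hypothetical and do not reproduce the actual bases at positions 16147 and 16440, which the abstract does not give.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "identical"
    # Transitions stay within one chemical class (A<->G or C<->T).
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def list_substitutions(reference, sample, first_position=1):
    """Compare two pre-aligned, equal-length sequences position by position."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    for i, (r, s) in enumerate(zip(reference.upper(), sample.upper())):
        if r != s:
            found.append((first_position + i, r, s, classify_substitution(r, s)))
    return found


# Toy aligned fragments; coordinates start at an arbitrary position 100.
for record in list_substitutions("ACGTAC", "ATGTAA", first_position=100):
    print(record)
```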
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

PubMed Central

2014-01-01

Background: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
Assessing the performance of the Oxford Nanopore Technologies MinION

PubMed Central

Laver, T.; Harrison, J.; O’Neill, P.A.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D.J.

2015-01-01

The Oxford Nanopore Technologies (ONT) MinION is a new sequencing technology that potentially offers read lengths of tens of kilobases (kb) limited only by the length of DNA molecules presented to it. The device has a low capital cost, is by far the most portable DNA sequencer available, and can produce data in real-time. It has numerous prospective applications including improving genome sequence assemblies and resolution of repeat-rich regions. Before such a technology is widely adopted, it is important to assess its performance and limitations in respect of throughput and accuracy. In this study we assessed the performance of the MinION by re-sequencing three bacterial genomes, with very different nucleotide compositions ranging from 28.6% to 70.7%; the high G + C strain was underrepresented in the sequencing reads. We estimate the error rate of the MinION (after base calling) to be 38.2%. Mean and median read lengths were 2 kb and 1 kb respectively, while the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read. As the first nanopore-based single molecule sequencer available to researchers, the MinION is an exciting prospect; however, the current error rate limits its ability to compete with existing sequencing technologies, though we do show that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data. PMID:26753127
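The nucleotide compositions quoted above (28.6% to 70.7%) presumably refer to G + C content, given the mention of a high G + C strain. A minimal sketch of that calculation follows; the function name and the example strings are hypothetical, and ambiguous bases such as N are simply ignored here, which is one of several possible conventions.

```python
def gc_fraction(sequence):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


# Toy sequences, not taken from the genomes in the study.
for name, seq in [("low_gc", "ATATTAGAAT"), ("high_gc", "GCGGCCGATC")]:
    print(name, round(100 * gc_fraction(seq), 1))
```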
Two-Way Gold Nanoparticle Label-Free Sensing of Specific Sequence and Small Molecule Targets Using Switchable Concatemers.

PubMed

Zhu, Longjiao; Shao, Xiangli; Luo, Yunbo; Huang, Kunlung; Xu, Wentao

2017-05-19

A two-way colorimetric biosensor based on unmodified gold nanoparticles (GNPs) and a switchable double-stranded DNA (dsDNA) concatemer has been demonstrated. Two hairpin probes (H1 and H2) were first designed that provided the fuels to assemble the dsDNA concatemers via hybridization chain reaction (HCR). A functional hairpin (FH) was rationally designed to recognize the target sequences. All the hairpins contained a single-stranded DNA (ssDNA) loop and sticky end to prevent GNPs from salt-induced aggregation. In the presence of target sequence, the capture probe blocked in the FH recognizes the target to form a duplex DNA, which causes the release of the initiator probe by FH conformational change. This process then starts the alternate-opening of H1 and H2 through HCR, and dsDNA concatemers grow from the target sequence. As a result, unmodified GNPs undergo salt-induced aggregation because the formed dsDNA concatemers are stiffer and provide less stabilization. A light purple-to-blue color variation was observed in the bulk solution, termed the light-off sensing way. Furthermore, an aptamer sequence was inserted into H1 to generate dsDNA concatemers with multiple small molecule binding sites. In the presence of small molecule targets, concatemers can be disassembled into mixtures with ssDNA sticky ends. A blue-to-purple reverse color variation was observed due to the regeneration of the ssDNA, termed the light-on way. The two-way biosensor can detect both nucleic acids and small molecule targets with one sensing device. This switchable sensing element is label-free, enzyme-free, and sophisticated-instrumentation-free. The detection limits of both targets were below nanomolar.
Enzyme-guided DNA Sewing Architecture

PubMed Central

Song, In Hyun; Shin, Seung Won; Park, Kyung Soo; Lansac, Yves; Jang, Yun Hee; Um, Soong Ho

2015-01-01

With the advent of nanotechnology, a variety of nanoarchitectures with varied physicochemical properties have been designed. Owing to the unique characteristics, DNAs have been used as a functional building block for novel nanoarchitecture. In particular, a self-assembly of long DNA molecules via a piece DNA staple has been utilized to attain such constructs. However, it needs many talented prerequisites (e.g., complicated computer program) with fewer yields of products. In addition, it has many limitations to overcome: for instance, (i) thermal instability under moderate environments and (ii) restraint in size caused by the restricted length of scaffold strands. Alternatively, the enzymatic sewing linkage of short DNA blocks is simply designed into long DNA assemblies but it is more error-prone due to the undeveloped sequence data. Here, we present, for the first time, a comprehensive study for directly combining DNA structures into higher DNA sewing constructs through the 5′-end cohesive ligation of T4 enzyme. Inspired by these achievements, the synthesized DNA nanomaterials were also utilized for effective detection and real-time diagnosis of cancer-specific and cytosolic RNA markers. This generalized protocol for generic DNA sewing is expected to be useful in several DNA nanotechnology as well as any nucleic acid-related fields. PMID:26634810
Enzyme-guided DNA Sewing Architecture

NASA Astrophysics Data System (ADS)

Song, In Hyun; Shin, Seung Won; Park, Kyung Soo; Lansac, Yves; Jang, Yun Hee; Um, Soong Ho

2015-12-01

With the advent of nanotechnology, a variety of nanoarchitectures with varied physicochemical properties have been designed. Owing to the unique characteristics, DNAs have been used as a functional building block for novel nanoarchitecture. In particular, a self-assembly of long DNA molecules via a piece DNA staple has been utilized to attain such constructs. However, it needs many talented prerequisites (e.g., complicated computer program) with fewer yields of products. In addition, it has many limitations to overcome: for instance, (i) thermal instability under moderate environments and (ii) restraint in size caused by the restricted length of scaffold strands. Alternatively, the enzymatic sewing linkage of short DNA blocks is simply designed into long DNA assemblies but it is more error-prone due to the undeveloped sequence data. Here, we present, for the first time, a comprehensive study for directly combining DNA structures into higher DNA sewing constructs through the 5′-end cohesive ligation of T4 enzyme. Inspired by these achievements, the synthesized DNA nanomaterials were also utilized for effective detection and real-time diagnosis of cancer-specific and cytosolic RNA markers. This generalized protocol for generic DNA sewing is expected to be useful in several DNA nanotechnology as well as any nucleic acid-related fields.

A clone-free, single molecule map of the domestic cow (Bos taurus) genome.

PubMed

Zhou, Shiguo; Goldstein, Steve; Place, Michael; Bechner, Michael; Patino, Diego; Potamousis, Konstantinos; Ravindran, Prabu; Pape, Louise; Rincon, Gonzalo; Hernandez-Ortiz, Juan; Medrano, Juan F; Schwartz, David C

2015-08-28

The cattle (Bos taurus) genome was originally selected for sequencing due to its economic importance and unique biology as a model organism for understanding other ruminants, or mammals. Currently, there are two cattle genome sequence assemblies (UMD3.1 and Btau4.6) from groups using dissimilar assembly algorithms, which were complemented by genetic and physical map resources. However, past comparisons between these assemblies revealed substantial differences. Consequently, such discordances have engendered ambiguities when using reference sequence data, impacting genomic studies in cattle and motivating construction of a new optical map resource--BtOM1.0--to guide comparisons and improvements to the current sequence builds. Accordingly, our comprehensive comparisons of BtOM1.0 against the UMD3.1 and Btau4.6 sequence builds tabulate large-to-immediate scale discordances requiring mediation. The optical map, BtOM1.0, spanning the B. taurus genome (Hereford breed, L1 Dominette 01449) was assembled from an optical map dataset consisting of 2,973,315 (439 X; raw dataset size before assembly) single molecule optical maps (Rmaps; 1 Rmap = 1 restriction mapped DNA molecule) generated by the Optical Mapping System. The BamHI map spans 2,575.30 Mb and comprises 78 optical contigs assembled by a combination of iterative (using the reference sequence: UMD3.1) and de novo assembly techniques. BtOM1.0 is a high-resolution physical map featuring an average restriction fragment size of 8.91 Kb. Comparisons of BtOM1.0 vs. UMD3.1, or Btau4.6, revealed that Btau4.6 presented far more discordances (7,463) vs. UMD3.1 (4,754). Overall, we found that Btau4.6 presented almost double the number of discordances than UMD3.1 across most of the 6 categories of sequence vs. map discrepancies, which are: COMPLEX (misassembly), DELs (extraneous sequences), INSs (missing sequences), ITs (Inverted/Translocated sequences), ECs (extra restriction cuts) and MCs (missing restriction cuts). 
Alignments of UMD3.1 and Btau4.6 to BtOM1.0 reveal discordances commensurate with previous reports, and affirm the NCBI's current designation of UMD3.1 sequence assembly as the "reference assembly" and the Btau4.6 as the "alternate assembly." The cattle genome optical map, BtOM1.0, when used as a comprehensive and largely independent guide, will greatly assist improvements to existing sequence builds, and later serve as an accurate physical scaffold for studies concerning the comparative genomics of cattle breeds.
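Comparing a sequence build with an optical map relies on an in silico restriction map: the ordered fragment lengths that the enzyme would produce from the assembled sequence. The sketch below computes such a map for BamHI, whose palindromic recognition site GGATCC is cut after the first G. The function name and toy sequence are hypothetical, the molecule is treated as linear, and this is an illustration of the general idea rather than the pipeline used in the study.

```python
BAMHI_SITE = "GGATCC"
BAMHI_CUT_OFFSET = 1  # BamHI cuts G^GATCC, one base into the site


def fragment_lengths(sequence, site=BAMHI_SITE, cut_offset=BAMHI_CUT_OFFSET):
    """Return ordered fragment lengths from digesting a linear sequence."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment boundaries are the two molecule ends plus every cut position.
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence with two BamHI sites; real maps involve fragments of several kb.
toy = "AAAA" + "GGATCC" + "TTTTT" + "GGATCC" + "CC"
lengths = fragment_lengths(toy)
print(lengths, sum(lengths) / len(lengths))
```

An extra or missing value in such a list relative to the observed map would correspond to the extra-cut and missing-cut categories named in the abstract.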
Comparative genomics of 9 novel Paenibacillus larvae bacteriophages

PubMed Central

Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.

2016-01-01

American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559
In situ amplified electrochemical aptasensing for sensitive detection of adenosine triphosphate by coupling target-induced hybridization chain reaction with the assembly of silver nanotags.

PubMed

Zhou, Qian; Lin, Youxiu; Lin, Yuping; Wei, Qiaohua; Chen, Guonan; Tang, Dianping

2016-01-01

Biomolecular immobilization and construction of the sensing platform are usually crucial for the successful development of a high-efficiency detection system. Herein we report on a novel and label-free signal-amplified aptasensing strategy for sensitive electrochemical detection of small molecules (adenosine triphosphate, ATP, used in this case) by coupling target-induced hybridization chain reaction (HCR) with the assembly of electroactive silver nanotags. The system mainly consisted of two alternating hairpin probes, a partial-pairing trigger-aptamer duplex DNA and a capture probe immobilized on the electrode. Upon target ATP introduction, the analyte attacked the aptamer and released the trigger DNA, which was captured by capture DNA immobilized on the electrode to form a new partial-pairing double-stranded DNA. Thereafter, the exposed domain of the trigger DNA could be utilized as the initiator strand to open the hairpin probes in sequence, and propagated a chain reaction of hybridization events between two alternating hairpins to form a long nicked double-helix. The electrochemical signal derived from the assembled silver nanotags on the nicked double-helix. Under optimal conditions, the electrochemical aptasensor could exhibit a high sensitivity and a low detection limit, and allowed the detection of ATP at a concentration as low as 0.03 pM. Our design showed a high selectivity for target ATP against its analogs because of the high-specificity ATP-aptamer reaction, and it is applicable for monitoring ATP in spiked serum samples. Importantly, the distinct advantages of the developed aptasensor give it great potential for the development of simple and robust sensing strategies for the detection of other small molecules by controlling the aptamer sequence. Copyright © 2015 Elsevier B.V. All rights reserved.
Spatially-Interactive Biomolecular Networks Organized by Nucleic Acid Nanostructures

PubMed Central

Fu, Jinglin; Liu, Minghui; Liu, Yan; Yan, Hao

2013-01-01

Conspectus Living systems have evolved a variety of nanostructures to control the molecular interactions that mediate many functions including the recognition of targets by receptors, the binding of enzymes to substrates, and the regulation of enzymatic activity. Mimicking these structures outside of the cell requires methods that offer nanoscale control over the organization of individual network components. Advances in DNA nanotechnology have enabled the design and fabrication of sophisticated one-, two- and three-dimensional (1D, 2D and 3D) nanostructures that utilize spontaneous and sequence specific DNA hybridization. Compared to other self-assembling biopolymers, DNA nanostructures offer predictable and programmable interactions, and surface features to which other nanoparticles and bio-molecules can be precisely positioned. The ability to control the spatial arrangement of the components while constructing highly-organized networks will lead to various applications of these systems. For example, DNA nanoarrays with surface displays of molecular probes can sense noncovalent hybridization interactions with DNA, RNA, and proteins and covalent chemical reactions. DNA nanostructures can also align external molecules into well-defined arrays, which may improve the resolution of many structural determination methods, such as X-ray diffraction, cryo-EM, NMR, and super-resolution fluorescence. Moreover, by constraining target entities to specific conformations, self-assembled DNA nanostructures can serve as molecular rulers to evaluate conformation-dependent activities. This Account describes the most recent advances in the DNA nanostructure directed assembly of biomolecular networks and explores the possibility of applying this technology to other fields of study. Recently, several reports have demonstrated the DNA nanostructure directed assembly of spatially-interactive biomolecular networks. 
For example, researchers have constructed synthetic multi-enzyme cascades by organizing the position of the components using DNA nanoscaffolds in vitro, or by utilizing RNA matrices in vivo. These structures display enhanced efficiency compared to the corresponding unstructured enzyme mixtures. Such systems are designed to mimic cellular function, where substrate diffusion between enzymes is facilitated and reactions are catalyzed with high efficiency and specificity. In addition, researchers have assembled multiple chromophores into arrays using a DNA nanoscaffold that optimizes the relative distance between the dyes and their spatial organization. The resulting artificial light harvesting system exhibits efficient cascading energy transfers. Finally, DNA nanostructures have been used as assembly templates to construct nanodevices that execute rationally-designed behaviors, including cargo loading, transportation and route control. PMID:22642503
Transcriptional dynamics of the developing sweet cherry (Prunus avium L.) fruit: sequencing, annotation and expression profiling of exocarp-associated genes

PubMed Central

Alkio, Merianne; Jonas, Uwe; Declercq, Myriam; Van Nocker, Steven; Knoche, Moritz

2014-01-01

The exocarp, or skin, of fleshy fruit is a specialized tissue that protects the fruit, attracts seed dispersing fruit eaters, and has large economical relevance for fruit quality. Development of the exocarp involves regulated activities of many genes. This research analyzed global gene expression in the exocarp of developing sweet cherry (Prunus avium L., ‘Regina’), a fruit crop species with little public genomic resources. A catalog of transcript models (contigs) representing expressed genes was constructed from de novo assembled short complementary DNA (cDNA) sequences generated from developing fruit between flowering and maturity at 14 time points. Expression levels in each sample were estimated for 34 695 contigs from numbers of reads mapping to each contig. Contigs were annotated functionally based on BLAST, gene ontology and InterProScan analyses. Coregulated genes were detected using partitional clustering of expression patterns. The results are discussed with emphasis on genes putatively involved in cuticle deposition, cell wall metabolism and sugar transport. The high temporal resolution of the expression patterns presented here reveals finely tuned developmental specialization of individual members of gene families. Moreover, the de novo assembled sweet cherry fruit transcriptome with 7760 full-length protein coding sequences and over 20 000 other, annotated cDNA sequences together with their developmental expression patterns is expected to accelerate molecular research on this important tree fruit crop. PMID:26504533
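The abstract states that expression levels were estimated from the numbers of reads mapping to each contig but does not name a normalization. As one common possibility, and purely as an assumption, the sketch below converts raw read counts to transcripts per million (TPM), which corrects for contig length and for sequencing depth. The contig names, counts, and lengths are hypothetical.

```python
def tpm(read_counts, contig_lengths):
    """Convert per-contig read counts to transcripts per million (TPM)."""
    # Reads per base removes the bias toward long contigs.
    rates = {name: read_counts[name] / contig_lengths[name] for name in read_counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads mapped to any contig")
    # Scaling to one million makes samples of different depth comparable.
    return {name: rate / total * 1_000_000 for name, rate in rates.items()}


# Hypothetical sample: equal read counts on contigs of different length.
counts = {"contig_a": 100, "contig_b": 100}
lengths = {"contig_a": 1000, "contig_b": 2000}
print(tpm(counts, lengths))
```

With equal counts, the shorter contig receives twice the TPM of the longer one, which is the length correction at work.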
Hierarchical assembly of viral nanotemplates with encoded microparticles via nucleic acid hybridization.

PubMed

Tan, Wui Siew; Lewis, Christina L; Horelik, Nicholas E; Pregibon, Daniel C; Doyle, Patrick S; Yi, Hyunmin

2008-11-04

We demonstrate hierarchical assembly of tobacco mosaic virus (TMV)-based nanotemplates with hydrogel-based encoded microparticles via nucleic acid hybridization. TMV nanotemplates possess a highly defined structure and a genetically engineered high density thiol functionality. The encoded microparticles are produced in a high throughput microfluidic device via stop-flow lithography (SFL) and consist of spatially discrete regions containing encoded identity information, an internal control, and capture DNAs. For the hybridization-based assembly, partially disassembled TMVs were programmed with linker DNAs that contain sequences complementary to both the virus 5' end and a selected capture DNA. Fluorescence microscopy, atomic force microscopy (AFM), and confocal microscopy results clearly indicate facile assembly of TMV nanotemplates onto microparticles with high spatial and sequence selectivity. We anticipate that our hybridization-based assembly strategy could be employed to create multifunctional viral-synthetic hybrid materials in a rapid and high-throughput manner. Additionally, we believe that these viral-synthetic hybrid microparticles may find broad applications in high capacity, multiplexed target sensing.
Rhipicephalus (Boophilus) microplus strain Deutsch, 5 BAC clone sequencing, including two encoding Cytochrome P450s and one encoding CzEst9 carboxylesterase

USDA-ARS's Scientific Manuscript database

The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. BAC clones give insight into the genome struct...
Draft Genome Sequence of Exiguobacterium sp. Strain BMC-KP, an Environmental Isolate from Bryn Mawr, Pennsylvania.

PubMed

Hyson, Peter; Shapiro, Joshua A; Wien, Michelle W

2015-10-08

Exiguobacterium sp. strain BMC-KP was isolated as part of a student environmental sampling project at Bryn Mawr College, PA. Sequencing of bacterial DNA assembled a 3.32-Mb draft genome. Analysis suggests the presence of genes for tolerance to cold and toxic metals, broad carbohydrate metabolism, and genes derived from phage. Copyright © 2015 Hyson et al.
Target-Catalyzed DNA Four-Way Junctions for CRET Imaging of MicroRNA, Concatenated Logic Operations, and Self-Assembly of DNA Nanohydrogels for Targeted Drug Delivery.

PubMed

Bi, Sai; Xiu, Bao; Ye, Jiayan; Dong, Ying

2015-10-21

Here we report a target-catalyzed DNA four-way junction (DNA-4WJ) on the basis of toehold-mediated DNA strand displacement reaction (TM-SDR), which is readily applied in enzyme-free amplified chemiluminescence resonance energy transfer (CRET) imaging of microRNA. In this system, the introduction of target microRNA-let-7a (miR-let-7a) activates a cascade of assembly steps with four DNA hairpins, followed by a disassembly step in which the target microRNA is displaced and released from DNA-4WJ to catalyze the self-assembly of additional branched junctions. As a result, G-quadruplex subunit sequences and fluorophore fluorescein amidite (FAM) are encoded in DNA-4WJ in close proximity, stimulating a CRET process in the presence of hemin/K(+) to form horseradish peroxidase (HRP)-mimicking DNAzyme that catalyzes the generation of luminol/H2O2 chemiluminescence (CL), which further transfers to FAM. The background signal is easily reduced using magnetic graphene oxide (MGO) to remove unreacted species through magnetic separation, which greatly improves the detection sensitivity and achieves a detection limit as low as 6.9 fM microRNA-let-7a (miR-let-7a). In addition, four-input concatenated logic circuits with an automatic reset function have been successfully constructed relying on the architecture of the proposed DNA-4WJ. More importantly, DNA nanohydrogels are self-assembled using DNA-4WJs as building units after centrifugation, which are driven by liquid crystallization and dense packaging of building units. Moreover, the DNA nanohydrogels are readily functionalized by incorporation of aptamers, bioimaging agents, and drug loading sites, and thus serve as efficient nanocarriers for targeted drug delivery and cancer therapy with high loading capacity and excellent biocompatibility.
Rapid Optimization of Engineered Metabolic Pathways with Serine Integrase Recombinational Assembly (SIRA).

PubMed

Merrick, C A; Wardrope, C; Paget, J E; Colloms, S D; Rosser, S J

2016-01-01

Metabolic pathway engineering in microbial hosts for heterologous biosynthesis of commodity compounds and fine chemicals offers a cheaper, greener, and more reliable method of production than does chemical synthesis. However, engineering metabolic pathways within a microbe is a complicated process: levels of gene expression, protein stability, enzyme activity, and metabolic flux must be balanced for high productivity without compromising host cell viability. A major rate-limiting step in engineering microbes for optimum biosynthesis of a target compound is DNA assembly, as current methods can be cumbersome and costly. Serine integrase recombinational assembly (SIRA) is a rapid DNA assembly method that utilizes serine integrases, and is particularly applicable to rapid optimization of engineered metabolic pathways. Using six pairs of orthogonal attP and attB sites with different central dinucleotide sequences that follow SIRA design principles, we have demonstrated that ΦC31 integrase can be used to (1) insert a single piece of DNA into a substrate plasmid; (2) assemble three, four, and five DNA parts encoding the enzymes for functional metabolic pathways in a one-pot reaction; (3) generate combinatorial libraries of metabolic pathway constructs with varied ribosome binding site strengths or gene orders in a one-pot reaction; and (4) replace and add DNA parts within a construct through targeted postassembly modification. We explain the mechanism of SIRA and the principles behind designing a SIRA reaction. We also provide protocols for making SIRA reaction components and practical methods for applying SIRA to rapid optimization of metabolic pathways. © 2016 Elsevier Inc. All rights reserved.
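The combinatorial libraries mentioned in point (3) of the abstract above can be enumerated in silico before any reaction is set up. The following is only a minimal sketch of that bookkeeping: the gene names, the three ribosome binding site (RBS) strength labels, and both function names are hypothetical, and nothing here models the attP/attB site design that SIRA itself relies on.

```python
from itertools import permutations, product

# Hypothetical parts list; the abstract does not name specific genes or RBSs.
genes = ["geneA", "geneB", "geneC"]
rbs_strengths = ["weak", "medium", "strong"]


def gene_order_library(parts):
    """Every possible ordering of the parts within a construct."""
    return [list(order) for order in permutations(parts)]


def rbs_library(parts, strengths):
    """Every assignment of one RBS strength to each part, order kept fixed."""
    return [
        list(zip(combo, parts))
        for combo in product(strengths, repeat=len(parts))
    ]


orders = gene_order_library(genes)
rbs_variants = rbs_library(genes, rbs_strengths)
print(len(orders), "gene orders;", len(rbs_variants), "RBS variants")
```

Listing the variants this way makes the library size explicit (n! orderings, s^n RBS assignments for n parts and s strengths), which helps when deciding how many colonies to screen.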
Improvement of the Threespine Stickleback Genome Using a Hi-C-Based Proximity-Guided Assembly.

PubMed

Peichel, Catherine L; Sullivan, Shawn T; Liachko, Ivan; White, Michael A

2017-09-01

Scaffolding genomes into complete chromosome assemblies remains challenging even with the rapidly increasing sequence coverage generated by current next-generation sequence technologies. Even with scaffolding information, many genome assemblies remain incomplete. The genome of the threespine stickleback (Gasterosteus aculeatus), a fish model system in evolutionary genetics and genomics, is not completely assembled despite scaffolding with high-density linkage maps. Here, we first test the ability of a Hi-C based proximity-guided assembly (PGA) to perform a de novo genome assembly from relatively short contigs. Using Hi-C based PGA, we generated complete chromosome assemblies from a distribution of short contigs (20-100 kb). We found that 96.40% of contigs were correctly assigned to linkage groups (LGs), with ordering nearly identical to the previous genome assembly. Using available bacterial artificial chromosome (BAC) end sequences, we provide evidence that some of the few discrepancies between the Hi-C assembly and the existing assembly are due to structural variation between the populations used for the 2 assemblies or errors in the existing assembly. This Hi-C assembly also allowed us to improve the existing assembly, assigning over 60% (13.35 Mb) of the previously unassigned (~21.7 Mb) contigs to LGs. Together, our results highlight the potential of the Hi-C based PGA method to be used in combination with short read data to perform relatively inexpensive de novo genome assemblies. This approach will be particularly useful in organisms in which it is difficult to perform linkage mapping or to obtain high molecular weight DNA required for other scaffolding methods. © The American Genetic Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A duplex DNA-gold nanoparticle probe composed as a colorimetric biosensor for sequence-specific DNA-binding proteins.

PubMed

Ahn, Junho; Choi, Yeonweon; Lee, Ae-Ree; Lee, Joon-Hwa; Jung, Jong Hwa

2016-03-21

Using duplex DNA-AuNP aggregates, a sequence-specific DNA-binding protein, SQUAMOSA Promoter-binding-Like protein 12 (SPL-12), was directly determined by SPL-12-duplex DNA interaction-based colorimetric actions of DNA-Au assemblies. In order to prepare duplex DNA-Au aggregates, thiol-modified DNA 1 and DNA 2 were attached onto the surface of AuNPs, respectively, by the salt-aging method and then the DNA-attached AuNPs were mixed. Duplex-DNA-Au aggregates having the average size of 160 nm diameter and the maximum absorption at 529 nm were able to recognize SPL-12 and reached the equivalent state by the addition of ∼30 equivalents of SPL-12 accompanying a color change from red to blue with a red shift of the maximum absorption at 570 nm. As a result, the aggregation size grew to about 247 nm. Also, at higher temperatures of the mixture of duplex-DNA-Au aggregate solution and SPL-12, the equivalent state was reached rapidly. On the contrary, in the control experiment using Bovine Serum Albumin (BSA), no absorption band shift of duplex-DNA-Au aggregates was observed.
Hybridization chain reaction-based instantaneous derivatization technology for chemiluminescence detection of specific DNA sequences.

PubMed

Wang, Xin; Lau, Choiwan; Kai, Masaaki; Lu, Jianzhong

2013-05-07

We propose here a new amplifying strategy that uses hybridization chain reaction (HCR) to detect specific sequences of DNA, where stable DNA monomers assemble on the magnetic beads only upon exposure to a target DNA. Briefly, in the HCR process, two complementary stable species of hairpins coexist in solution until the introduction of initiator reporter strands triggers a cascade of hybridization events that yield nicked double helices analogous to alternating copolymers. Moreover, a "sandwich-type" detection strategy is employed in our design. Magnetic beads, which are functionalized with capture DNA, are reacted with the target, and sandwiched with the above nicked double helices. Then, chemiluminescence (CL) detection proceeds via an instantaneous derivatization reaction between a specific CL reagent, 3,4,5-trimethoxylphenylglyoxal (TMPG), and the guanine nucleotides within the target DNA, reporter strands and DNA monomers for the generation of light. Our results clearly show that the amplification detection of specific sequences of DNA achieves a better performance (e.g. wide linear response range, low detection limit, and high specificity) as compared to the traditional sandwich type (capture/target/reporter) assays. Upon modification, the approach presented could be extended to detect other types of targets. We believe that this simple technique is promising for improving medical diagnosis and treatment.
The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module

PubMed Central

Yim, Aldrin Kay-Yuen; Yu, Allen Chi-Shing; Li, Jing-Woei; Wong, Ada In-Chun; Loo, Jacky F. C.; Chan, King Ming; Kong, S. K.; Yip, Kevin Y.; Chan, Ting-Fung

2014-01-01

The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 is <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archive. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well-optimized for most sequencing platforms – short synthesized DNA fragments without homopolymer. Here, we suggested improvements on error handling methodology that could enable the integration of DNA-based computational process, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a picture of size 438 bytes was encoded to DNA with low-density parity-check error-correction code. We salvaged a significant portion of sequencing reads with mutations generated during DNA synthesis and sequencing and successfully reconstructed the entire picture. A modular-based programming framework – DNAcodec with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology. PMID:25414846
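The "without homopolymer" constraint mentioned in the abstract above can be illustrated with a rotating base-3 code, in which each ternary digit selects one of the three nucleotides that differ from the previous one. The sketch below is an assumption-laden illustration only: it uses a fixed six trits per byte for brevity, it includes no low-density parity-check layer, and it is not the DNAcodec framework; the function names and the seed base are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six trits hold any byte


def bytes_to_trits(data):
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def encode(data, seed="A"):
    """Map trits to bases so that no base ever repeats its predecessor."""
    prev, out = seed, []
    for trit in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]  # three allowed bases
        prev = choices[trit]
        out.append(prev)
    return "".join(out)


def decode(dna, seed="A"):
    """Invert encode(); assumes an error-free, homopolymer-free input."""
    prev, trits = seed, []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    data = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        data.append(value)
    return bytes(data)


message = b"DNA"
strand = encode(message)
print(strand, decode(strand) == message)
```

Because decoding here has no redundancy, a single substitution or indel corrupts the message, which is the gap an error-correcting layer such as the one described in the abstract is meant to close.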
Gap Resolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labutti, Kurt; Foster, Brian; Lapidus, Alla

Gap Resolution is a software package that was developed to improve Newbler genome assemblies by automating the closure of sequence gaps caused by repetitive regions in the DNA. This is done by performing the following steps: 1) Identify and distribute the data for each gap in sub-projects. 2) Assemble the data associated with each sub-project using a secondary assembler, such as Newbler or PGA. 3) Determine if any gaps are closed after reassembly, and either design fakes (consensus of closed gap) for those that closed or lab experiments for those that require additional data. The software requires as input a genome assembly produced by the Newbler assembler provided by Roche and 454 data containing paired-end reads.
De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies

PubMed Central

Karamitros, Timokratis; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

2016-01-01

Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
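The N(G)50 and N(G)75 contiguity statistics cited in the abstract above can be computed directly from a list of contig lengths. The following is a minimal sketch: the function name `n_stat` and the example contig lengths are hypothetical, and the 152,000 bp genome size is only the approximate figure quoted in the abstract. N50 uses the total assembly length as its denominator, whereas NG50 uses the expected genome size.

```python
def n_stat(contig_lengths, fraction=0.5, genome_size=None):
    """Length of the contig at which the running total of contigs, sorted
    longest first, reaches `fraction` of the reference length.

    With genome_size=None the reference is the assembly length (N50, N75);
    with genome_size given it is the expected genome size (NG50, NG75).
    """
    total = genome_size if genome_size is not None else sum(contig_lengths)
    threshold = total * fraction
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return 0  # assembly is too short to reach the threshold (NG case only)


# Hypothetical assemblies of a ~152 kbp genome.
short_read_only = [50_000, 40_000, 30_000, 20_000, 12_000]
hybrid = [100_000, 52_000]

for name, contigs in [("short-read", short_read_only), ("hybrid", hybrid)]:
    print(name,
          "NG50:", n_stat(contigs, 0.5, genome_size=152_000),
          "NG75:", n_stat(contigs, 0.75, genome_size=152_000))
```

Fewer, longer contigs raise both values, which is the sense in which the abstract reports an improvement from hybrid assembly.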
De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

PubMed

Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

2016-01-01

Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
DNA sequence chromatogram browsing using JAVA and CORBA.

PubMed

Parsons, J D; Buehler, E; Hillier, L

1999-03-01

DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.]
Designing oligo libraries taking alternative splicing into account

NASA Astrophysics Data System (ADS)

Shoshan, Avi; Grebinskiy, Vladimir; Magen, Avner; Scolnicov, Ariel; Fink, Eyal; Lehavi, David; Wasserman, Alon

2001-06-01

We have designed sequences for DNA microarrays and oligo libraries, taking alternative splicing into account. Alternative splicing is a common phenomenon, occurring in more than 25% of the human genes. In many cases, different splice variants have different functions, are expressed in different tissues or may indicate different stages of disease. When designing sequences for DNA microarrays or oligo libraries, it is very important to take into account the sequence information of all the mRNA transcripts. Therefore, when a gene has more than one transcript (as a result of alternative splicing, alternative promoter sites or alternative poly-adenylation sites), it is very important to take all of them into account in the design. We have used the LEADS transcriptome prediction system to cluster and assemble the human sequences in GenBank and design optimal oligonucleotides for all the human genes with a known mRNA sequence based on the LEADS predictions.
Shotgun Optical Maps of the Whole Escherichia coli O157:H7 Genome

PubMed Central

Lim, Alex; Dimalanta, Eileen T.; Potamousis, Konstantinos D.; Yen, Galex; Apodoca, Jennifer; Tao, Chunhong; Lin, Jieyi; Qi, Rong; Skiadas, John; Ramanathan, Arvind; Perna, Nicole T.; Plunkett, Guy; Burland, Valerie; Mau, Bob; Hackett, Jeremiah; Blattner, Frederick R.; Anantharaman, Thomas S.; Mishra, Bhubaneswar; Schwartz, David C.

2001-01-01

We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies. PMID:11544203
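Comparing an optical map with sequence data, as the abstract above describes, requires an in silico restriction map of the sequence: the ordered fragment sizes that the enzyme would produce. The sketch below shows one way to derive such a map, assuming the standard recognition sites G^CTAGC for NheI and C^TCGAG for XhoI (both palindromic, so one strand suffices). The function name and the toy sequence are hypothetical, and the authors' actual alignment software is not reproduced here.

```python
# Recognition site and cut position (offset from site start) per enzyme.
ENZYMES = {
    "NheI": ("GCTAGC", 1),  # G^CTAGC
    "XhoI": ("CTCGAG", 1),  # C^TCGAG
}


def digest_fragment_sizes(sequence, enzyme, circular=False):
    """Ordered fragment sizes from a complete digest of `sequence`."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    if not cuts:
        return [len(seq)]  # uncut molecule
    if circular:
        # n cuts in a circle give n fragments; the last wraps past the origin.
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        sizes.append(len(seq) - cuts[-1] + cuts[0])
        return sizes
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


toy = "AAAA" + "GCTAGC" + "TTTTT" + "GCTAGC" + "CC"  # 23 bp, two NheI sites
print(digest_fragment_sizes(toy, "NheI"))                 # linear contig
print(digest_fragment_sizes(toy, "NheI", circular=True))  # closed chromosome
```

For a contig, the linear mode applies and the two end fragments are only partial; for a finished bacterial chromosome, the circular mode gives the complete ordered map to compare against the optical map.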

Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals.

PubMed

Taylor, Jeremy F; Whitacre, Lynsey K; Hoff, Jesse L; Tizioto, Polyana C; Kim, JaeWoo; Decker, Jared E; Schnabel, Robert D

2016-08-17

Decreasing sequencing costs and development of new protocols for characterizing global methylation, gene expression patterns and regulatory regions have stimulated the generation of large livestock datasets. Here, we discuss experiences in the analysis of whole-genome and transcriptome sequence data. We analyzed whole-genome sequence (WGS) data from 132 individuals from five canid species (Canis familiaris, C. latrans, C. dingo, C. aureus and C. lupus) and 61 breeds, three bison (Bison bison), 64 water buffalo (Bubalus bubalis) and 297 bovines from 17 breeds. By individual, data vary in extent of reference genome depth of coverage from 4.9X to 64.0X. We have also analyzed RNA-seq data for 580 samples representing 159 Bos taurus and Rattus norvegicus animals and 98 tissues. By aligning reads to a reference assembly and calling variants, we assessed effects of average depth of coverage on the actual coverage and on the number of called variants. We examined the identity of unmapped reads by assembling them and querying produced contigs against the non-redundant nucleic acids database. By imputing high-density single nucleotide polymorphism data on 4010 US registered Angus animals to WGS using Run4 of the 1000 Bull Genomes Project and assessing the accuracy of imputation, we identified misassembled reference sequence regions. We estimate that a 24X depth of coverage is required to achieve 99.5 % coverage of the reference assembly and identify 95 % of the variants within an individual's genome. Genomes sequenced to low average coverage (e.g., <10X) may fail to cover 10 % of the reference genome and identify <75 % of variants. About 10 % of genomic DNA or transcriptome sequence reads fail to align to the reference assembly. These reads include loci missing from the reference assembly and misassembled genes and interesting symbionts, commensal and pathogenic organisms. 
Assembly errors and a lack of annotation of functional elements significantly limit the utility of the current draft livestock reference assemblies. The Functional Annotation of Animal Genomes initiative seeks to annotate functional elements, while a 70X Pac-Bio assembly for cow is underway and may result in a significantly improved reference assembly.
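The abstract above distinguishes average depth of coverage (for example 24X) from the fraction of the reference that is actually covered (for example 99.5 %). The following minimal sketch shows how the two quantities are computed; the function names and the toy numbers are hypothetical, and in practice the per-base depths would come from an alignment tool rather than a hand-written list.

```python
def mean_depth(num_reads, read_length, genome_size):
    """Average depth of coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def breadth_of_coverage(per_base_depth, min_depth=1):
    """Fraction of reference positions covered by at least min_depth reads."""
    covered = sum(1 for depth in per_base_depth if depth >= min_depth)
    return covered / len(per_base_depth)


# Toy numbers: 240 reads of 100 bp over a 1,000 bp reference.
print(mean_depth(240, 100, 1_000))

# Toy per-base depths over a 10 bp reference; three positions are uncovered.
depths = [0, 3, 5, 0, 2, 1, 0, 4, 6, 2]
print(breadth_of_coverage(depths))
print(breadth_of_coverage(depths, min_depth=3))
```

Two samples with the same mean depth can differ in breadth when reads pile up unevenly, which is why the abstract reports both figures.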
De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq

PubMed Central

2010-01-01

Background De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
Using herbarium-derived DNAs to assemble a large-scale DNA barcode library for the vascular plants of Canada.

PubMed

Kuzmina, Maria L; Braukmann, Thomas W A; Fazekas, Aron J; Graham, Sean W; Dewaard, Stephanie L; Rodrigues, Anuar; Bennett, Bruce A; Dickinson, Timothy A; Saarela, Jeffery M; Catling, Paul M; Newmaster, Steven G; Percy, Diana M; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R; Zakharov, Evgeny V; Hebert, Paul D N

2017-12-01

Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa.
Using herbarium-derived DNAs to assemble a large-scale DNA barcode library for the vascular plants of Canada

PubMed Central

Kuzmina, Maria L.; Braukmann, Thomas W. A.; Fazekas, Aron J.; Graham, Sean W.; Dewaard, Stephanie L.; Rodrigues, Anuar; Bennett, Bruce A.; Dickinson, Timothy A.; Saarela, Jeffery M.; Catling, Paul M.; Newmaster, Steven G.; Percy, Diana M.; Fenneman, Erin; Lauron-Moreau, Aurélien; Ford, Bruce; Gillespie, Lynn; Subramanyam, Ragupathy; Whitton, Jeannette; Jennings, Linda; Metsger, Deborah; Warne, Connor P.; Brown, Allison; Sears, Elizabeth; Dewaard, Jeremy R.; Zakharov, Evgeny V.; Hebert, Paul D. N.

2017-01-01

Premise of the study: Constructing complete, accurate plant DNA barcode reference libraries can be logistically challenging for large-scale floras. Here we demonstrate the promise and challenges of using herbarium collections for building a DNA barcode reference library for the vascular plant flora of Canada. Methods: Our study examined 20,816 specimens representing 5076 of 5190 vascular plant species in Canada (98%). For 98% of the specimens, at least one of the DNA barcode regions was recovered from the plastid loci rbcL and matK and from the nuclear ITS2 region. We used beta regression to quantify the effects of age, type of preservation, and taxonomic affiliation (family) on DNA sequence recovery. Results: Specimen age and method of preservation had significant effects on sequence recovery for all markers, but influenced some families more (e.g., Boraginaceae) than others (e.g., Asteraceae). Discussion: Our DNA barcode library represents an unparalleled resource for metagenomic and ecological genetic research working on temperate and arctic biomes. An observed decline in sequence recovery with specimen age may be associated with poor primer matches, intragenomic variation (for ITS2), or inhibitory secondary compounds in some taxa. PMID:29299394
High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

PubMed

Wang, Wenqin; Messing, Joachim

2011-01-01

Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-fold coverage of chloroplast from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA

PubMed Central

Wang, Wenqin; Messing, Joachim

2011-01-01

Background: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae is a collection of such aquatic species that, because of photosynthesis, are among the fastest growing plants on earth. Methods: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor, and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. More than 1,000-fold coverage of the chloroplast genome from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution and abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Notably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804
De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes.

PubMed

Ashrafi, Hamid; Hill, Theresa; Stoffel, Kevin; Kozik, Alexander; Yao, Jiqiang; Chin-Wo, Sebastian Reyes; Van Deynze, Allen

2012-10-30

Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80-120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins. Before a pepper genome sequence was available, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of the Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project.
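As a rough illustration of the kind of SSR search described in this abstract, the sketch below finds perfect microsatellites with a regular expression. It does not call MISA, and the function name `find_ssrs`, the motif lengths and the minimum repeat counts are assumptions chosen for illustration, not the settings used in the study.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymer runs such as "AA" repeated many times.
            if len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    print(find_ssrs("TTT" + "AG" * 7 + "CCC"))
```

A long dinucleotide run can also satisfy the tetranucleotide pattern (for example AGAG repeated four times), so a fuller implementation would merge overlapping hits before designing primers.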
Genotype Specification Language.

PubMed

Wilson, Erin H; Sagawa, Shiori; Weis, James W; Schubert, Max G; Bissell, Michael; Hawthorne, Brian; Reeves, Christopher D; Dean, Jed; Platt, Darren

2016-06-17

We describe here the Genotype Specification Language (GSL), a language that facilitates the rapid design of large and complex DNA constructs used to engineer genomes. The GSL compiler implements a high-level language based on traditional genetic notation, as well as a set of low-level DNA manipulation primitives. The language allows facile incorporation of parts from a library of cloned DNA constructs and from the "natural" library of parts in fully sequenced and annotated genomes. GSL was designed to engage genetic engineers in their native language while providing a framework for higher level abstract tooling. To this end we define four language levels, Level 0 (literal DNA sequence) through Level 3, with increasing abstraction of part selection and construction paths. GSL targets an intermediate language based on DNA slices that translates efficiently into a wide range of final output formats, such as FASTA and GenBank, and includes formats that specify instructions and materials such as oligonucleotide primers to allow the physical construction of the GSL designs by individual strain engineers or an automated DNA assembly core facility.
Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing.

PubMed

Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C

2018-06-01

High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may affect the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%), and there was no positional bias in mutation across the plasmid sequence. There was no discernable difference between the mutation frequencies of coding and non-coding DNA. The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.
Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop.

PubMed

Hatakeyama, Masaomi; Aluri, Sirisha; Balachadran, Mathi Thumilan; Sivarajan, Sajeevan Radha; Patrignani, Andrea; Grüter, Simon; Poveda, Lucy; Shimizu-Inatsugi, Rie; Baeten, John; Francoijs, Kees-Jan; Nataraja, Karaba N; Reddy, Yellodu A Nanja; Phadnis, Shamprasad; Ravikumar, Ramapura L; Schlapbach, Ralph; Sreeman, Sheshshayee M; Shimizu, Kentaro K

2017-09-05

Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate changes. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered as an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which hampers genome assembly compared with a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
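The N50 statistic quoted in this abstract can be computed from a list of scaffold lengths as sketched below. This is a minimal illustration of the standard definition (the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size); the function name `n50` and the toy lengths are assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First scaffold at which the cumulative length covers half the assembly.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20]  # hypothetical scaffold lengths
    print(n50(toy_lengths))
```

Because the statistic depends only on the length distribution, it says nothing about correctness, which is why the abstract also reports completeness against universal single-copy orthologs.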
Three-Dimensional Structures Self-Assembled from DNA Bricks

PubMed Central

Ke, Yonggang; Ong, Luvena L.; Shih, William M.; Yin, Peng

2013-01-01

We describe a simple and robust method to construct complex three-dimensional (3D) structures using short synthetic DNA strands that we call “DNA bricks”. In one-step annealing reactions, bricks with hundreds of distinct sequences self-assemble into prescribed 3D shapes. Each 32-nucleotide brick is a modular component; it binds to four local neighbors and can be removed or added independently. Each 8-base-pair interaction between bricks defines a voxel with dimensions 2.5 nanometers by 2.5 nanometers by 2.7 nanometers, and a master brick collection defines a “molecular canvas” with dimensions of 10 by 10 by 10 voxels. By selecting subsets of bricks from this canvas, we constructed a panel of 102 distinct shapes exhibiting sophisticated surface features as well as intricate interior cavities and tunnels. PMID:23197527
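The dimensions given in this abstract imply the physical size of the molecular canvas, and the 32-nucleotide brick length is consistent with four 8-nucleotide binding domains. The sketch below works through both. The names `canvas_size_nm` and `split_domains`, and the reading of a brick as four consecutive 8-nt domains, are assumptions made for illustration rather than the authors' design software.

```python
VOXEL_NM = (2.5, 2.5, 2.7)      # voxel dimensions stated in the abstract
CANVAS_VOXELS = (10, 10, 10)    # canvas size stated in the abstract


def canvas_size_nm(voxels=CANVAS_VOXELS, voxel_nm=VOXEL_NM):
    """Return the canvas edge lengths in nanometers."""
    return tuple(count * size for count, size in zip(voxels, voxel_nm))


def split_domains(brick, domain_len=8):
    """Split a brick sequence into consecutive domains of domain_len bases."""
    if len(brick) % domain_len != 0:
        raise ValueError("brick length must be a multiple of domain_len")
    return [brick[i:i + domain_len] for i in range(0, len(brick), domain_len)]


if __name__ == "__main__":
    print(canvas_size_nm())                 # roughly 25 x 25 x 27 nm
    print(len(split_domains("ACGT" * 8)))   # a 32-nt brick gives 4 domains
```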
Sequence conservation from human to prokaryotes of Surf1, a protein involved in cytochrome c oxidase assembly, deficient in Leigh syndrome.

PubMed

Poyau, A; Buchet, K; Godinot, C

1999-12-03

The human SURF1 gene encoding a protein involved in cytochrome c oxidase (COX) assembly, is mutated in most patients presenting Leigh syndrome associated with COX deficiency. Proteins homologous to the human Surf1 have been identified in nine eukaryotes and six prokaryotes using database alignment tools, structure prediction and/or cDNA sequencing. Their sequence comparison revealed a remarkable Surf1 conservation during evolution and put forward at least four highly conserved domains that should be essential for Surf1 function. In Paracoccus denitrificans, the Surf1 homologue is found in the quinol oxidase operon, suggesting that Surf1 is associated with a primitive quinol oxidase which belongs to the same superfamily as cytochrome oxidase.
Characterisation of the transcriptome of a wild great tit Parus major population by next generation sequencing

PubMed Central

2011-01-01

Background: The recent development of next generation sequencing technologies has made it possible to generate very large amounts of sequence data in species with little or no genome information. Combined with the large phenotypic databases available for wild and non-model species, these data will provide an unprecedented opportunity to "genomicise" ecological model organisms and establish the genetic basis of quantitative traits in natural populations. Results: This paper describes the sequencing, de novo assembly and analysis of the transcriptome of eight tissues of ten wild great tits. Approximately 4.6 million sequences and 1.4 billion bases of DNA were generated and assembled into 95,979 contigs, one third of which aligned with known Taeniopygia guttata (zebra finch) and Gallus gallus (chicken) transcripts. The majority (78%) of the remaining contigs aligned within or very close to regions of the zebra finch genome containing known genes, suggesting that they represented precursor mRNA rather than untranscribed genomic DNA. More than 35,000 single nucleotide polymorphisms and 10,000 microsatellite repeats were identified. Eleven percent of contigs were expressed in every tissue, while twenty one percent of contigs were expressed in only one tissue. The function of those contigs with strong evidence for tissue specific expression and contigs expressed in every tissue was inferred from the gene ontology (GO) terms associated with these contigs; heart and pancreas had the highest number of highly tissue specific GO terms (21.4% and 28.5% respectively). Conclusions: In summary, the transcriptomic data generated in this study will contribute towards efforts to assemble and annotate the great tit genome, as well as providing the markers required to perform gene mapping studies in wild populations. PMID:21635727
Paenibacillus phoenicis sp. nov., isolated from the Phoenix Lander assembly facility and a subsurface molybdenum mine.

PubMed

Benardini, James N; Vaishampayan, Parag A; Schwendner, Petra; Swanner, Elizabeth; Fukui, Youhei; Osman, Sharif; Satomi, Masakata; Venkateswaran, Kasthuri

2011-06-01

A novel Gram-positive, motile, endospore-forming, aerobic bacterium was isolated from the NASA Phoenix Lander assembly clean room that exhibits 100 % 16S rRNA gene sequence similarity to two strains isolated from a deep subsurface environment. All strains are rod-shaped, endospore-forming bacteria, whose endospores are resistant to UV radiation up to 500 J m(-2). A polyphasic taxonomic study including traditional phenotypic tests, fatty acid analysis, 16S rRNA gene sequencing and DNA-DNA hybridization analysis was performed to characterize these novel strains. The 16S rRNA gene sequencing convincingly grouped these novel strains within the genus Paenibacillus as a separate cluster from previously described species. The similarity of 16S rRNA gene sequences among the novel strains was identical but only 98.1 to 98.5 % with their nearest neighbours Paenibacillus barengoltzii ATCC BAA-1209(T) and Paenibacillus timonensis CIP 108005(T). The menaquinone MK-7 was dominant in these novel strains as shown in other species of the genus Paenibacillus. The DNA-DNA hybridization dissociation value was <45 % with the closest related species. The novel strains had DNA G+C contents of 51.9 to 52.8 mol%. Phenotypically, the novel strains can be readily differentiated from closely related species by the absence of urease and gelatinase and the production of acids from a variety of sugars including l-arabinose. The major fatty acid was anteiso-C(15 : 0) as seen in P. barengoltzii and P. timonensis whereas the proportion of C(16 : 0) was significantly different from the closely related species. Based on phylogenetic and phenotypic results, it was concluded that these strains represent a novel species of the genus Paenibacillus, for which the name Paenibacillus phoenicis sp. nov. is proposed. The type strain is 3PO2SA(T) ( = NRRL B-59348(T) = NBRC 106274(T)).
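For readers unfamiliar with the G+C figure reported here, the sketch below shows how a G+C percentage is computed from a sequence. The abstract does not say how the 51.9 to 52.8 mol% values were measured, so this is only an illustration of the quantity; the function name `gc_content` and the choice to ignore ambiguous bases are assumptions.

```python
def gc_content(seq):
    """Return the G+C content of seq as a percentage of unambiguous bases."""
    # Count only A, C, G and T; ambiguity codes such as N are ignored.
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
```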
What Combined Measurements From Structures and Imaging Tell Us About DNA Damage Responses

PubMed Central

Brosey, Chris A.; Ahmed, Zamal; Lees-Miller, Susan P.; Tainer, John A.

2017-01-01

DNA damage outcomes depend upon the efficiency and fidelity of DNA damage responses (DDRs) for different cells and damage. As such, DDRs represent tightly regulated prototypical systems for linking nanoscale biomolecular structure and assembly to the biology of genomic regulation and cell signaling. However, the dynamic and multifunctional nature of DDR assemblies can render elusive the correlation between the structures of DDR factors and specific biological disruptions to the DDR when these structures are altered. In this chapter, we discuss concepts and strategies for combining structural, biophysical, and imaging techniques to investigate DDR recognition and regulation, and thus bridge sequence-level structural biochemistry to quantitative biological outcomes visualized in cells. We focus on representative DDR responses from PARP/PARG/AIF damage signaling in DNA single-strand break repair and nonhomologous end joining complexes in double-strand break repair. Methods with exemplary experimental results are considered with a focus on strategies for probing flexibility, conformational changes, and assembly processes that shape a predictive understanding of DDR mechanisms in a cellular context. Integration of structural and imaging measurements promises to provide foundational knowledge to rationally control and optimize DNA damage outcomes for synthetic lethality and for immune activation with resulting insights for biology and cancer interventions. PMID:28668129
Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

PubMed

Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

2014-01-01

Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.
A TATA binding protein mutant with increased affinity for DNA directs transcription from a reversed TATA sequence in vivo.

PubMed

Spencer, J Vaughn; Arndt, Karen M

2002-12-01

The TATA-binding protein (TBP) nucleates the assembly and determines the position of the preinitiation complex at RNA polymerase II-transcribed genes. We investigated the importance of two conserved residues on the DNA binding surface of Saccharomyces cerevisiae TBP to DNA binding and sequence discrimination. Because they define a significant break in the twofold symmetry of the TBP-TATA interface, Ala100 and Pro191 have been proposed to be key determinants of TBP binding orientation and transcription directionality. In contrast to previous predictions, we found that substitution of an alanine for Pro191 did not allow recognition of a reversed TATA box in vivo; however, the reciprocal change, Ala100 to proline, resulted in efficient utilization of this and other variant TATA sequences. In vitro assays demonstrated that TBP mutants with the A100P and P191A substitutions have increased and decreased affinity for DNA, respectively. The TATA binding defect of TBP with the P191A mutation could be intragenically suppressed by the A100P substitution. Our results suggest that Ala100 and Pro191 are important for DNA binding and sequence recognition by TBP, that the naturally occurring asymmetry of Ala100 and Pro191 is not essential for function, and that a single amino acid change in TBP can lead to elevated DNA binding affinity and recognition of a reversed TATA sequence.
RNA-Seq Analysis of Cocos nucifera: Transcriptome Sequencing and De Novo Assembly for Subsequent Functional Genomics Approaches

PubMed Central

Xia, Wei; Mason, Annaliese S.; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru

2013-01-01

Background: Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. Methodology/Principal Findings: To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Conclusions/Significance: Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species. PMID:23555859
RNA-Seq analysis of Cocos nucifera: transcriptome sequencing and de novo assembly for subsequent functional genomics approaches.

PubMed

Fan, Haikuo; Xiao, Yong; Yang, Yaodong; Xia, Wei; Mason, Annaliese S; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru

2013-01-01

Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species.
Variant ribosomal RNA alleles are conserved and exhibit tissue-specific expression

PubMed Central

Parks, Matthew M.; Kurylo, Chad M.; Dass, Randall A.; Bojmar, Linda; Lyden, David; Vincent, C. Theresa; Blanchard, Scott C.

2018-01-01

The ribosome, the integration point for protein synthesis in the cell, is conventionally considered a homogeneous molecular assembly that only passively contributes to gene expression. Yet, epigenetic features of the ribosomal DNA (rDNA) operon and changes in the ribosome’s molecular composition have been associated with disease phenotypes, suggesting that the ribosome itself may possess inherent regulatory capacity. Analyzing whole-genome sequencing data from the 1000 Genomes Project and the Mouse Genomes Project, we find that rDNA copy number varies widely across individuals, and we identify pervasive intra- and interindividual nucleotide variation in the 5S, 5.8S, 18S, and 28S ribosomal RNA (rRNA) genes of both human and mouse. Conserved rRNA sequence heterogeneities map to functional centers of the assembled ribosome, variant rRNA alleles exhibit tissue-specific expression, and ribosomes bearing variant rRNA alleles are present in the actively translating ribosome pool. These findings provide a critical framework for exploring the possibility that the expression of genomically encoded variant rRNA alleles gives rise to physically and functionally heterogeneous ribosomes that contribute to mammalian physiology and human disease. PMID:29503865

The octamer-binding proteins form multi-protein--DNA complexes with the HSV alpha TIF regulatory protein.

PubMed Central

Kristie, T M; LeBowitz, J H; Sharp, P A

1989-01-01

The herpes simplex virus transactivator, alpha TIF, stimulates transcription of the alpha/immediate early genes via a cis-acting site containing an octamer element and a conserved flanking sequence. The alpha TIF protein, produced in a baculovirus expression system, nucleates the formation of at least two DNA--protein complexes on this regulatory element. Both of these complexes contain the ubiquitous Oct-1 protein, whose POU domain alone is sufficient to allow assembly of the alpha TIF-dependent complexes. A second member of the POU domain family, the lymphoid specific Oct-2 protein, can also be assembled into similar complexes at high concentrations of alpha TIF protein. These complexes contain at least two cellular proteins in addition to Oct-1. One of these proteins is present in both insect and HeLa cells and probably recognizes sequences in the cis element. The second cellular protein, only present in HeLa cells, probably binds by protein-protein interactions. PMID:2556266
The octamer-binding proteins form multi-protein--DNA complexes with the HSV alpha TIF regulatory protein.

PubMed

Kristie, T M; LeBowitz, J H; Sharp, P A

1989-12-20

The herpes simplex virus transactivator, alpha TIF, stimulates transcription of the alpha/immediate early genes via a cis-acting site containing an octamer element and a conserved flanking sequence. The alpha TIF protein, produced in a baculovirus expression system, nucleates the formation of at least two DNA--protein complexes on this regulatory element. Both of these complexes contain the ubiquitous Oct-1 protein, whose POU domain alone is sufficient to allow assembly of the alpha TIF-dependent complexes. A second member of the POU domain family, the lymphoid specific Oct-2 protein, can also be assembled into similar complexes at high concentrations of alpha TIF protein. These complexes contain at least two cellular proteins in addition to Oct-1. One of these proteins is present in both insect and HeLa cells and probably recognizes sequences in the cis element. The second cellular protein, only present in HeLa cells, probably binds by protein-protein interactions.
From famine to feast? Selecting nuclear DNA sequence loci for plant species-level phylogeny reconstruction

PubMed Central

Hughes, Colin E; Eastwood, Ruth J; Donovan Bailey, C

2005-01-01

Phylogenetic analyses of DNA sequences have prompted spectacular progress in assembling the Tree of Life. However, progress in constructing phylogenies among closely related species, at least for plants, has been less encouraging. We show that for plants, the rapid accumulation of DNA characters at higher taxonomic levels has not been matched by conventional sequence loci at the species level, leaving a lack of well-resolved gene trees that is hindering investigations of many fundamental questions in plant evolutionary biology. The most popular approach to address this problem has been to use low-copy nuclear genes as a source of DNA sequence data. However, this has had limited success because levels of variation among nuclear intron sequences across groups of closely related species are extremely variable and generally lower than conventionally used loci, and because no universally useful low-copy nuclear DNA sequence loci have been developed. This suggests that solutions will, for the most part, be lineage-specific, prompting a move away from ‘universal’ gene thinking for species-level phylogenetics. The benefits and limitations of alternative approaches to locate more variable nuclear loci are discussed and the potential of anonymous non-genic nuclear loci is highlighted. Given the virtually unlimited number of loci that can be generated using these new approaches, it is clear that effective screening will be critical for efficient selection of the most informative loci. Strategies for screening are outlined. PMID:16553318
Modular Nuclease-Responsive DNA Three-Way Junction-Based Dynamic Assembly of a DNA Device and Its Sensing Application.

PubMed

Zhu, Jing; Wang, Lei; Xu, Xiaowen; Wei, Haiping; Jiang, Wei

2016-04-05

Here, we explored a modular strategy for rational design of nuclease-responsive three-way junctions (TWJs) and fabricated a dynamic DNA device in a "plug-and-play" fashion. First, inactivated TWJs were designed, which contained three functional domains: the inaccessible toehold and branch migration domains, the specific sites of nucleases, and the auxiliary complementary sequence. The actions of different nucleases on their specific sites in TWJs caused the close proximity of the same toehold and branch migration domains, resulting in the activation of the TWJs and the formation of a universal trigger for the subsequent dynamic assembly. Second, two hairpins (H1 and H2) were introduced, which could coexist in a metastable state, initially to act as the components for the dynamic assembly. Once the trigger initiated the opening of H1 via TWJs-driven strand displacement, the cascade hybridization of hairpins immediately switched on, resulting in the formation of the concatemers of H1/H2 complex appending numerous integrated G-quadruplexes, which were used to obtain label-free signal readout. The inherent modularity of this design allowed us to fabricate a flexible DNA dynamic device and detect multiple nucleases through altering the recognition pattern slightly. Taking uracil-DNA glycosylase and CpG methyltransferase M.SssI as models, we successfully realized the butt joint between the uracil-DNA glycosylase and M.SssI recognition events and the dynamic assembly process. Furthermore, we achieved ultrasensitive assay of nuclease activity and the inhibitor screening. The DNA device proposed here will offer an adaptive and flexible tool for clinical diagnosis and anticancer drug discovery.
Rapid sequencing of the bamboo mitochondrial genome using Illumina technology and parallel episodic evolution of organelle genomes in grasses.

PubMed

Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

2012-01-01

Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the small number of completed genomes, which are essentially all Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from the chloroplast genome, with only minor differences. The inconsistency possibly derived from long branch attraction in the mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in the mt genome before and after the diversification of Poaceae in both synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our results demonstrate that Illumina sequencing is a rapid and efficient approach for obtaining angiosperm mt genome sequences. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects.
Rapid Sequencing of the Bamboo Mitochondrial Genome Using Illumina Technology and Parallel Episodic Evolution of Organelle Genomes in Grasses

PubMed Central

Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

2012-01-01

Background Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the small number of completed genomes, which are essentially all Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. Methodology/Principal Findings We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from the chloroplast genome, with only minor differences. The inconsistency possibly derived from long branch attraction in the mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in the mt genome before and after the diversification of Poaceae in both synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Conclusions/Significance Our results demonstrate that Illumina sequencing is a rapid and efficient approach for obtaining angiosperm mt genome sequences. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects. PMID:22272330
A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety

PubMed Central

Cartwright, Dustin A.; Cestaro, Alessandro; Pruss, Dmitry; Pindo, Massimo; FitzGerald, Lisa M.; Vezzulli, Silvia; Reid, Julia; Malacarne, Giulia; Iliev, Diana; Coppola, Giuseppina; Wardell, Bryan; Micheletti, Diego; Macalma, Teresita; Facci, Marco; Mitchell, Jeff T.; Perazzolli, Michele; Eldredge, Glenn; Gatto, Pamela; Oyzerski, Rozan; Moretto, Marco; Gutin, Natalia; Stefanini, Marco; Chen, Yang; Segala, Cinzia; Davenport, Christine; Demattè, Lorenzo; Mraz, Amy; Battilana, Juri; Stormo, Keith; Costa, Fabrizio; Tao, Quanzhou; Si-Ammour, Azeddine; Harkins, Tim; Lackey, Angie; Perbost, Clotilde; Taillon, Bruce; Stella, Alessandra; Solovyev, Victor; Fawcett, Jeffrey A.; Sterck, Lieven; Vandepoele, Klaas; Grando, Stella M.; Toppo, Stefano; Moser, Claudio; Lanchbury, Jerry; Bogden, Robert; Skolnick, Mark; Sgaramella, Vittorio; Bhatnagar, Satish K.; Fontana, Paolo; Gutin, Alexander; Van de Peer, Yves; Salamini, Francesco; Viola, Roberto

2007-01-01

Background Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape. PMID:18094749
Binding of sulphonated indigo derivatives to RepA-WH1 inhibits DNA-induced protein amyloidogenesis

PubMed Central

Gasset-Rosa, Fátima; Maté, María Jesús; Dávila-Fajardo, Cristina; Bravo, Jerónimo; Giraldo, Rafael

2008-01-01

The quest for inducers and inhibitors of protein amyloidogenesis is of utmost interest, since they are key tools to understand the molecular bases of proteinopathies such as Alzheimer, Parkinson, Huntington and Creutzfeldt–Jakob diseases. It is also expected that such molecules could lead to valid therapeutic agents. In common with the mammalian prion protein (PrP), the N-terminal Winged-Helix (WH1) domain of the pPS10 plasmid replication protein (RepA) assembles in vitro into a variety of amyloid nanostructures upon binding to different specific dsDNA sequences. Here we show that di- (S2) and tetra-sulphonated (S4) derivatives of indigo stain dock at the DNA recognition interface in the RepA-WH1 dimer. They compete binding of RepA to its natural target dsDNA repeats, found at the repA operator and at the origin of replication of the plasmid. Calorimetry points to the existence of a major site, with micromolar affinity, for S4-indigo in RepA-WH1 dimers. As revealed by electron microscopy, in the presence of inducer dsDNA, both S2/S4 stains inhibit the assembly of RepA-WH1 into fibres. These results validate the concept that DNA can promote protein assembly into amyloids and reveal that the binding sites of effector molecules can be targeted to inhibit amyloidogenesis. PMID:18285361
Mi-2/NuRD complex function is required for normal S phase progression and assembly of pericentric heterochromatin.

PubMed

Sims, Jennifer K; Wade, Paul A

2011-09-01

During chromosome duplication, it is essential to replicate not only the DNA sequence, but also the complex nucleoprotein structures of chromatin. Pericentric heterochromatin is critical for silencing repetitive elements and plays an essential structural role during mitosis. However, relatively little is understood about its assembly and maintenance during replication. The Mi2/NuRD chromatin remodeling complex tightly associates with actively replicating pericentric heterochromatin, suggesting a role in its assembly. Here we demonstrate that depletion of the catalytic ATPase subunit CHD4/Mi-2β in cells with a dampened DNA damage response results in a slow-growth phenotype characterized by delayed progression through S phase. Furthermore, we observe defects in pericentric heterochromatin maintenance and assembly. Our data suggest that chromatin assembly defects are sensed by an ATM-dependent intra-S phase chromatin quality checkpoint, resulting in a temporal block to the transition from early to late S phase. These findings implicate Mi-2β in the maintenance of chromatin structure and proper cell cycle progression.
High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events

PubMed Central

Wolfgruber, Thomas K.; Nakashima, Megan M.; Schneider, Kevin L.; Sharma, Anupma; Xie, Zidian; Albert, Patrice S.; Xu, Ronghui; Bilinski, Paul; Dawe, R. Kelly; Ross-Ibarra, Jeffrey; Birchler, James A.; Presting, Gernot G.

2016-01-01

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres. PMID:27047500
DNA Photo Lithography with Cinnamate-based Photo-Bio-Nano-Glue

NASA Astrophysics Data System (ADS)

Feng, Lang; Li, Minfeng; Romulus, Joy; Sha, Ruojie; Royer, John; Wu, Kun-Ta; Xu, Qin; Seeman, Nadrian; Weck, Marcus; Chaikin, Paul

2013-03-01

We present a technique to make patterned functional surfaces, using a cinnamate photo cross-linker and photolithography. We have designed and modified a complementary set of single DNA strands to incorporate a pair of opposing cinnamate molecules. On exposure to 360nm UV, the cinnamate makes a highly specific covalent bond permanently linking only the complementary strands containing the cinnamates. We have studied this specific and efficient crosslinking with cinnamate-containing DNA in solution and on particles. UV addressability allows us to pattern surfaces functionally. The entire surface is coated with a DNA sequence A incorporating cinnamate. DNA strands A'B with one end containing a complementary cinnamated sequence A' attached to another sequence B, are then hybridized to the surface. UV photolithography is used to bind the A'B strand in a specific pattern. The system is heated and the unbound DNA is washed away. The pattern is then observed by thermo-reversibly hybridizing either fluorescently dyed B' strands complementary to B, or colloids coated with B' strands. Our techniques can be used to reversibly and/or permanently bind, via DNA linkers, an assortment of molecules, proteins and nanostructures. Potential applications range from advanced self-assembly, such as templated self-replication schemes recently reported, to designed physical and chemical patterns, to high-resolution multi-functional DNA surfaces for genetic detection or DNA computing.
Automation and integration of multiplexed on-line sample preparation with capillary electrophoresis for DNA sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tan, H.

1999-03-31

The purpose of this research is to develop a multiplexed sample processing system in conjunction with multiplexed capillary electrophoresis for high-throughput DNA sequencing. The concept from DNA template to called bases was first demonstrated with a manually operated single capillary system. Later, an automated microfluidic system with 8 channels based on the same principle was successfully constructed. The instrument automatically processes 8 templates through reaction, purification, denaturation, pre-concentration, injection, separation and detection in a parallel fashion. A multiplexed freeze/thaw switching principle and a distribution network were implemented to manage flow direction and sample transportation. Dye-labeled terminator cycle-sequencing reactions are performed in an 8-capillary array in a hot air thermal cycler. Subsequently, the sequencing ladders are directly loaded into a corresponding size-exclusion chromatographic column operated at approximately 60 °C for purification. On-line denaturation and stacking injection for capillary electrophoresis are simultaneously accomplished at a cross assembly set at approximately 70 °C. Not only the separation capillary array but also the reaction capillary array and purification columns can be regenerated after every run. DNA sequencing data from this system allow base calling up to 460 bases with an accuracy of 98%.
RNA-Seq analysis and transcriptome assembly for blackberry (Rubus sp. Var. Lochness) fruit.

PubMed

Garcia-Seco, Daniel; Zhang, Yang; Gutierrez-Mañero, Francisco J; Martin, Cathie; Ramos-Solano, Beatriz

2015-01-22

There is an increasing interest in berries, especially blackberries, in the diet because of recent reports of their health benefits due to their high content of flavonoids. A broad range of genomic tools are available for other Rosaceae species but these tools are still lacking in the Rubus genus, thus limiting gene discovery and the breeding of improved varieties. De novo RNA-seq of ripe blackberries grown under field conditions was performed using Illumina HiSeq 2000. Almost 9 billion nucleotide bases were sequenced in total. Following assembly, 42,062 consensus sequences were detected. For functional annotation, 33,040 (NR), 32,762 (NT), 21,932 (Swiss-Prot), 20,134 (KEGG), 13,676 (COG), 24,168 (GO) consensus sequences were annotated using different databases; in total 34,552 annotated sequences were identified. For protein prediction analysis, the number of coding DNA sequences (CDS) that mapped to the protein database was 32,540. Non-redundant (NR) annotation showed that 25,418 genes (73.5%) had the highest similarity with Fragaria vesca subspecies vesca. Reanalysis was undertaken by aligning the reads with this reference genome for a deeper analysis of the transcriptome. We demonstrated that de novo assembly using Trinity, followed by annotation with BLAST against different databases, was complementary to alignment to the reference sequence using SOAPaligner/SOAP2. The Fragaria reference genome belongs to a species in the same family as blackberry (Rosaceae) but to a different genus. Since blackberries are tetraploids, the possibility of artefactual gene chimeras resulting from mis-assembly was tested with one of the genes sequenced by RNA-seq, Chalcone Synthase (CHS). cDNAs encoding this protein were cloned and sequenced. Primers designed to the assembled sequences accurately distinguished different contigs, at least for chalcone synthase genes. We prepared and analysed transcriptome data from ripe blackberries, for which prior genomic information was limited. This new sequence information will improve the knowledge of this important and healthy fruit, providing an invaluable new tool for biological research.
Bacillus horneckiae sp. nov., isolated from a spacecraft-assembly clean room.

PubMed

Vaishampayan, Parag; Probst, Alexander; Krishnamurthi, Srinivasan; Ghosh, Sudeshna; Osman, Shariff; McDowall, Alasdair; Ruckmani, Arunachalam; Mayilraj, Shanmugam; Venkateswaran, Kasthuri

2010-05-01

Five Gram-stain-positive, motile, aerobic strains were isolated from a clean room of the Kennedy Space Center where the Phoenix spacecraft was assembled. All strains are rod-shaped, spore-forming bacteria, whose spores were resistant to UV radiation up to 1000 J m⁻². The spores were subterminally positioned and produced an external layer. A polyphasic taxonomic study including traditional biochemical tests, fatty acid analysis, cell-wall typing, lipid analyses, 16S rRNA gene sequencing and DNA-DNA hybridization studies was performed to characterize these novel strains. 16S rRNA gene sequencing and lipid analyses convincingly grouped these novel strains within the genus Bacillus as a cluster separate from already described species. The similarity of 16S rRNA gene sequences among the novel strains was >99 %, but the similarity was only about 97 % with their nearest neighbours Bacillus pocheonensis, Bacillus firmus and Bacillus bataviensis. DNA-DNA hybridization dissociation values were <24 % to the closest related type strains. The novel strains had a G+C content of 35.6 ± 0.5 mol% and could liquefy gelatin but did not utilize or produce acids from any of the carbon substrates tested. The major fatty acids were iso-C15:0 and anteiso-C15:0 and the cell-wall diamino acid was meso-diaminopimelic acid. Based on phylogenetic and phenotypic results, it is concluded that these strains represent a novel species of the genus Bacillus, for which the name Bacillus horneckiae sp. nov. is proposed. The type strain is 1P01SCᵀ (=NRRL B-59162ᵀ =MTCC 9535ᵀ).
Topological Interaction by Entanglement of DNA

NASA Astrophysics Data System (ADS)

Feng, Lang; Sha, Ruojie; Seeman, Nadrian; Chaikin, Paul

2012-02-01

We find and study a new type of interaction between colloids, Topological Interaction by Entanglement of DNA (TIED), due to concatenation of loops formed by palindromic DNA. Consider a particle coated with palindromic DNA of sequence "P1." Below the DNA hybridization temperature (Tm), loops of the self-complementary DNA form on the particle surface. Direct hybridization with a similar particle covered with a different sequence, P2, does not occur. However, when particles are held together at T > Tm, then cooled to T < Tm, some of the loops entangle and link, similar to an Olympic gel. We quantitatively observe and measure this topological interaction between colloids in a ~5 °C temperature window, ~6 °C lower than direct binding of complementary DNA with similar strength, and introduce the concept of entanglement binding free energy. To prove our interaction to be topological, we unknot the purely entangled binding sites between colloids by adding Topoisomerase I, which unconcatenates our loops. This research suggests novel history-dependent ways of binding particles and serves as a new design tool in colloidal self-assembly.
Interdependence of pyrene interactions and tetramolecular G4-DNA assembly.

PubMed

Doluca, Osman; Withers, Jamie M; Loo, Trevor S; Edwards, Patrick J B; González, Carlos; Filichev, Vyacheslav V

2015-03-28

Controlling the arrangement of organic chromophores in supramolecular architectures is of primary importance for the development of novel functional molecules. Insertion of a twisted intercalating nucleic acid (TINA) moiety, containing phenylethynylpyren-1-yl derivatives, into a G-rich DNA sequence alters G-quadruplex folding, resulting in supramolecular structures with defined pyrene arrangements. Based on CD, NMR and ESI-mass-spectra, as well as TINA excited dimer (excimer) fluorescence emission, we propose that insertion of the TINA monomer in the middle of a dTG4T sequence (i.e. dTGGXGGT, where X is TINA) converts a parallel tetramolecular G-quadruplex into an assembly composed of two identical antiparallel G-quadruplex subunits stacked via a TINA-TINA interface. Kinetic analysis showed that TINA-TINA association controls complex formation in the presence of Na⁺ but barely competes with guanine-mediated association in K⁺ or in the sequence with the longer G-run (dTGGGXGGGT). These results demonstrate new perspectives in the design of molecular entities that can kinetically control G-quadruplex formation and show how tetramolecular G-quadruplexes can be used as a tuneable scaffold to control the arrangement of organic chromophores.
Extreme-Scale De Novo Genome Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
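As an illustration of what reconstructing sequence from short, overlapping, and potentially erroneous reads can involve, the sketch below counts k-mers and discards rare ones as likely errors. It is a minimal single-machine example under stated assumptions, not the HipMer or Meraculous implementation; the function names, the toy reads, and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_count=2):
    """Keep k-mers seen at least min_count times (assumed error filter)."""
    return {kmer for kmer, n in counts.items() if n >= min_count}


# Toy reads (hypothetical); the last read carries a single-base difference.
reads = ["ACGTAC", "CGTACG", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
solid = solid_kmers(counts)
```

A distributed assembler would shard the counting table across nodes; here a single in-memory `Counter` stands in for that data structure.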
Transcriptome Analysis in Venom Gland of the Predatory Giant Ant Dinoponera quadriceps: Insights into the Polypeptide Toxin Arsenal of Hymenopterans

PubMed Central

Chong, Cheong-Meng; Leung, Siu Wai; Prieto-da-Silva, Álvaro R. B.; Havt, Alexandre; Quinet, Yves P.; Martins, Alice M. C.; Lee, Simon M. Y.; Rádis-Baptista, Gandhi

2014-01-01

Background Dinoponera quadriceps is a predatory giant ant that inhabits the Neotropical region and subdues its prey (insects) with stings that deliver a toxic cocktail of molecules. Human accidents occasionally occur and cause local pain and systemic symptoms. A comprehensive study of the D. quadriceps venom gland transcriptome is required to advance our knowledge about the toxin repertoire of the giant ant venom and to understand the physiopathological basis of Hymenoptera envenomation. Results We conducted a transcriptome analysis of a cDNA library from the D. quadriceps venom gland with Sanger sequencing in combination with whole-transcriptome shotgun deep sequencing. From the cDNA library, a total of 420 independent clones were analyzed. Although the proportion of dinoponeratoxin isoform precursors was high, the first giant ant venom inhibitor cysteine-knot (ICK) toxin was found. The deep next generation sequencing yielded a total of 2,514,767 raw reads that were assembled into 18,546 contigs. A BLAST search of the assembled contigs against non-redundant and Swiss-Prot databases showed that 6,463 contigs corresponded to BLASTx hits and indicated an interesting diversity of transcripts related to venom gene expression. The majority of these venom-related sequences code for a major polypeptide core, which comprises venom allergens, lethal-like proteins and esterases, and a minor peptide framework composed of inter-specific structurally conserved cysteine-rich toxins. Both the cDNA library and deep sequencing yielded large proportions of contigs that showed no similarities with known sequences. Conclusions To our knowledge, this is the first report of the venom gland transcriptome of the New World giant ant D. quadriceps. The glandular venom system was dissected, and the toxin arsenal was revealed; this process brought to light novel sequences that included ICK-folded toxins, allergen proteins, esterases (phospholipases and carboxylesterases), and lethal-like toxins. These findings contribute to the understanding of the ecology, behavior and venomics of hymenopterans. PMID:24498135
G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

PubMed

Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

2018-05-01

Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.
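The anchoring idea described above (conserved elements aligned to both genomes, then used to map scaffolds onto reference chromosomes) can be sketched as a simple majority vote. This is an illustrative simplification under stated assumptions, not the G-Anchor pipeline: the function name, the one-hit-per-element dictionaries, and the toy identifiers are hypothetical, and real anchoring would also use positions and orientation.

```python
from collections import Counter, defaultdict


def assign_scaffolds(query_hits, ref_hits):
    """Assign each scaffold to the reference chromosome sharing most anchors.

    query_hits: conserved element id -> scaffold it aligned to (assumed unique)
    ref_hits:   conserved element id -> reference chromosome it aligned to
    """
    votes = defaultdict(Counter)
    for element, scaffold in query_hits.items():
        chrom = ref_hits.get(element)
        if chrom is not None:  # element must align in both genomes to anchor
            votes[scaffold][chrom] += 1
    return {scaffold: tally.most_common(1)[0][0]
            for scaffold, tally in votes.items()}


# Toy alignment hits (hypothetical identifiers).
query_hits = {"e1": "scaf1", "e2": "scaf1", "e3": "scaf1",
              "e4": "scaf2", "e5": "scaf2"}
ref_hits = {"e1": "chr1", "e2": "chr1", "e3": "chr2",
            "e4": "chr3", "e6": "chr1"}
assignment = assign_scaffolds(query_hits, ref_hits)
```

Elements that align in only one genome (`e5`, `e6` here) contribute no vote, which mirrors the requirement that an anchor be present in both genomes.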
Kinetics of interaction of Cotton Leaf Curl Kokhran Virus-Dabawali (CLCuKV-Dab) coat protein and its mutants with ssDNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Priyadarshini, C.G. Poornima; Savithri, H.S.

Geminiviral assembly and transport of viral DNA into the nucleus for replication essentially involve DNA-coat protein interactions. The kinetics of interaction of Cotton Leaf Curl Kokhran Virus-Dabawali recombinant coat protein (rCP) with DNA was studied by electrophoretic mobility shift assay (EMSA) and surface plasmon resonance (SPR). The rCP interacted with ssDNA with a K_A of 2.6 ± 0.29 × 10⁸ M⁻¹ in a sequence non-specific manner. The CP has a conserved C2H2 type zinc finger motif composed of residues C68, C72, H81 and H85. Mutation of these residues to alanine resulted in reduced binding to DNA probes. The H85A mutant rCP showed the least binding, with an approximately 756-fold loss in the association rate and a three-orders-of-magnitude decrease in the binding affinity as compared to rCP. The CP-DNA interactions via the zinc finger motif could play a crucial role in virus assembly and in nuclear transport.

Statistical physics of nucleosome positioning and chromatin structure

NASA Astrophysics Data System (ADS)

Morozov, Alexandre

2012-02-01

Genomic DNA is packaged into chromatin in eukaryotic cells. The fundamental building block of chromatin is the nucleosome, a 147 bp-long DNA molecule wrapped around the surface of a histone octamer. Arrays of nucleosomes are positioned along DNA according to their sequence preferences and folded into higher-order chromatin fibers whose structure is poorly understood. We have developed a framework for predicting sequence-specific histone-DNA interactions and the effective two-body potential responsible for ordering nucleosomes into regular higher-order structures. Our approach is based on the analogy between nucleosomal arrays and a one-dimensional fluid of finite-size particles with nearest-neighbor interactions. We derive simple rules which allow us to predict nucleosome occupancy solely from the dinucleotide content of the underlying DNA sequences. Dinucleotide content determines the degree of stiffness of the DNA polymer and thus defines its ability to bend into the nucleosomal superhelix. As expected, the nucleosome positioning rules are universal for chromatin assembled in vitro on genomic DNA from baker's yeast and from the nematode worm C. elegans, where nucleosome placement follows intrinsic sequence preferences and steric exclusion. However, the positioning rules inferred from in vivo C. elegans chromatin are affected by global nucleosome depletion from chromosome arms relative to central domains, likely caused by the attachment of the chromosome arms to the nuclear membrane. Furthermore, intrinsic nucleosome positioning rules are overwritten in transcribed regions, indicating that chromatin organization is actively managed by the transcriptional and splicing machinery.
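Since the abstract bases its occupancy prediction on dinucleotide content, the sketch below shows one way that input could be tabulated for a DNA window. It computes only the dinucleotide frequencies; the positioning rules and two-body potential themselves are not given in the abstract and are not reproduced here. The function name and example sequence are hypothetical.

```python
from collections import Counter


def dinucleotide_frequencies(seq):
    """Return the fraction of each overlapping dinucleotide in seq."""
    seq = seq.upper()
    total = len(seq) - 1  # number of overlapping dinucleotide positions
    if total < 1:
        return {}
    counts = Counter(seq[i:i + 2] for i in range(total))
    return {dinuc: n / total for dinuc, n in counts.items()}


# Hypothetical short window; a real window would span the 147 bp footprint.
freqs = dinucleotide_frequencies("AATT")
```

Overlapping steps are counted so that every adjacent base pair contributes once, which is the usual convention when dinucleotides are used as a proxy for local DNA stiffness.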
Carbohydrate active enzymes revealed in Coptotermes formosanus transcriptome

USDA-ARS's Scientific Manuscript database

A normalized cDNA library of Coptotermes formosanus was constructed using mixed RNA isolated from workers, soldiers, nymphs and alates of both sexes. Sequencing of this library generated 131,637 EST and 25,939 unigenes were assembled. Carbohydrate active enzymes (CAZymes) revealed in this library we...
De Novo Transcriptome Sequence Assembly from Coconut Leaves and Seeds with a Focus on Factors Involved in RNA-Directed DNA Methylation

PubMed Central

Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L.; Chang, Bill Chia-Han; Matzke, Antonius J. M.; Matzke, Marjori

2014-01-01

Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. PMID:25193496
De novo transcriptome sequence assembly from coconut leaves and seeds with a focus on factors involved in RNA-directed DNA methylation.

PubMed

Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L; Chang, Bill Chia-Han; Matzke, Antonius J M; Matzke, Marjori

2014-09-04

Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. Copyright © 2014 Huang et al.
Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

NASA Astrophysics Data System (ADS)

Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.
Stimuli-Responsive DNA-Based Hydrogels: From Basic Principles to Applications.

PubMed

Kahn, Jason S; Hu, Yuwei; Willner, Itamar

2017-04-18

The base sequence of nucleic acids encodes structural and functional information into the DNA biopolymer. External stimuli such as metal ions, pH, light, or added nucleic acid fuel strands provide triggers to reversibly switch nucleic acid structures such as metal-ion-bridged duplexes, i-motifs, triplex nucleic acids, G-quadruplexes, or programmed double-stranded hybrids of oligonucleotides (DNA). The signal-triggered oligonucleotide structures have been broadly applied to develop switchable DNA nanostructures and DNA machines, and these stimuli-responsive assemblies provide functional scaffolds for the rapidly developing area of DNA nanotechnology. Stimuli-responsive hydrogels undergoing signal-triggered hydrogel-to-solution transitions or signal-controlled stiffness changes attract substantial interest as functional matrices for controlled drug delivery, materials exhibiting switchable mechanical properties, acting as valves or actuators, and "smart" materials for sensing and information processing. The integration of stimuli-responsive oligonucleotides with hydrogel-forming polymers provides versatile means to exploit the functional information encoded in the nucleic acid sequences to yield stimuli-responsive hydrogels exhibiting switchable physical, structural, and chemical properties. Stimuli-responsive DNA-based nucleic acid structures are integrated in acrylamide polymer chains and reversible, switchable hydrogel-to-solution transitions of the systems are demonstrated by applying external triggers, such as metal ions, pH-responsive strands, G-quadruplex, and appropriate counter triggers that bridge and dissociate the polymer chains. By combining stimuli-responsive nucleic acid bridges with thermosensitive poly(N-isopropylacrylamide) (pNIPAM) chains, systems undergoing reversible solution ↔ hydrogel ↔ solid transitions are demonstrated. 
Specifically, by bridging acrylamide polymer chains by two nucleic acid functionalities, where one type of bridging unit provides a stimuli-responsive element and the second unit acts as internal "bridging memory", shape-memory hydrogels undergoing reversible and switchable transitions between shaped hydrogels and shapeless quasi-liquid states are demonstrated. By using stimuli-responsive hydrogel cross-linking units that can assemble the bridging units by two different input signals, the orthogonally-triggered functions of the shape-memory were shown. Furthermore, a versatile approach to assemble stimuli-responsive DNA-based acrylamide hydrogel films on surfaces is presented. The method involves the activation of the hybridization chain-reaction (HCR) by a surface-confined promoter strand, in the presence of acrylamide chains modified with two DNA hairpin structures and appropriate stimuli-responsive tethers. The resulting hydrogel-modified surfaces revealed switchable stiffness properties and signal-triggered catalytic functions. By applying the method to assemble the hydrogel microparticles, substrate-loaded, stimuli-responsive microcapsules are prepared. The signal-triggered DNA-based hydrogel microcapsules are applied as drug carriers for controlled release. The different potential applications and future perspectives of stimuli responsive hydrogels are discussed. Specifically, the use of these smart materials and assemblies as carriers for controlled drug release and as shape-memory matrices for information storage and inscription and the use of surface-confined stimuli-responsive hydrogels, exhibiting switchable stiffness properties, for catalysis and controlled growth of cells are discussed.
Water-Soluble Conjugated Polymers: Self-Assembly and Biosensor Applications

NASA Astrophysics Data System (ADS)

Bazan, Guillermo

2005-03-01

Homogeneous assays can be designed which take advantage of the optical amplification of conjugated polymers and the self-assembly characteristic of aqueous polyelectrolytes. For example, a ssDNA sequence sensor comprises an aqueous solution containing a cationic water soluble conjugated polymer such as poly(9,9-bis(trimethylammonium)-hexyl)-fluorene phenylene) with a peptide nucleic acid (PNA) labeled with a dye (PNA-C*). Signal transduction is controlled by hybridization of the neutral PNA-C* probe and the negative ssDNA target, resulting in favorable electrostatic interactions between the hybrid complex and the cationic polymer. Distance requirements for Förster energy transfer are thus met only when ssDNA of complementary sequence to the PNA-C* probe is present. Signal amplification by the conjugated polymer provides fluorescein emission >25 times higher than that of the directly excited dye. Transduction by electrostatic interactions followed by energy transfer is a general strategy. Examples involving other biomolecular recognition events, such as DNA/DNA, RNA/protein and RNA/RNA, will also be provided. The mechanism of biosensing will be discussed, with special attention to the varying contributions of hydrophobic and electrostatic forces, polymer conformation, charge density, local concentration of C*s and tailored defect sites for aggregation-induced optical changes. Finally, the water solubility of these conjugated polymers opens possibilities for spin casting onto organic materials, without dissolving the underlying layers. This property is useful for fabricating multilayer organic optoelectronic devices by simple solution techniques.
CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in fixed cells.

PubMed

Deng, Wulan; Shi, Xinghua; Tjian, Robert; Lionnet, Timothée; Singer, Robert H

2015-09-22

Direct visualization of genomic loci in the 3D nucleus is important for understanding the spatial organization of the genome and its association with gene expression. Various DNA FISH methods have been developed in the past decades, all involving denaturing dsDNA and hybridizing fluorescent nucleic acid probes. Here we report a novel approach that uses in vitro constituted nuclease-deficient clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) complexes as probes to label sequence-specific genomic loci fluorescently without global DNA denaturation (Cas9-mediated fluorescence in situ hybridization, CASFISH). Using fluorescently labeled nuclease-deficient Cas9 (dCas9) protein assembled with various single-guide RNA (sgRNA), we demonstrated rapid and robust labeling of repetitive DNA elements in pericentromere, centromere, G-rich telomere, and coding gene loci. Assembling dCas9 with an array of sgRNAs tiling arbitrary target loci, we were able to visualize nonrepetitive genomic sequences. The dCas9/sgRNA binary complex is stable and binds its target DNA with high affinity, allowing sequential or simultaneous probing of multiple targets. CASFISH assays using differently colored dCas9/sgRNA complexes allow multicolor labeling of target loci in cells. In addition, the CASFISH assay is remarkably rapid under optimal conditions and is applicable for detection in primary tissue sections. This rapid, robust, less disruptive, and cost-effective technology adds a valuable tool for basic research and genetic diagnosis.
Supramolecular Hydrogels Based on DNA Self-Assembly.

PubMed

Shao, Yu; Jia, Haoyang; Cao, Tianyang; Liu, Dongsheng

2017-04-18

Extracellular matrix (ECM) provides essential supports three dimensionally to the cells in living organs, including mechanical support and signal, nutrition, oxygen, and waste transportation. Thus, using hydrogels to mimic its function has attracted much attention in recent years, especially in tissue engineering, cell biology, and drug screening. However, a hydrogel system that can merit all parameters of the natural ECM is still a challenge. In the past decade, deoxyribonucleic acid (DNA) has arisen as an outstanding building material for the hydrogels, as it has unique properties compared to most synthetic or natural polymers, such as sequence designability, precise recognition, structural rigidity, and minimal toxicity. By simple attachment to polymers as a side chain, DNA has been widely used as cross-links in hydrogel preparation. The formed secondary structures could confer on the hydrogel designable responsiveness, such as response to temperature, pH, metal ions, proteins, DNA, RNA, and small signal molecules like ATP. Moreover, single or multiple DNA restriction enzyme sites could be incorporated into the hydrogels by sequence design and greatly expand the latitude of their responses. Compared with most supramolecular hydrogels, these DNA cross-linked hydrogels could be relatively strong and easily adjustable via sequence variation, but it is noteworthy that these hydrogels still have excellent thixotropic properties and could be easily injected through a needle. In addition, the quick formation of duplex has also enabled the multilayer three-dimensional injection printing of living cells with the hydrogel as matrix. When the matrix is built purely by DNA assembly structures, the hydrogel inherits all the previously described characteristics; however, the long persistence length of DNA structures excluded the small size meshes of the network and made the hydrogel permeable to nutrition for cell proliferation. 
This unique property greatly expands the cell viability in the three-dimensional matrix to several weeks and also provides an easy way to prepare interpenetrating double network materials. In this Account, we outline the stream of hydrogels based on DNA self-assembly and discuss the mechanism that brings outstanding properties to the materials. Unlike most reported hydrogel systems, the all-in-one character of the DNA hydrogel avoids the "cask effect" in the properties. We believe the hydrogel will greatly benefit cell behavior studies especially in the following aspects: (1) stem cell differentiation can be studied with solely tunable mechanical strength of the matrix; (2) the dynamic nature of the network can allow cell migration through the hydrogel, which will help to build a more realistic model to observe the migration of cancer cells in vivo; (3) combination with rapidly developing three-dimension printing technology, the hydrogel will boost the construction of three-dimensional tissues and artificial organs.
Analysis of European mtDNAs for recombination.

PubMed

Elson, J L; Andrews, R M; Chinnery, P F; Lightowlers, R N; Turnbull, D M; Howell, N

2001-01-01

The standard paradigm postulates that the human mitochondrial genome (mtDNA) is strictly maternally inherited and that, consequently, mtDNA lineages are clonal. As a result of mtDNA clonality, phylogenetic and population genetic analyses should therefore be free of the complexities imposed by biparental recombination. The use of mtDNA in analyses of human molecular evolution is contingent, in fact, on clonality, which is also a condition that is critical both for forensic studies and for understanding the transmission of pathogenic mtDNA mutations within families. This paradigm, however, has been challenged recently by Eyre-Walker and colleagues. Using two different tests, they have concluded that recombination has contributed to the distribution of mtDNA polymorphisms within the human population. We have assembled a database that comprises the complete sequences of 64 European and 2 African mtDNAs. When this set of sequences was analyzed using any of three measures of linkage disequilibrium, one of the tests of Eyre-Walker and colleagues, there was no evidence for mtDNA recombination. When their test for excess homoplasies was applied to our set of sequences, only a slight excess of homoplasies was observed. We discuss possible reasons that our results differ from those of Eyre-Walker and colleagues. When we take the various results together, our conclusion is that mtDNA recombination has not been sufficiently frequent during human evolution to overturn the standard paradigm.
When less is more: 'slicing' sequencing data improves read decoding accuracy and de novo assembly quality.

PubMed

Lonardi, Stefano; Mirebrahim, Hamid; Wanamaker, Steve; Alpert, Matthew; Ciardo, Gianfranco; Duma, Denisa; Close, Timothy J

2015-09-15

Since the invention of DNA sequencing in the 1970s, computational biologists have had to deal with the problem of de novo genome assembly with limited (or insufficient) depth of sequencing. In this work, we investigate the opposite problem, that is, the challenge of dealing with excessive depth of sequencing. We explore the effect of ultra-deep sequencing data in two domains: (i) the problem of decoding reads to bacterial artificial chromosome (BAC) clones (in the context of the combinatorial pooling design we have recently proposed), and (ii) the problem of de novo assembly of BAC clones. Using real ultra-deep sequencing data, we show that when the depth of sequencing increases over a certain threshold, sequencing errors make these two problems harder and harder (instead of easier, as one would expect with error-free data), and as a consequence the quality of the solution degrades with more and more data. For the first problem, we propose an effective solution based on 'divide and conquer': we 'slice' a large dataset into smaller samples of optimal size, decode each slice independently, and then merge the results. Experimental results on over 15 000 barley BACs and over 4000 cowpea BACs demonstrate a significant improvement in the quality of the decoding and the final assembly. For the second problem, we show for the first time that modern de novo assemblers cannot take advantage of ultra-deep sequencing data. Python scripts to process slices and resolve decoding conflicts are available from http://goo.gl/YXgdHT; software Hashfilter can be downloaded from http://goo.gl/MIyZHs. Contact: stelo@cs.ucr.edu or timothy.close@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
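The 'slice, decode independently, merge' idea described above can be sketched in a few lines. This is a generic illustration, not the authors' scripts: the names `make_slices`, `decode_in_slices`, and `toy_decode` are hypothetical, the decoder is supplied by the caller, and merging by majority vote across slices is an assumed conflict-resolution rule rather than the one used in the paper.

```python
from collections import Counter
from typing import Callable, Sequence


def make_slices(reads: Sequence, slice_size: int) -> list:
    """Cut the read set into consecutive chunks of at most `slice_size` reads."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [reads[i:i + slice_size] for i in range(0, len(reads), slice_size)]


def decode_in_slices(reads: Sequence, slice_size: int,
                     decode_slice: Callable[[Sequence], dict]) -> dict:
    """Decode each slice on its own, then merge by majority vote.

    `decode_slice` maps one slice to {item: label}; items seen in several
    slices receive the label that most slices agreed on (assumed rule).
    """
    votes: dict = {}
    for chunk in make_slices(reads, slice_size):
        for item, label in decode_slice(chunk).items():
            votes.setdefault(item, Counter())[label] += 1
    return {item: counter.most_common(1)[0][0] for item, counter in votes.items()}


def toy_decode(chunk: Sequence) -> dict:
    """Toy decoder: each read is an (item, label) pair; within a slice,
    an item gets its most frequent label."""
    per_item: dict = {}
    for item, label in chunk:
        per_item.setdefault(item, Counter())[label] += 1
    return {item: c.most_common(1)[0][0] for item, c in per_item.items()}
```

In this toy setting a single erroneous label is outvoted as long as most slices containing the item decode it consistently, which is the intuition behind decoding smaller samples separately.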
Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics

PubMed Central

Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629
Aptamer based SERS detection of Salmonella typhimurium using DNA-assembled gold nanodimers.

PubMed

Xu, Xumin; Ma, Xiaoyuan; Wang, Haitao; Wang, Zhouping

2018-06-12

The authors describe a surface-enhanced Raman scattering (SERS) based aptasensor for Salmonella typhimurium (S. typhimurium). Gold nanoparticles (AuNPs; 35 nm i.d.) were functionalized with the aptamer (ssDNA 1) and used as the capture probe, while smaller (15 nm) AuNPs were modified with a Cy3-labeled complementary sequence (ssDNA 2) and used as the signalling probe. The asymmetric gold nanodimers (AuNDs) were assembled from the Raman signal probe and the capture probe via hybridization of the complementary ssDNAs. The gap between two nanoparticles is a "hot spot" in which the Raman reporter Cy3 is localized. It experiences a strong enhancement of the electromagnetic field around the particle. Upon addition, S. typhimurium is bound by the aptamer, which is therefore partially dehybridized from its complementary sequence. Hence, Raman intensity drops. Under the optimal experimental conditions, the SERS signal at 1203 cm⁻¹ increases linearly with the logarithm of the number of colonies in the 10² to 10⁷ cfu·mL⁻¹ concentration range, and the limit of detection is 35 cfu·mL⁻¹. The method can be performed within 1 h and was successfully applied to the analysis of spiked milk samples, where it performed well and with high specificity. Graphical abstract: DNA-assembled asymmetric gold nanodimers (AuNDs) were synthesized and applied in a SERS-based aptasensor for S. typhimurium. The capture probe was preferentially combined with S. typhimurium and the structure of the AuNDs was destroyed. The "hot spot" partly vanished, resulting in the decreased Raman intensity of Cy3.
Silver Nanoparticle Oligonucleotide Conjugates Based on DNA with Triple Cyclic Disulfide Moieties

PubMed Central

Lee, Jae-Seung; Lytton-Jean, Abigail K. R.; Hurst, Sarah J.; Mirkin, Chad A.

2011-01-01

We report a new strategy for preparing silver nanoparticle oligonucleotide conjugates that are based upon DNA with cyclic disulfide-anchoring groups. These particles are extremely stable and can withstand NaCl concentrations up to 1.0 M. When silver nanoparticles functionalized with complementary sequences are combined, they assemble to form DNA-linked nanoparticle networks. This assembly process is reversible with heating and is associated with a red-shifting of the particle surface plasmon resonance and a concomitant color change from yellow to pale red. Analogous to the oligonucleotide-functionalized gold nanoparticles, these particles also exhibit highly cooperative binding properties with extremely sharp melting transitions. This work is an important step towards being able to use silver nanoparticle oligonucleotide conjugates for a variety of purposes, including molecular diagnostic labels, synthons in programmable materials synthesis approaches, and functional components for nanoelectronic and plasmonic devices. PMID:17571909
Restarting and recentering genetic algorithm variations for DNA fragment assembly: The necessity of a multi-strategy approach.

PubMed

Hughes, James Alexander; Houghten, Sheridan; Ashlock, Daniel

2016-12-01

DNA fragment assembly, an NP-hard problem, is one of the major steps in DNA sequencing. Multiple strategies have been used for this problem, including greedy graph-based algorithms, de Bruijn graphs, and the overlap-layout-consensus approach. This study focuses on the overlap-layout-consensus approach. Heuristics and computational intelligence methods are combined to exploit their respective benefits. These algorithm combinations were able to produce high quality results surpassing the best results obtained by a number of competitive algorithms specially designed and tuned for this problem on thirteen of sixteen popular benchmarks. This work also reinforces the necessity of using multiple search strategies as it is clearly observed that algorithm performance is dependent on problem instance; without a deeper look into many searches, top solutions could be missed entirely. Copyright © 2016. Published by Elsevier Ireland Ltd.
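To make the fragment assembly problem concrete, here is a minimal greedy overlap-merge sketch of the kind the abstract lists among existing strategies. It is not the genetic-algorithm method of the study: it assumes error-free fragments from a single strand, uses exact suffix-prefix matching, and the function names and the `min_len` threshold are illustrative choices.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`
    (0 if shorter than `min_len`)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are fully contained in another fragment.
    contigs = [c for c in contigs
               if not any(c != other and c in other for other in contigs)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
    print(greedy_assemble(reads))
```

Greedy merging commits to locally best overlaps and can be misled by repeats, which is one reason the search-based layout strategies discussed in the abstract are of interest.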
Mapping the yeast genome by melting in nanofluidic devices

NASA Astrophysics Data System (ADS)

Welch, Robert L.; Czolkos, Ilja; Sladek, Rob; Reisner, Walter

2012-02-01

Optical mapping of DNA provides large-scale genomic information that can be used to assemble contigs from next-generation sequencing, and to detect re-arrangements between single cells. A recent optical mapping technique called denaturation mapping has the unique advantage of using physical principles rather than the action of enzymes to probe genomic structure. The absence of reagents or reaction steps makes denaturation mapping simpler than other protocols. Denaturation mapping uses fluorescence microscopy to image the pattern of partial melting along a DNA molecule extended in a channel of cross-section ~100 nm at the heart of a nanofluidic device. We successfully aligned melting maps from single DNA molecules to a theoretical map of the yeast genome (11.6 Mbp) to identify their location. By aligning hundreds of molecules we assembled a consensus melting map of the yeast genome with 95% coverage.
Accessory factors promote AlfA-dependent plasmid segregation by regulating filament nucleation, disassembly, and bundling

PubMed Central

Polka, Jessica K.; Kollman, Justin M.; Mullins, R. Dyche

2014-01-01

In bacteria, some plasmids are partitioned to daughter cells by assembly of actin-like proteins (ALPs). The best understood ALP, ParM, has a core set of biochemical properties that contributes to its function, including dynamic instability, spontaneous nucleation, and bidirectional elongation. AlfA, an ALP that pushes plasmids apart in Bacillus, relies on a different set of underlying properties to segregate DNA. AlfA elongates unidirectionally and is not dynamically unstable; its assembly and disassembly are regulated by a cofactor, AlfB. Free AlfB breaks up AlfA bundles and promotes filament turnover. However, when AlfB is bound to the centromeric DNA sequence, parN, it forms a segrosome complex that nucleates and stabilizes AlfA filaments. When reconstituted in vitro, this system creates polarized, motile comet tails that associate by antiparallel filament bundling to form bipolar, DNA-segregating spindles. PMID:24481252
The Release 6 reference sequence of the Drosophila melanogaster genome

DOE PAGES

Hoskins, Roger A.; Carlson, Joseph W.; Wan, Kenneth H.; ...

2015-01-14

Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. In conclusion, further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.
Deep sampling of the Palomero maize transcriptome by a high throughput strategy of pyrosequencing.

PubMed

Vega-Arreguín, Julio C; Ibarra-Laclette, Enrique; Jiménez-Moraila, Beatriz; Martínez, Octavio; Vielle-Calzada, Jean Philippe; Herrera-Estrella, Luis; Herrera-Estrella, Alfredo

2009-07-06

In-depth sequencing analysis has not been able to determine the overall complexity of transcriptional activity of a plant organ or tissue sample. In some cases, deep parallel sequencing of Expressed Sequence Tags (ESTs), although not yet optimized for the sequencing of cDNAs, has represented an efficient procedure for validating gene prediction and estimating overall gene coverage. This approach could be very valuable for complex plant genomes. In addition, little emphasis has been given to efforts aiming at an estimation of the overall transcriptional universe found in a multicellular organism at a specific developmental stage. To explore, in depth, the transcriptional diversity in an ancient maize landrace, we developed a protocol to optimize the sequencing of cDNAs and performed 4 consecutive GS20-454 pyrosequencing runs of a cDNA library obtained from 2 week-old Palomero Toluqueño maize plants. The protocol reported here allowed obtaining over 90% of informative sequences. These GS20-454 runs generated over 1.5 Million reads, representing the largest amount of sequences reported from a single plant cDNA library. A collection of 367,391 quality-filtered reads (30.09 Mb) from a single run was sufficient to identify transcripts corresponding to 34% of public maize ESTs databases; total sequences generated after 4 filtered runs increased this coverage to 50%. Comparisons of all 1.5 Million reads to the Maize Assembled Genomic Islands (MAGIs) provided evidence for the transcriptional activity of 11% of MAGIs. We estimate that 5.67% (86,069 sequences) do not align with public ESTs or annotated genes, potentially representing new maize transcripts. Following the assembly of 74.4% of the reads in 65,493 contigs, real-time PCR of selected genes confirmed a predicted correlation between the abundance of GS20-454 sequences and corresponding levels of gene expression. 
A protocol was developed that significantly increases the number, length and quality of cDNA reads using massive 454 parallel sequencing. We show that recurrent 454 pyrosequencing of a single cDNA sample is necessary to attain a thorough representation of the transcriptional universe present in maize, that can also be used to estimate transcript abundance of specific genes. This data suggests that the molecular and functional diversity contained in the vast native landraces remains to be explored, and that large-scale transcriptional sequencing of a presumed ancestor of the modern maize varieties represents a valuable approach to characterize the functional diversity of maize for future agricultural and evolutionary studies.

The molecular biology of environmental aromatic hydrocarbons: Progress report for the period September 1, 1986 through July 31, 1987

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weiss, S.B.

Our laboratory has explored the use of short DNA oligomers as targets for activated polycyclic aromatic hydrocarbons, such as benzo(a)pyrene diol epoxide (BPDE), in order to detect alterations in DNA sequence arrangement. In this model system, oligomers alkylated with (+)-BPDE are ligated into M13 viral DNA and used to transfect Escherichia coli. These cells are plated on agar, incubated at 37 °C, progeny viral clones are selected, amplified, and the viral DNAs isolated are sequenced at the site of oligomer insertion. We have devised a procedure for the preparation of unique duplex DNA oligomers such that the site of oligomer alkylation is specific for a single deoxynucleotide species in the two DNA strands. The procedure for oligomer assembly also allows us to vary the position of the alkylated residue in each of the two strands. Using our model system, the results obtained over the past year can be summarized as follows. When nonalkylated oligomer constructs are ligated into M13 viral DNA and used to transfect E. coli, no modifications in DNA sequence arrangement are detected in progeny viral DNAs. On the other hand, with oligomer constructs containing BP-adducts two major types of modifications in DNA sequence arrangement were observed: (1) large deletions, and (2) nonhomologous (illegitimate) recombinants. Both of these DNA modifications result in the complete removal of the oligomer insert. Transfection of E. coli that are recA⁻ does not alter these DNA modifications, therefore, it appears that the deletions and recombinants induced by the alkylated inserts are not under control of the RecA gene. As the distance between the alkylated residues in the duplex strands is increased, the number of recombinant events detected is reduced. In addition to the above types of DNA modifications, restoration of the original nucleotide sequence in the alkylated construct was also observed in progeny viral DNAs. 7 refs., 6 figs., 2 tabs.
Draft genome of the Peruvian scallop Argopecten purpuratus.

PubMed

Li, Chao; Liu, Xiao; Liu, Bo; Ma, Bin; Liu, Fengqiao; Liu, Guilong; Shi, Qiong; Wang, Chunde

2018-04-01

The Peruvian scallop, Argopecten purpuratus, is mainly cultured in southern Chile and Peru and was introduced into China in the last century. Unlike other Argopecten scallops, the Peruvian scallop normally has a long life span of up to 7 to 10 years. Therefore, researchers have been using it to develop hybrid vigor. Here, we performed whole genome sequencing, assembly, and gene annotation of the Peruvian scallop, with an important aim to develop genomic resources for genetic breeding in scallops. A total of 463.19-Gb raw DNA reads were sequenced. A draft genome assembly of 724.78 Mb was generated (accounting for 81.87% of the estimated genome size of 885.29 Mb), with a contig N50 size of 80.11 kb and a scaffold N50 size of 1.02 Mb. Repeat sequences were calculated to reach 33.74% of the whole genome, and 26,256 protein-coding genes and 3,057 noncoding RNAs were predicted from the assembly. We generated a high-quality draft genome assembly of the Peruvian scallop, which will provide a solid resource for further genetic breeding and for the analysis of the evolutionary history of this economically important scallop.
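The contig and scaffold N50 values quoted in this record follow the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal Python sketch of that definition is given here for orientation. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the scallop assembly; only the two sizes used for the coverage check come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy contig lengths, for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

# Coverage check using the two sizes reported in the abstract (in Mb).
assembled_percent = round(724.78 / 885.29 * 100, 2)
```

For the toy lengths the total is 300, so the running sum passes 150 at the second contig and `toy_n50` is 80. The coverage check reproduces the 81.87% reported in the abstract.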
Modularly assembled designer TAL effector nucleases for targeted gene knockout and gene replacement in eukaryotes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, T; Huang, S; Zhao, XF

Recent studies indicate that the DNA recognition domain of transcription activator-like (TAL) effectors can be combined with the nuclease domain of FokI restriction enzyme to produce TAL effector nucleases (TALENs) that, in pairs, bind adjacent DNA target sites and produce double-strand breaks between the target sequences, stimulating non-homologous end-joining and homologous recombination. Here, we exploit the four prevalent TAL repeats and their DNA recognition cipher to develop a 'modular assembly' method for rapid production of designer TALENs (dTALENs) that recognize unique DNA sequence up to 23 bases in any gene. We have used this approach to engineer 10 dTALENs to target specific loci in native yeast chromosomal genes. All dTALENs produced high rates of site-specific gene disruptions and created strains with expected mutant phenotypes. Moreover, dTALENs stimulated high rates (up to 34%) of gene replacement by homologous recombination. Finally, dTALENs caused no detectable cytotoxicity and minimal levels of undesired genetic mutations in the treated yeast strains. These studies expand the realm of verified TALEN activity from cultured human cells to an intact eukaryotic organism and suggest that low-cost, highly dependable dTALENs can assume a significant role for gene modifications of value in human and animal health, agriculture and industry.
De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

PubMed Central

2012-01-01

Background Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Results Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80–120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. 
Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins. Conclusions Before availability of pepper genome sequence, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of Sanger-EST and Illumina transcriptome assemblies. These and other information have been curated in a database that we have dedicated for pepper project. PMID:23110314
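The SNP filtering step in this record (keeping SNPs at least 50 bp from intron-exon junctions and from flanking SNPs) can be illustrated with a short Python sketch. The authors report using BWA, SAMtools and in-house Perl scripts, so this is not their code. The function name `filter_isolated_snps`, the single-contig coordinate model, and the example positions are all assumptions made for illustration.

```python
def filter_isolated_snps(snp_positions, junction_positions, min_distance=50):
    """Keep SNPs at least min_distance bp from junctions and other SNPs.

    Assumes all positions are coordinates on the same contig.
    """
    snps = sorted(snp_positions)
    junctions = list(junction_positions)
    kept = []
    for i, pos in enumerate(snps):
        # Only the nearest SNP on each side can be the closest one.
        neighbours = []
        if i > 0:
            neighbours.append(snps[i - 1])
        if i + 1 < len(snps):
            neighbours.append(snps[i + 1])
        too_close = any(
            abs(pos - other) < min_distance for other in neighbours + junctions
        )
        if not too_close:
            kept.append(pos)
    return kept


# Hypothetical positions, for illustration only.
example = filter_isolated_snps([100, 120, 300, 500], [530])
```

In the example, the SNPs at 100 and 120 are 20 bp apart and are both dropped, the SNP at 500 lies 30 bp from the junction at 530 and is dropped, and only the SNP at 300 is kept.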
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
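The screening idea in this record, flagging exact k-mers that occur several times in a genome, can be sketched in a few lines of Python. The defaults of k = 17 and a minimum of four occurrences mirror the "17-mers occurring more than three times" criterion stated in the abstract. The function name `repeated_kmers` is hypothetical, and counting on the forward strand only is a simplifying assumption, since the abstract does not say how reverse-complement matches were handled.

```python
from collections import Counter


def repeated_kmers(sequence, k=17, min_count=4):
    """Return {kmer: count} for exact k-mers seen at least min_count times.

    Overlapping occurrences are counted; only the given strand is scanned.
    """
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Hypothetical toy sequence with a short k so the effect is visible.
toy_hits = repeated_kmers("ACGACGACGACG", k=3, min_count=4)
```

For the toy sequence, "ACG" occurs four times while "CGA" and "GAC" occur three times each, so only "ACG" is reported. For a mitochondrial genome under 100 kb, a single pass with `Counter` of this kind should be inexpensive.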
A Surrogate Approach to Study the Evolution of Noncoding DNA Elements That Organize Eukaryotic Genomes

PubMed Central

Vermaak, Danielle; Bayes, Joshua J.

2009-01-01

Comparative genomics provides a facile way to address issues of evolutionary constraint acting on different elements of the genome. However, several important DNA elements have not reaped the benefits of this new approach. Some have proved intractable to current day sequencing technology. These include centromeric and heterochromatic DNA, which are essential for chromosome segregation as well as gene regulation, but the highly repetitive nature of the DNA sequences in these regions makes them difficult to assemble into longer contigs. Other sequences, like dosage compensation X chromosomal sites, origins of DNA replication, or heterochromatic sequences that encode piwi-associated RNAs, have proved difficult to study because they do not have recognizable DNA features that allow them to be described functionally or computationally. We have employed an alternate approach to the direct study of these DNA elements. By using proteins that specifically bind these noncoding DNAs as surrogates, we can indirectly assay the evolutionary constraints acting on these important DNA elements. We review the impact that such “surrogate strategies” have had on our understanding of the evolutionary constraints shaping centromeres, origins of DNA replication, and dosage compensation X chromosomal sites. These have begun to reveal that in contrast to the view that such structural DNA elements are either highly constrained (under purifying selection) or free to drift (under neutral evolution), some of them may instead be shaped by adaptive evolution and genetic conflicts (these are not mutually exclusive). These insights also help to explain why the same elements (e.g., centromeres and replication origins), which are so complex in some eukaryotic genomes, can be simple and well defined in others where similar conflicts do not exist. PMID:19635763
bold: The Barcode of Life Data System (http://www.barcodinglife.org)

PubMed Central

RATNASINGHAM, SUJEEVAN; HEBERT, PAUL D N

2007-01-01

The Barcode of Life Data System (bold) is an informatics workbench aiding the acquisition, storage, analysis and publication of DNA barcode records. By assembling molecular, morphological and distributional data, it bridges a traditional bioinformatics chasm. bold is freely available to any researcher with interests in DNA barcoding. By providing specialized services, it aids the assembly of records that meet the standards needed to gain BARCODE designation in the global sequence databases. Because of its web-based delivery and flexible data security model, it is also well positioned to support projects that involve broad research alliances. This paper provides a brief introduction to the key elements of bold, discusses their functional capabilities, and concludes by examining computational resources and future prospects. PMID:18784790
3G vector-primer plasmid for constructing full-length-enriched cDNA libraries.

PubMed

Zheng, Dong; Zhou, Yanna; Zhang, Zidong; Li, Zaiyu; Liu, Xuedong

2008-09-01

We designed a 3G vector-primer plasmid for the generation of full-length-enriched complementary DNA (cDNA) libraries. By employing the terminal transferase activity of reverse transcriptase and the modified strand replacement method, this plasmid (assembled with a polydT end and a deoxyguanosine [dG] end) combines priming full-length cDNA strand synthesis and directional cDNA cloning. As a result, the number of steps involved in cDNA library preparation is decreased while simplifying downstream gene manipulation, sequencing, and subcloning. The 3G vector-primer plasmid method yields fully represented plasmid primed libraries that are equivalent to those made by the SMART (switching mechanism at 5' end of RNA transcript) approach.
Optical properties and electronic transitions of DNA oligonucleotides as a function of composition and stacking sequence.

PubMed

Schimelman, Jacob B; Dryden, Daniel M; Poudel, Lokendra; Krawiec, Katherine E; Ma, Yingfang; Podgornik, Rudolf; Parsegian, V Adrian; Denoyer, Linda K; Ching, Wai-Yim; Steinmetz, Nicole F; French, Roger H

2015-02-14

The role of base pair composition and stacking sequence in the optical properties and electronic transitions of DNA is of fundamental interest. We present and compare the optical properties of DNA oligonucleotides (AT)10, (AT)5(GC)5, and (AT-GC)5 using both ab initio methods and UV-vis molar absorbance measurements. Our data indicate a strong dependence of both the position and intensity of UV absorbance features on oligonucleotide composition and stacking sequence. The partial densities of states for each oligonucleotide indicate that the valence band edge arises from a feature associated with the PO4(3-) complex anion, and the conduction band edge arises from anti-bonding states in DNA base pairs. The results show a strong correspondence between the ab initio and experimentally determined optical properties. These results highlight the benefit of full spectral analysis of DNA, as opposed to reductive methods that consider only the 260 nm absorbance (A260) or simple purity ratios, such as A260/A230 or A260/A280, and suggest that the slope of the absorption edge onset may provide a useful metric for the degree of base pair stacking in DNA. These insights may prove useful for applications in biology, bioelectronics, and mesoscale self-assembly.
Biorecognition by DNA oligonucleotides after Exposure to Photoresists and Resist Removers

PubMed Central

Dean, Stacey L.; Morrow, Thomas J.; Patrick, Sue; Li, Mingwei; Clawson, Gary; Mayer, Theresa S.; Keating, Christine D.

2013-01-01

Combining biological molecules with integrated circuit technology is of considerable interest for next generation sensors and biomedical devices. Current lithographic microfabrication methods, however, were developed for compatibility with silicon technology rather than bioorganic molecules and consequently it cannot be assumed that biomolecules will remain attached and intact during on-chip processing. Here, we evaluate the effects of three common photoresists (Microposit S1800 series, PMGI SF6, and Megaposit SPR 3012) and two photoresist removers (acetone and 1165 remover) on the ability of surface-immobilized DNA oligonucleotides to selectively recognize their reverse-complementary sequence. Two common DNA immobilization methods were compared: adsorption of 5′-thiolated sequences directly to gold nanowires and covalent attachment of 5′-thiolated sequences to surface amines on silica coated nanowires. We found that acetone had deleterious effects on selective hybridization as compared to 1165 remover, presumably due to incomplete resist removal. Use of the PMGI photoresist, which involves a high temperature bake step, was detrimental to the later performance of nanowire-bound DNA in hybridization assays, especially for DNA attached via thiol adsorption. The other three photoresists did not substantially degrade DNA binding capacity or selectivity for complementary DNA sequences. To determine if the lithographic steps caused more subtle damage, we also tested oligonucleotides containing a single base mismatch. Finally, a two-step photolithographic process was developed and used in combination with dielectrophoretic nanowire assembly to produce an array of doubly-contacted, electrically isolated individual nanowire components on a chip. Post-fabrication fluorescence imaging indicated that nanowire-bound DNA was present and able to selectively bind complementary strands. PMID:23952639
Evaluation of the Implications of Nanoscale Architectures on Contextual Knowledge Discovery and Memory: Self-Assembled Architectures and Memory

DTIC Science & Technology

2008-05-01

patterns. Our strategy to nucleate Ag nanoparticles has been to use a templating protein (e.g., streptavidin) that has been chemically pre-charged with...assembly is used to direct the formation of switching devices and wires to create logic circuitry, memory, and I/O interfaces. We can control the reaction...determines the formation of structures (through complementarity). Sequence design is important because it determines many aspects of the target DNA
Cloning, Assembly, and Modification of the Primary Human Cytomegalovirus Isolate Toledo by Yeast-Based Transformation-Associated Recombination.

PubMed

Vashee, Sanjay; Stockwell, Timothy B; Alperovich, Nina; Denisova, Evgeniya A; Gibson, Daniel G; Cady, Kyle C; Miller, Kristofer; Kannan, Krishna; Malouli, Daniel; Crawford, Lindsey B; Voorhies, Alexander A; Bruening, Eric; Caposio, Patrizia; Früh, Klaus

2017-01-01

Genetic engineering of cytomegalovirus (CMV) currently relies on generating a bacterial artificial chromosome (BAC) by introducing a bacterial origin of replication into the viral genome using in vivo recombination in virally infected tissue culture cells. However, this process is inefficient, results in adaptive mutations, and involves deletion of viral genes to avoid oversized genomes when inserting the BAC cassette. Moreover, BAC technology does not permit the simultaneous manipulation of multiple genome loci and cannot be used to construct synthetic genomes. To overcome these limitations, we adapted synthetic biology tools to clone CMV genomes in Saccharomyces cerevisiae. Using an early passage of the human CMV isolate Toledo, we first applied transformation-associated recombination (TAR) to clone 16 overlapping fragments covering the entire Toledo genome in Saccharomyces cerevisiae. Then, we assembled these fragments by TAR in a stepwise process until the entire genome was reconstituted in yeast. Since next-generation sequence analysis revealed that the low-passage-number isolate represented a mixture of parental and fibroblast-adapted genomes, we selectively modified individual DNA fragments of fibroblast-adapted Toledo (Toledo-F) and again used TAR assembly to recreate parental Toledo (Toledo-P). Linear, full-length HCMV genomes were transfected into human fibroblasts to recover virus. Unlike Toledo-F, Toledo-P displayed characteristics of primary isolates, including broad cellular tropism in vitro and the ability to establish latency and reactivation in humanized mice. Our novel strategy thus enables de novo cloning of CMV genomes, more-efficient genome-wide engineering, and the generation of viral genomes that are partially or completely derived from synthetic DNA.
IMPORTANCE The genomes of large DNA viruses, such as human cytomegalovirus (HCMV), are difficult to manipulate using current genetic tools, and at this time, it is not possible to obtain molecular clones of CMV without extensive tissue culture. To overcome these limitations, we used synthetic biology tools to capture genomic fragments from viral DNA and assemble full-length genomes in yeast. Using an early passage of the HCMV isolate Toledo containing a mixture of wild-type and tissue culture-adapted virus, we directly cloned the majority sequence and recreated the minority sequence by simultaneous modification of multiple genomic regions. Thus, our novel approach not only provides a paradigm to efficiently engineer HCMV and other large DNA viruses on a genome-wide scale but also facilitates the cloning and genetic manipulation of primary isolates and provides a pathway to generating entirely synthetic genomes.
Cloning, Assembly, and Modification of the Primary Human Cytomegalovirus Isolate Toledo by Yeast-Based Transformation-Associated Recombination

PubMed Central

Vashee, Sanjay; Stockwell, Timothy B.; Alperovich, Nina; Denisova, Evgeniya A.; Gibson, Daniel G.; Cady, Kyle C.; Miller, Kristofer; Kannan, Krishna; Malouli, Daniel; Crawford, Lindsey B.; Voorhies, Alexander A.; Bruening, Eric; Caposio, Patrizia

2017-01-01

ABSTRACT Genetic engineering of cytomegalovirus (CMV) currently relies on generating a bacterial artificial chromosome (BAC) by introducing a bacterial origin of replication into the viral genome using in vivo recombination in virally infected tissue culture cells. However, this process is inefficient, results in adaptive mutations, and involves deletion of viral genes to avoid oversized genomes when inserting the BAC cassette. Moreover, BAC technology does not permit the simultaneous manipulation of multiple genome loci and cannot be used to construct synthetic genomes. To overcome these limitations, we adapted synthetic biology tools to clone CMV genomes in Saccharomyces cerevisiae. Using an early passage of the human CMV isolate Toledo, we first applied transformation-associated recombination (TAR) to clone 16 overlapping fragments covering the entire Toledo genome in Saccharomyces cerevisiae. Then, we assembled these fragments by TAR in a stepwise process until the entire genome was reconstituted in yeast. Since next-generation sequence analysis revealed that the low-passage-number isolate represented a mixture of parental and fibroblast-adapted genomes, we selectively modified individual DNA fragments of fibroblast-adapted Toledo (Toledo-F) and again used TAR assembly to recreate parental Toledo (Toledo-P). Linear, full-length HCMV genomes were transfected into human fibroblasts to recover virus. Unlike Toledo-F, Toledo-P displayed characteristics of primary isolates, including broad cellular tropism in vitro and the ability to establish latency and reactivation in humanized mice. Our novel strategy thus enables de novo cloning of CMV genomes, more-efficient genome-wide engineering, and the generation of viral genomes that are partially or completely derived from synthetic DNA. 
IMPORTANCE The genomes of large DNA viruses, such as human cytomegalovirus (HCMV), are difficult to manipulate using current genetic tools, and at this time, it is not possible to obtain molecular clones of CMV without extensive tissue culture. To overcome these limitations, we used synthetic biology tools to capture genomic fragments from viral DNA and assemble full-length genomes in yeast. Using an early passage of the HCMV isolate Toledo containing a mixture of wild-type and tissue culture-adapted virus, we directly cloned the majority sequence and recreated the minority sequence by simultaneous modification of multiple genomic regions. Thus, our novel approach not only provides a paradigm to efficiently engineer HCMV and other large DNA viruses on a genome-wide scale but also facilitates the cloning and genetic manipulation of primary isolates and provides a pathway to generating entirely synthetic genomes. PMID:28989973
De Novo Assembly and Characterization of Four Anthozoan (Phylum Cnidaria) Transcriptomes.

PubMed

Kitchen, Sheila A; Crowder, Camerron M; Poole, Angela Z; Weis, Virginia M; Meyer, Eli

2015-09-17

Many nonmodel species exemplify important biological questions but lack the sequence resources required to study the genes and genomic regions underlying traits of interest. Reef-building corals are famously sensitive to rising seawater temperatures, motivating ongoing research into their stress responses and long-term prospects in a changing climate. A comprehensive understanding of these processes will require extending beyond the sequenced coral genome (Acropora digitifera) to encompass diverse coral species and related anthozoans. Toward that end, we have assembled and annotated reference transcriptomes to develop catalogs of gene sequences for three scleractinian corals (Fungia scutaria, Montastraea cavernosa, Seriatopora hystrix) and a temperate anemone (Anthopleura elegantissima). High-throughput sequencing of cDNA libraries produced ~20-30 million reads per sample, and de novo assembly of these reads produced ~75,000-110,000 transcripts from each sample with size distributions (mean ~1.4 kb, N50 ~2 kb), comparable to the distribution of gene models from the coral genome (mean ~1.7 kb, N50 ~2.2 kb). Each assembly includes matches for more than half the gene models from A. digitifera (54-67%) and many reasonably complete transcripts (~5300-6700) spanning nearly the entire gene (ortholog hit ratios ≥0.75). The catalogs of gene sequences developed in this study made it possible to identify hundreds to thousands of orthologs across diverse scleractinian species and related taxa. We used these sequences for phylogenetic inference, recovering known relationships and demonstrating superior performance over phylogenetic trees constructed using single mitochondrial loci. The resources developed in this study provide gene sequences and genetic markers for several anthozoan species. To enhance the utility of these resources for the research community, we developed searchable databases enabling researchers to rapidly recover sequences for genes of interest. 
Our analysis of de novo assembly quality highlights metrics that we expect will be useful for evaluating the relative quality of other de novo transcriptome assemblies. The identification of orthologous sequences and phylogenetic reconstruction demonstrates the feasibility of these methods for clarifying the substantial uncertainties in the existing scleractinian phylogeny. Copyright © 2015 Kitchen et al.
Probing the structure and function of biopolymer-carbon nanotube hybrids with molecular dynamics

NASA Astrophysics Data System (ADS)

Johnson, Robert R.

2009-12-01

Nanoscience deals with the characterization and manipulation of matter on the atomic/molecular size scale in order to deepen our understanding of condensed matter and develop revolutionary technology. Meeting the demands of the rapidly advancing nanotechnological frontier requires novel, multifunctional nanoscale materials. Among the most promising nanomaterials to fulfill this need are biopolymer-carbon nanotube hybrids (Bio-CNT). Bio-CNT consists of a single-walled carbon nanotube (CNT) coated with a self-assembled layer of biopolymers such as DNA or protein. Experiments have demonstrated that these nanomaterials possess a wide range of technologically useful properties with applications in nanoelectronics, medicine, homeland security, environmental safety and microbiology. However, a fundamental understanding of the self-assembly mechanics, structure and energetics of Bio-CNT is lacking. The objective of this thesis is to address this deficiency through molecular dynamics (MD) simulation, which provides an atomic-scale window into the behavior of this unique nanomaterial. MD shows that Bio-CNT composed of single-stranded DNA (ssDNA) self-assembles via the formation of high affinity contacts between DNA bases and the CNT sidewall. Calculation of the base-CNT binding free energy by thermodynamic integration reveals that these contacts result from the attractive π-π stacking interaction. Binding affinities follow the trend G > A > T > C. MD reveals that long ssDNA sequences are driven into a helical wrapping about CNT with a sub-10 nm pitch by electrostatic and torsional interactions in the backbone. A large-scale replica exchange molecular dynamics simulation reveals that ssDNA-CNT hybrids are disordered. At room temperature, ssDNA can reside in several low-energy conformations that contain a sequence-specific arrangement of bases detached from CNT surface.
MD demonstrates that protein-CNT hybrids composed of the Coxsackie-adenovirus receptor are biologically active and function as a nanobiosensor with specific recognition of Knob proteins from the adenovirus capsid. Simulation also shows that the rigid CNT damps structural fluctuations in bound proteins, which may have important ramifications for biosensing devices composed of protein-CNT hybrids. These results expand current knowledge of Bio-CNT and demonstrate the effectiveness of MD for investigations of nanobiomolecular systems.
Diversity in Requirement of Genetic and Epigenetic Factors for Centromere Function in Fungi ▿

PubMed Central

Roy, Babhrubahan; Sanyal, Kaustuv

2011-01-01

A centromere is a chromosomal region on which several proteins assemble to form the kinetochore. The centromere-kinetochore complex helps in the attachment of chromosomes to spindle microtubules to mediate segregation of chromosomes to daughter cells during mitosis and meiosis. In several budding yeast species, the centromere forms in a DNA sequence-dependent manner, whereas in most other fungi, factors other than the DNA sequence also determine the centromere location, as centromeres were able to form on nonnative sequences (neocentromeres) when native centromeres were deleted in engineered strains. Thus, in the absence of a common DNA sequence, the cues that have facilitated centromere formation on a specific DNA sequence for millions of years remain a mystery. Kinetochore formation is facilitated by binding of a centromere-specific histone protein member of the centromeric protein A (CENP-A) family that replaces a canonical histone H3 to form a specialized centromeric chromatin structure. However, the process of kinetochore formation on the rapidly evolving and seemingly diverse centromere DNAs in different fungal species is largely unknown. More interestingly, studies in various yeasts suggest that the factors required for de novo centromere formation (establishment) may be different from those required for maintenance (propagation) of an already established centromere. Apart from the DNA sequence and CENP-A, many other factors, such as posttranslational modification (PTM) of histones at centric and pericentric chromatin, RNA interference, and DNA methylation, are also involved in centromere formation, albeit in a species-specific manner. In this review, we discuss how several genetic and epigenetic factors influence the evolution of structure and function of centromeres in fungal species. PMID:21908596
Interfacing DNA nanodevices with biology: challenges, solutions and perspectives

NASA Astrophysics Data System (ADS)

Vinther, Mathias; Kjems, Jørgen

2016-08-01

The cellular machinery performs millions of complex reactions with extreme precision at the nanoscale. From studying these reactions, scientists have become inspired to build artificial nanosized molecular devices with programmed functions. One of the fundamental tools in designing and creating these nanodevices is molecular self-assembly. In nature, deoxyribonucleic acid (DNA) is inarguably one of the most remarkable self-assembling molecules. Governed by the Watson-Crick base-pairing rules, DNA assembles with a structural reliability and predictability based on sequence composition unlike any other complex biological polymer. This consistency has enabled rational design of hundreds of two- and three-dimensional shapes with a molecular precision and homogeneity not preceded by any other known technology at the nanometer scale. During the last two decades, DNA nanotechnology has undergone a rapid evolution pioneered by the work of Nadrian Seeman (Kallenbach et al 1983 Nature 305 829-31). In particular, the introduction of the versatile DNA Origami technique by Rothemund (2006 Nature 440 297-302) led to an efflorescence of new DNA-based self-assembled nanostructures (Andersen et al 2009 Nature 459 73-6, Douglas et al 2009 Nature 459 414-8, Dietz et al 2009 Science 325 725-30, Han et al 2011 Science 332 342-6, Iinuma et al 2014 Science 344 65-9), and variations of this technique have contributed to an increasing repertoire of DNA nanostructures (Wei et al 2012 Nature 485 623-6, Ke et al 2012 Science 338 1177-83, Benson et al 2015 Nature 523 441-4, Zhang et al 2015 Nat. Nanotechnol. 10 779-84, Scheible et al 2015 Small 11 5200-5). These advances have naturally triggered the question: What can these DNA nanostructures be used for? One of the leading proposed uses for DNA nanotechnology has been in biology and biomedicine, acting as a molecular ‘nanorobot’ or smart drug interacting with the cellular machinery.
In this review, we will explore and examine the perspective of DNA nanotechnology for such use. We summarize which requirements DNA nanostructures must fulfil to function in cellular environments and inside living organisms. In addition, we highlight recent advances in interfacing DNA nanostructures with biology.
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

PubMed

Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

2017-11-28

Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
A global assembly of cotton ESTs

PubMed Central

Udall, Joshua A.; Swanson, Jordan M.; Haller, Karl; Rapp, Ryan A.; Sparks, Michael E.; Hatfield, Jamie; Yu, Yeisoo; Wu, Yingru; Dowd, Caitriona; Arpat, Aladdin B.; Sickler, Brad A.; Wilkins, Thea A.; Guo, Jin Ying; Chen, Xiao Ya; Scheffler, Jodi; Taliercio, Earl; Turley, Ricky; McFadden, Helen; Payton, Paxton; Klueva, Natalya; Allen, Randell; Zhang, Deshui; Haigler, Candace; Wilkerson, Curtis; Suo, Jinfeng; Schulze, Stefan R.; Pierce, Margaret L.; Essenberg, Margaret; Kim, HyeRan; Llewellyn, Danny J.; Dennis, Elizabeth S.; Kudrna, David; Wing, Rod; Paterson, Andrew H.; Soderlund, Cari; Wendel, Jonathan F.

2006-01-01

Approximately 185,000 Gossypium EST sequences comprising >94,800,000 nucleotides were amassed from 30 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including drought stress and pathogen challenges. These libraries were derived from allopolyploid cotton (Gossypium hirsutum; AT and DT genomes) as well as its two diploid progenitors, Gossypium arboreum (A genome) and Gossypium raimondii (D genome). ESTs were assembled using the Program for Assembling and Viewing ESTs (PAVE), resulting in 22,030 contigs and 29,077 singletons (51,107 unigenes). Further comparisons among the singletons and contigs led to recognition of 33,665 exemplar sequences that represent a nonredundant set of putative Gossypium genes containing partial or full-length coding regions and usually one or two UTRs. The assembly, along with their UniProt BLASTX hits, GO annotation, and Pfam analysis results, are freely accessible as a public resource for cotton genomics. Because ESTs from diploid and allotetraploid Gossypium were combined in a single assembly, we were in many cases able to bioinformatically distinguish duplicated genes in allotetraploid cotton and assign them to either the A or D genome. The assembly and associated information provide a framework for future investigation of cotton functional and evolutionary genomics. PMID:16478941
SPRi-based biosensing platforms for detection of specific DNA sequences using thiolate and dithiocarbamate assemblies

NASA Astrophysics Data System (ADS)

Drozd, Marcin; Pietrzak, Mariusz D.; Malinowska, Elżbieta

2018-05-01

The presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated to the detection of a model target oligonucleotide sequence. For this aim, various strategies for immobilizing DNA probes on gold transducers were tested. Besides the typical approaches, chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, a relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time to prepare a DNA-based SPR biosensor. Special emphasis was put on the correlation between the method of DNA immobilization and the composition of the obtained receptor layer. The studies focused on examining the capability of the developed receptor layers to interact with both target DNA and DNA-functionalized AuNPs. It was found that the detection limit for the target DNA sequence (27 nb length) depends on the strategy of probe immobilization and the backfilling method, and in the best case it amounted to 0.66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of the SPR response is presented. The influence of spatial organization and surface density of a receptor layer on the ability to interact with DNA-functionalized AuNPs is discussed. Owing to the best compatibility of receptors immobilized via DTC chemisorption, (1.47 ± 0.4) × 10^12 molecules·cm^-2 (with the calculated area occupied by a single nanoparticle label of 132.7 nm^2), DNA chemisorption based on DTCs is pointed out as especially promising for DNA biosensors utilizing indirect detection in competitive assays.

SPRi-Based Biosensing Platforms for Detection of Specific DNA Sequences Using Thiolate and Dithiocarbamate Assemblies.

PubMed

Drozd, Marcin; Pietrzak, Mariusz D; Malinowska, Elżbieta

2018-01-01

The presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated to the detection of a model target oligonucleotide sequence. For this aim, various strategies for immobilizing DNA probes on gold transducers were tested. Besides the typical approaches, chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, a relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time to prepare a DNA-based SPR biosensor. Special emphasis was put on the correlation between the method of DNA immobilization and the composition of the obtained receptor layer. The studies focused on examining the capability of the developed receptor layers to interact with both target DNA and DNA-functionalized AuNPs. It was found that the detection limit for the target DNA sequence (27 nb length) depends on the strategy of probe immobilization and the backfilling method, and in the best case it amounted to 0.66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of the SPR response is presented. The influence of spatial organization and surface density of a receptor layer on the ability to interact with DNA-functionalized AuNPs is discussed. Owing to the best compatibility of receptors immobilized via DTC chemisorption, (1.47 ± 0.4) × 10^12 molecules·cm^-2 (with the calculated area occupied by a single nanoparticle label of ~132.7 nm^2), DNA chemisorption based on DTCs is pointed out as especially promising for DNA biosensors utilizing indirect detection in competitive assays.
Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

PubMed

Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

2016-03-01

Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest common ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequencing. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for use in the taxonomic assignment of sequences obtained by Sanger sequencing or by third-generation sequencing, of which the main goal is to generate longer sequences. Copyright © 2016 Elsevier B.V. All rights reserved.
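The classification idea summarized in the abstract above (k-mers mapped to a lowest common ancestor, hits weighted along the taxonomy, both strands contributing) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy taxonomy, the function names, and the simple tie handling are hypothetical and are not Kraken's actual implementation or API.

```python
from collections import Counter

# Toy taxonomy (hypothetical): child -> parent; the root is its own parent.
PARENT = {"root": "root", "GenusA": "root", "sp1": "GenusA", "sp2": "GenusA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def lineage(taxon):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path

def lca(a, b):
    """Lowest common ancestor of two taxa in the toy tree."""
    ancestors = set(lineage(a))
    return next(t for t in lineage(b) if t in ancestors)

def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    return min(kmer, kmer[::-1].translate(COMPLEMENT))

def kmers(seq, k):
    """All canonical k-mers of a sequence, in order."""
    return [canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def build_db(references, k):
    """Map each k-mer to the LCA of all reference taxa that contain it."""
    db = {}
    for taxon, seq in references.items():
        for km in kmers(seq, k):
            db[km] = lca(db[km], taxon) if km in db else taxon
    return db

def classify(read, db, k):
    """Assign the read to the hit taxon whose root-to-taxon path carries the most hits."""
    hits = Counter(db[km] for km in kmers(read, k) if km in db)
    if not hits:
        return None  # no k-mer of the read is in the database
    # Path weight of a taxon = hits on the taxon plus hits on all its ancestors.
    score = {t: sum(hits[a] for a in lineage(t)) for t in hits}
    return max(score, key=score.get)  # ties are broken arbitrarily in this sketch

refs = {"sp1": "AAACCCGGGTAT", "sp2": "AAACCCTTGACA"}
db = build_db(refs, k=4)
print(classify("CCGGGTAT", db, k=4))  # read taken from the sp1-specific region
print(classify("AAACCC", db, k=4))    # read taken from the region shared by both species
```

Because k-mers are stored in canonical form, a read and its reverse complement receive the same assignment, which is one simple way to combine the information from both sequence senses.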
Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

PubMed

Chauhan, Sushma; Rahman, Hifzur; Mastan, Shaik G; Pamidimarri, D V N Sudheer; Reddy, Muppala P

2018-07-20

Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling, in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 bp to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments with those of other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from the J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is Jatropha mosaic virus infecting J. gossypifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by the host J. curcas. Copyright © 2018 Elsevier B.V. All rights reserved.
DNA-DNA interaction beyond the ground state

NASA Astrophysics Data System (ADS)

Lee, D. J.; Wynveen, A.; Kornyshev, A. A.

2004-11-01

The electrostatic interaction potential between DNA duplexes in solution is a basis for the statistical mechanics of columnar DNA assemblies. It may also play an important role in recombination of homologous genes. We develop a theory of this interaction that includes thermal torsional fluctuations of DNA using field-theoretical methods and Monte Carlo simulations. The theory extends and rationalizes the earlier suggested variational approach which was developed in the context of a ground state theory of interaction of nonhomologous duplexes. It shows that the heuristic variational theory is equivalent to the Hartree self-consistent field approximation. By comparison of the Hartree approximation with an exact solution based on the QM analogy of path integrals, as well as Monte Carlo simulations, we show that this easily analytically-tractable approximation works very well in most cases. Thermal fluctuations do not remove the ability of DNA molecules to attract each other at favorable azimuthal conformations, neither do they wash out the possibility of electrostatic “snap-shot” recognition of homologous sequences, considered earlier on the basis of ground state calculations. At short distances DNA molecules undergo a “torsional alignment transition,” which is first order for nonhomologous DNA and weaker order for homologous sequences.
Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos

PubMed Central

Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.

2011-01-01

Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. 
Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
The NnCenH3 protein and centromeric DNA sequence profiles of Nelumbo nucifera Gaertn. (sacred lotus) reveal the DNA structures and dynamics of centromeres in basal eudicots.

PubMed

Zhu, Zhixuan; Gui, Songtao; Jin, Jing; Yi, Rong; Wu, Zhihua; Qian, Qian; Ding, Yi

2016-09-01

Centromeres on eukaryotic chromosomes consist of large arrays of DNA repeats that undergo very rapid evolution. Nelumbo nucifera Gaertn. (sacred lotus) is a phylogenetic relict and an aquatic perennial basal eudicot. Studies concerning the centromeres of this basal eudicot species could provide ancient evolutionary perspectives. In this study, we characterized the centromeric marker protein NnCenH3 (sacred lotus centromere-specific histone H3 variant), and used a chromatin immunoprecipitation (ChIP)-based technique to recover the NnCenH3 nucleosome-associated sequences of sacred lotus. The properties of the centromere-binding protein and DNA sequences revealed notable divergence between sacred lotus and other flowering plants, including the following factors: (i) an NnCenH3 alternative splicing variant comprising only a partial centromere-targeting domain, (ii) active genes with low transcription levels in the NnCenH3 nucleosomal regions, and (iii) the prevalence of the Ty1/copia class of long terminal repeat (LTR) retrotransposons in the centromeres of sacred lotus chromosomes. In addition, the dynamic natures of the centromeric region showed that some of the centromeric repeat DNA sequences originated from telomeric repeats, and a pair of centromeres on the dicentric chromosome 1 was inactive in the metaphase cells of sacred lotus. Our characterization of the properties of centromeric DNA structure within the sacred lotus genome describes a centromeric profile in ancient basal eudicots and might provide evidence of the origins and evolution of centromeres. Furthermore, the identification of centromeric DNA sequences is of great significance for the assembly of the sacred lotus genome. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
An improved filtering algorithm for big read datasets and its application to single-cell assembly.

PubMed

Wedemeyer, Axel; Kliemann, Lasse; Srivastav, Anand; Schielke, Christian; Reusch, Thorsten B; Rosenstiel, Philip

2017-07-03

For single-cell or metagenomic sequencing projects, it is necessary to sequence with a very high mean coverage in order to make sure that all parts of the sample DNA get covered by the reads produced. This leads to huge datasets with lots of redundant data. A filtering of this data prior to assembly is advisable. Brown et al. (2012) presented the algorithm Diginorm for this purpose, which filters reads based on the abundance of their k-mers. We present Bignorm, a faster and quality-conscious read filtering algorithm. An important new algorithmic feature is the use of phred quality scores together with a detailed analysis of the k-mer counts to decide which reads to keep. We qualify and recommend parameters for our new read filtering algorithm. Guided by these parameters, we remove a median of 97.15% of the reads while keeping the mean phred score of the filtered dataset high. Using the SPAdes assembler, we produce assemblies of high quality from these filtered datasets in a fraction of the time needed for an assembly from the datasets filtered with Diginorm. We conclude that read filtering is a practical and efficient method for reducing read data and for speeding up the assembly process. This applies not only to single-cell assembly, as shown in this paper, but also to other projects with high mean coverage datasets, such as metagenomic sequencing projects. Our Bignorm algorithm allows assemblies of competitive quality in comparison to Diginorm, while being much faster. Bignorm is available for download at https://git.informatik.uni-kiel.de/axw/Bignorm .
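The abstract above describes filtering reads by k-mer abundance, with phred quality scores informing the decision. A minimal sketch of that general idea follows. It is a Diginorm-style median-abundance filter with one assumed quality rule (k-mers containing a low-quality base are ignored); the function name, parameters, and thresholds are hypothetical, and this is not the decision rule actually used by Bignorm, which the abstract does not specify.

```python
from collections import Counter
from statistics import median

def filter_reads(reads, k=5, cutoff=2, min_phred=20):
    """Keep a read only while its trusted k-mers are still rare among kept reads.

    reads: iterable of (sequence, phred_scores) pairs, one score per base.
    """
    counts = Counter()  # k-mer abundance among the reads kept so far
    kept = []
    for seq, quals in reads:
        trusted = [
            seq[i:i + k]
            for i in range(len(seq) - k + 1)
            if min(quals[i:i + k]) >= min_phred  # skip k-mers with a low-quality base
        ]
        if not trusted:
            continue  # no reliable k-mers at all: drop the read
        if median(counts[km] for km in trusted) < cutoff:
            kept.append((seq, quals))
            counts.update(trusted)
    return kept

good = [30] * 10
reads = (
    [("ACGTTGCAAG", good)] * 5      # highly redundant region
    + [("TTTTGGGGCC", good)]        # region seen only once
    + [("GGGGGAAAAA", [5] * 10)]    # low-quality read
)
kept = filter_reads(reads)
print(len(kept), "of", len(reads), "reads kept")
```

With these toy inputs, the redundant read is kept only until its k-mers reach the cutoff, the singleton read survives, and the low-quality read is discarded, which is the qualitative behavior the abstract attributes to abundance-based filtering.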
Mimicking an Enzyme-Based Colorimetric Aptasensor for Antibiotic Residue Detection in Milk Combining Magnetic Loop-DNA Probes and CHA-Assisted Target Recycling Amplification.

PubMed

Luan, Qian; Gan, Ning; Cao, Yuting; Li, Tianhua

2017-07-19

A mimicking-enzyme-based colorimetric aptasensor was developed for the detection of kanamycin (KANA) in milk using magnetic loop-DNA-NMOF-Pt (m-L-DNA) probes and catalytic hairpin assembly (CHA)-assisted target recycling for signal amplification. The m-L-DNA probes were constructed via hybridization of hairpin DNA H1 (containing aptamer sequence) immobilized magnetic beads (m-H1) and signal DNA (sDNA, partial hybridization with H1) labeled nano Fe-MIL-88NH2-Pt (NMOF-Pt-sDNA). In the presence of KANA and complementary hairpin DNA H2, the m-L-DNA probes decomposed and formed an m-H1/KANA intermediate, which triggered the CHA reaction to form a stable duplex strand (m-H1-H2) while releasing KANA again for recycling. Consequently, numerous NMOF-Pt-sDNA as mimicking enzymes can synergistically catalyze 3,3',5,5'-tetramethylbenzidine (TMB) for color development. The aptasensor exhibited high selectivity and sensitivity for KANA in milk with a detection limit of 0.2 pg mL^-1 within 30 min. The assay can be conveniently extended for on-site screening of other antibiotics in foods by simply changing the base sequence of the probes.
ContEst16S: an algorithm that identifies contaminated prokaryotic genomes using 16S RNA gene sequences.

PubMed

Lee, Imchang; Chalita, Mauricio; Ha, Sung-Min; Na, Seong-In; Yoon, Seok-Hwan; Chun, Jongsik

2017-06-01

Thanks to the recent advancement of DNA sequencing technology, the cost and time of prokaryotic genome sequencing have decreased dramatically. It has repeatedly been reported that genome sequencing using high-throughput next-generation sequencing is prone to contamination due to its high depth of sequencing coverage. Although a few bioinformatics tools are available to detect potential contamination, these have inherent limitations as they only use protein-coding genes. Here we introduce a new algorithm, called ContEst16S, to detect potential contamination using 16S rRNA genes from genome assemblies. We screened 69,745 prokaryotic genomes from the NCBI Assembly Database using ContEst16S and found that 594 were contaminated by bacteria, human and plants. Of the predicted contaminated genomes, 8% were not predicted by the existing protein-coding gene-based tool, implying that both methods can be complementary in the detection of contamination. A web-based service of the algorithm is available at www.ezbiocloud.net/tools/contest16s.
The chaperonin-60 universal target is a barcode for bacteria that enables de novo assembly of metagenomic sequence data.

PubMed

Links, Matthew G; Dumonceaux, Tim J; Hemmingsen, Sean M; Hill, Janet E

2012-01-01

Barcoding with molecular sequences is widely used to catalogue eukaryotic biodiversity. Studies investigating the community dynamics of microbes have relied heavily on gene-centric metagenomic profiling using two genes (16S rRNA and cpn60) to identify and track Bacteria. While there have been criteria formalized for barcoding of eukaryotes, these criteria have not been used to evaluate gene targets for other domains of life. Using the framework of the International Barcode of Life we evaluated DNA barcodes for Bacteria. Candidates from the 16S rRNA gene and the protein coding cpn60 gene were evaluated. Within complete bacterial genomes in the public domain representing 983 species from 21 phyla, the largest difference between median pairwise inter- and intra-specific distances ("barcode gap") was found from cpn60. Distribution of sequence diversity along the ∼555 bp cpn60 target region was remarkably uniform. The barcode gap of the cpn60 universal target facilitated the faithful de novo assembly of full-length operational taxonomic units from pyrosequencing data from a synthetic microbial community. Analysis supported the recognition of both 16S rRNA and cpn60 as DNA barcodes for Bacteria. The cpn60 universal target was found to have a much larger barcode gap than 16S rRNA suggesting cpn60 as a preferred barcode for Bacteria. A large barcode gap for cpn60 provided a robust target for species-level characterization of data. The assembly of consensus sequences for barcodes was shown to be a reliable method for the identification and tracking of novel microbes in metagenomic studies.
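The "barcode gap" defined in the abstract above, the difference between the median pairwise inter-specific and intra-specific distances, can be computed as in the following sketch. The choice of distance (the uncorrected fraction of differing positions between pre-aligned, equal-length sequences), the function names, and the toy records are assumptions for illustration; the abstract does not state which distance measure the authors used.

```python
from itertools import combinations
from statistics import median

def p_distance(a, b):
    """Fraction of differing positions between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns median inter-specific distance minus median intra-specific distance.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return median(inter) - median(intra)

# Hypothetical toy alignment: two species, two sequences each.
records = [
    ("A", "ACGTACGT"),
    ("A", "ACGTACGA"),
    ("B", "TCGTTCGA"),
    ("B", "TCGTTCGT"),
]
gap = barcode_gap(records)
print(f"barcode gap = {gap:.4f}")
```

A larger gap means that sequences from different species are separated from each other more clearly than sequences within a species, which is the property the abstract uses to compare cpn60 with 16S rRNA.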
A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS

PubMed Central

Jiao, Xiaoli; Zheng, Xin; Ma, Liang; Kutty, Geetha; Gogineni, Emile; Sun, Qiang; Sherman, Brad T.; Hu, Xiaojun; Jones, Kristine; Raley, Castle; Tran, Bao; Munroe, David J.; Stephens, Robert; Liang, Dun; Imamichi, Tomozumi; Kovacs, Joseph A.; Lempicki, Richard A.; Huang, Da Wei

2013-01-01

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20 kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. Because it is a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 previously known, closely related DNA amplicons was sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a de novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results. PMID:24179701
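The error assessment described above, aligning reads to known references and reporting a median error rate before and after QC, can be sketched as follows. This is an illustrative simplification under stated assumptions: alignments are supplied as gapped string pairs, errors are counted per alignment column, and the QC step is a plain threshold on a hypothetical per-read pass count rather than the SVM-based multi-parameter method used in the study.

```python
from statistics import median

def error_rate(aligned_read, aligned_ref):
    """Mismatches plus indels per alignment column; '-' marks a gap."""
    errors = sum(r != t for r, t in zip(aligned_read, aligned_ref))
    return errors / len(aligned_ref)

def median_error(alignments, min_passes=0):
    """alignments: list of (aligned_read, aligned_ref, n_passes) tuples.

    Reads with fewer than min_passes passes are excluded (stand-in QC rule).
    """
    rates = [error_rate(r, t) for r, t, n in alignments if n >= min_passes]
    return median(rates)

# Hypothetical toy alignments against the same 10-column reference.
alignments = [
    ("ACGT-ACGTA", "ACGTTACGTA", 1),  # one deletion
    ("ACGTTACGTA", "ACGTTACGTA", 5),  # error-free
    ("ACCTTACGAA", "ACGTTACGTA", 2),  # two mismatches
]
print("median error, no QC:  ", median_error(alignments))
print("median error, with QC:", median_error(alignments, min_passes=3))
```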
A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS.

PubMed

Jiao, Xiaoli; Zheng, Xin; Ma, Liang; Kutty, Geetha; Gogineni, Emile; Sun, Qiang; Sherman, Brad T; Hu, Xiaojun; Jones, Kristine; Raley, Castle; Tran, Bao; Munroe, David J; Stephens, Robert; Liang, Dun; Imamichi, Tomozumi; Kovacs, Joseph A; Lempicki, Richard A; Huang, Da Wei

2013-07-31

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20 kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. Because it is a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 previously known, closely related DNA amplicons was sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a de novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results.
MicroRNA-triggered, cascaded and catalytic self-assembly of functional "DNAzyme ferris wheel" nanostructures for highly sensitive colorimetric detection of cancer cells

NASA Astrophysics Data System (ADS)

Zhou, Wenjiao; Liang, Wenbin; Li, Xin; Chai, Yaqin; Yuan, Ruo; Xiang, Yun

2015-05-01

The construction of DNA nanostructures with various sizes and shapes has significantly advanced during the past three decades, yet the application of these DNA nanostructures for solving real problems is still in the early stage. On the basis of microRNA-triggered, catalytic self-assembly formation of the functional "DNAzyme ferris wheel" nanostructures, we show here a new signal amplification platform for highly sensitive, label-free and non-enzyme colorimetric detection of a small number of human prostate cancer cells. The microRNA (miR-141), which is catalytically recycled and reused, triggers isothermal self-assembly of a pre-designed, G-quadruplex sequence containing hairpin DNAs into "DNAzyme ferris wheel"-like nanostructures (in association with hemin) with horseradish peroxidase mimicking activity. These DNAzyme nanostructures catalyze an intensified color transition of the probe solution for highly sensitive detection of miR-141 down to 0.5 pM with the naked eye, and the monitoring of as low as 283 human prostate cancer cells can also, theoretically, be achieved in a colorimetric approach. The work demonstrated here thus offers new opportunities for the construction of functional DNA nanostructures and for the application of these DNA nanostructures as an effective signal amplification means in the sensitive detection of nucleic acid biomarkers.
Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

PubMed Central

2012-01-01

Background Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. 
Conclusion This study will serve as a valuable genomic resource for tetraploid cotton genome assembly, for cloning genes related to superior agronomic traits, and for further comparative genomic analyses in Gossypium. PMID:23046547
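The reported average inter-locus distance can be cross-checked from the other figures in the abstract. The short sketch below assumes (the abstract does not say so explicitly) that the average is taken over intervals between adjacent loci within each linkage group, so a group with n loci contributes n - 1 intervals. Variable names are illustrative only.

```python
# Figures taken from the abstract above.
previous_loci = 2247
added_loci = 1167
total_loci = 3414
linkage_groups = 26
map_length_cm = 3667.62

# The updated locus count should equal the previous map plus the additions.
assert previous_loci + added_loci == total_loci

# Assumption: each linkage group with n loci has n - 1 intervals,
# so the whole map has (total loci - number of groups) intervals.
intervals = total_loci - linkage_groups
mean_interval_cm = map_length_cm / intervals

print(intervals)                   # 3388
print(round(mean_interval_cm, 2))  # 1.08
```

Under this assumption the result agrees with the 1.08 cM stated in the abstract; dividing by the raw locus count instead gives about 1.07 cM.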
PRP5: a helicase-like protein required for mRNA splicing in yeast.

PubMed Central

Dalbadie-McFarland, G; Abelson, J

1990-01-01

A 96-kDa protein predicted by the DNA sequence of the Saccharomyces cerevisiae PRP5 gene contains a domain that bears a striking resemblance to a family of RNA helicases characterized by the conserved amino acid sequence Asp-Glu-Ala-Asp (D-E-A-D). Previous work indicated that the product of the PRP5 gene is required for splicing and that spliceosome assembly does not occur in its absence. However, its precise role in splicing and the nature of its biochemical activity remained unknown. To examine the role of PRP5 in splicing, we cloned the gene by complementation of a temperature-sensitive mutation and determined its DNA sequence. We discuss here the possible roles for an RNA helicase in splicing and for the activity of the PRP5 protein. PMID:2349233
Draft genome sequence of Cicer reticulatum L., the wild progenitor of chickpea provides a resource for agronomic trait improvement.

PubMed

Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis

2017-02-01

Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop, chickpea (C. arietinum L.). We assembled short-read sequences into a 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared substantial synteny and conservation of gene order with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups, reflecting significant influence of parentage or geographical origin on their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Resolving the phylogenetic position of Darwin's extinct ground sloth (Mylodon darwinii) using mitogenomic and nuclear exon data.

PubMed

Delsuc, Frédéric; Kuch, Melanie; Gibb, Gillian C; Hughes, Jonathan; Szpak, Paul; Southon, John; Enk, Jacob; Duggan, Ana T; Poinar, Hendrik N

2018-05-16

Mylodon darwinii is the extinct giant ground sloth named after Charles Darwin, who first collected its remains in South America. We have successfully obtained a high-quality mitochondrial genome at 99-fold coverage using an Illumina shotgun sequencing of a 12 880-year-old bone fragment from Mylodon Cave in Chile. Low level of DNA damage showed that this sample was exceptionally well preserved for an ancient subfossil, probably the result of the dry and cold conditions prevailing within the cave. Accordingly, taxonomic assessment of our shotgun metagenomic data showed a very high percentage of endogenous DNA with 22% of the assembled metagenomic contigs assigned to Xenarthra. Additionally, we enriched over 15 kb of sequence data from seven nuclear exons, using target sequence capture designed against a wide xenarthran dataset. Phylogenetic and dating analyses of the mitogenomic dataset including all extant species of xenarthrans and the assembled nuclear supermatrix unambiguously place Mylodon darwinii as the sister-group of modern two-fingered sloths, from which it diverged around 22 million years ago. These congruent results from both the mitochondrial and nuclear data support the diphyly of the two modern sloth lineages, implying the convergent evolution of their unique suspensory behaviour as an adaption to arboreality. Our results offer promising perspectives for whole-genome sequencing of this emblematic extinct taxon. © 2018 The Authors.
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L

PubMed Central

2012-01-01

Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach generated a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Vertebrate Genome Evolution in the Light of Fish Cytogenomics and rDNAomics

PubMed Central

Howell, W. Mike

2018-01-01

To understand the cytogenomic evolution of vertebrates, we must first unravel the complex genomes of fishes, which were the first vertebrates to evolve and were ancestors to all other vertebrates. We must not forget the immense time span during which the fish genomes had to evolve. Fish cytogenomics is endowed with unique features which offer irreplaceable insights into the evolution of the vertebrate genome. Due to the general DNA base compositional homogeneity of fish genomes, fish cytogenomics is largely based on mapping DNA repeats that still represent serious obstacles in genome sequencing and assembling, even in model species. Localization of repeats on chromosomes of hundreds of fish species and populations originating from diversified environments have revealed the biological importance of this genomic fraction. Ribosomal genes (rDNA) belong to the most informative repeats and in fish, they are subject to a more relaxed regulation than in higher vertebrates. This can result in formation of a literal ‘rDNAome’ consisting of more than 20,000 copies with their high proportion employed in extra-coding functions. Because rDNA has high rates of transcription and recombination, it contributes to genome diversification and can form reproductive barrier. Our overall knowledge of fish cytogenomics grows rapidly by a continuously increasing number of fish genomes sequenced and by use of novel sequencing methods improving genome assembly. The recently revealed exceptional compositional heterogeneity in an ancient fish lineage (gars) sheds new light on the compositional genome evolution in vertebrates generally. We highlight the power of synergy of cytogenetics and genomics in fish cytogenomics, its potential to understand the complexity of genome evolution in vertebrates, which is also linked to clinical applications and the chromosomal backgrounds of speciation. We also summarize the current knowledge on fish cytogenomics and outline its main future avenues. 
PMID:29443947
Scanning the human genome at kilobase resolution.

PubMed

Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming

2008-05-01

Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
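Steps (1), (2) and (4) of the DGS description can be illustrated in silico. The following is a minimal sketch, not the authors' pipeline: the recognition site, the tag length, the choice to cut at the start of each site, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
import re
from typing import Dict, List, Optional, Tuple


def digest(genome: str, site: str) -> List[Tuple[int, str]]:
    """Split a sequence at every occurrence of a recognition site.

    Simplification: the cut is placed at the first base of the site.
    Returns (start offset, fragment) pairs.
    """
    # A lookahead finds overlapping occurrences of the site as well.
    cuts = [m.start() for m in re.finditer("(?=" + re.escape(site) + ")", genome)]
    bounds = [0] + cuts + [len(genome)]
    return [(s, genome[s:e]) for s, e in zip(bounds, bounds[1:]) if e > s]


def ditag(fragment: str, tag_len: int) -> Optional[str]:
    """Join the tags from the two ends of a fragment; skip fragments too short."""
    if len(fragment) < 2 * tag_len:
        return None
    return fragment[:tag_len] + fragment[-tag_len:]


def reference_ditags(genome: str, site: str, tag_len: int) -> Dict[str, List[Tuple[int, int]]]:
    """Build a lookup from each reference ditag to its fragment coordinates."""
    table: Dict[str, List[Tuple[int, int]]] = {}
    for start, frag in digest(genome, site):
        tag = ditag(frag, tag_len)
        if tag is not None:
            table.setdefault(tag, []).append((start, start + len(frag)))
    return table


# Toy reference sequence with three hypothetical GATC sites.
toy_genome = "AAAA" + "GATC" + "CCCCCC" + "GATC" + "TTTTTT" + "GATC" + "GG"
reference = reference_ditags(toy_genome, "GATC", tag_len=3)

# An experimentally observed ditag is located by direct lookup;
# a ditag absent from the table would flag a candidate structural difference.
print(reference.get("GATTTT"))  # [(14, 24)]
print(reference.get("GATAAA"))  # None
```

A ditag that maps to more than one coordinate pair is ambiguous in this scheme, which is one reason tag length and enzyme choice would matter for specificity in practice.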

Post-Flight Microbial Analysis of Samples from the International Space Station Water Recovery System and Oxygen Generation System

NASA Technical Reports Server (NTRS)

Birmele, Michele N.

2011-01-01

The Regenerative Environmental Control and Life Support System (ECLSS) on the International Space Station (ISS) includes the Water Recovery System (WRS) and the Oxygen Generation System (OGS). The WRS consists of a Urine Processor Assembly (UPA) and Water Processor Assembly (WPA). This report describes microbial characterization of wastewater and surface samples collected from the WRS and OGS subsystems, returned to KSC, JSC, and MSFC on consecutive shuttle flights (STS-129 and STS-130) in 2009-10. STS-129 returned two filters that contained fluid samples from the WPA Waste Tank Orbital Recovery Unit (ORU), one from the waste tank and the other from the ISS humidity condensate. Direct count by microscopic enumeration revealed 8.38 x 10^4 cells per mL in the humidity condensate sample, but none of those cells were recoverable on solid agar media. In contrast, 3.32 x 10^5 cells per mL were measured from a surface swab of the WRS waste tank, including viable bacteria and fungi recovered after S12 days of incubation on solid agar media. Based on rDNA sequencing and phenotypic characterization, a fungus recovered from the filter was determined to be Lecythophora mutabilis. The bacterial isolate was identified by rDNA sequence data to be Methylobacterium radiotolerans. Additional UPA subsystem samples were returned on STS-130 for analysis. Both liquid and solid samples were collected from the Russian urine container (EDV), Distillation Assembly (DA) and Recycle Filter Tank Assembly (RFTA) for post-flight analysis. The bacterium Pseudomonas aeruginosa and fungus Chaetomium brasiliense were isolated from the EDV samples. No viable bacteria or fungi were recovered from RFTA brine samples (N = 6), but multiple samples (N = 11) from the DA and RFTA were found to contain fungal and bacterial cells. Many recovered cells have been identified to genus by rDNA sequencing and carbon source utilization profiling (Biolog GEN III).
The presence of viable bacteria and fungi from WRS and OGS subsystems demonstrates the need for continued monitoring of ECLSS during future ISS operations and investigation of advanced antimicrobial controls.
Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

PubMed Central

2011-01-01

Background Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935
A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data for the black truffle Tuber melanosporum

PubMed Central

2014-01-01

Background Tuber melanosporum, also known in the gastronomic community as “truffle”, features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. Findings We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody (“truffle”), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. Conclusions The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles. PMID:25392735
A comprehensive resource of genomic, epigenomic and transcriptomic sequencing data for the black truffle Tuber melanosporum.

PubMed

Chen, Pao-Yang; Montanini, Barbara; Liao, Wen-Wei; Morselli, Marco; Jaroszewicz, Artur; Lopez, David; Ottonello, Simone; Pellegrini, Matteo

2014-01-01

Tuber melanosporum, also known in the gastronomic community as "truffle", features one of the largest fungal genomes (125 Mb) with an exceptionally high transposable element (TE) and repetitive DNA content (>58%). The main purpose of DNA methylation in fungi is TE silencing. As obligate outcrossing organisms, truffles are bound to a sexual mode of propagation, which together with TEs is thought to represent a major force driving the evolution of DNA methylation. Thus, it was of interest to examine if and how T. melanosporum exploits DNA methylation to maintain genome integrity. We performed whole-genome DNA bisulfite sequencing and mRNA sequencing on different developmental stages of T. melanosporum; namely, fruitbody ("truffle"), free-living mycelium and ectomycorrhiza. The data revealed a high rate of cytosine methylation (>44%), selectively targeting TEs rather than genes with a strong preference for CpG sites. Whole genome DNA sequencing uncovered multiple TE-enriched, copy number variant regions bearing a significant fraction of hypomethylated and expressed TEs, almost exclusively in free-living mycelium propagated in vitro. Treatment of mycelia with 5-azacytidine partially reduced DNA methylation and increased TE transcription. Our transcriptome assembly also resulted in the identification of a set of novel transcripts from 614 genes. The datasets presented here provide valuable and comprehensive (epi)genomic information that can be of interest for evolutionary genomics studies of multicellular (filamentous) fungi, in particular Ascomycetes belonging to the subphylum, Pezizomycotina. Evidence derived from comparative methylome and transcriptome analyses indicates that a non-exhaustive and partly reversible methylation process operates in truffles.
Palaeosymbiosis Revealed by Genomic Fossils of Wolbachia in a Strongyloidean Nematode

PubMed Central

Koutsovoulos, Georgios; Makepeace, Benjamin; Tanya, Vincent N.; Blaxter, Mark

2014-01-01

Wolbachia are common endosymbionts of terrestrial arthropods, and are also found in nematodes: the animal-parasitic filaria, and the plant-parasite Radopholus similis. Lateral transfer of Wolbachia DNA to the host genome is common. We generated a draft genome sequence for the strongyloidean nematode parasite Dictyocaulus viviparus, the cattle lungworm. In the assembly, we identified nearly 1 Mb of sequence with similarity to Wolbachia. The fragments were unlikely to derive from a live Wolbachia infection: most were short, and the genes were disabled through inactivating mutations. Many fragments were co-assembled with definitively nematode-derived sequence. We found limited evidence of expression of the Wolbachia-derived genes. The D. viviparus Wolbachia genes were most similar to filarial strains and strains from the host-promiscuous clade F. We conclude that D. viviparus was infected by Wolbachia in the past, and that clade F-like symbionts may have been the source of filarial Wolbachia infections. PMID:24901418
Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

PubMed Central

Vernick, Kenneth D.

2017-01-01

Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932
Design and preparation of beta-sheet forming repetitive and block-copolymerized polypeptides.

PubMed

Higashiya, Seiichiro; Topilina, Natalya I; Ngo, Silvana C; Zagorevskii, Dmitri; Welch, John T

2007-05-01

The design and rapid construction of libraries of genes coding beta-sheet forming repetitive and block-copolymerized polypeptides bearing various C- and N-terminal sequences are described. The design was based on the assembly of DNA cassettes coding for the (GA)3GX amino acid sequence where the (GAGAGA) sequences would constitute the beta-strand units of a larger beta-sheet assembly. The edges of this beta-sheet would be functionalized by the turn-inducing amino acids (GX). The polypeptides were expressed in Escherichia coli using conventional vectors and were purified by Ni-nitrilotriacetic acid (NTA) chromatography. The correlation of polymer structure with molecular weight was investigated by gel electrophoresis and mass spectrometry. The monomer sequences and post-translational chemical modifications were found to influence the mobility of the polypeptides over the full range of polypeptide molecular weights while the electrophoretic mobility of lower molecular weight polypeptides was more susceptible to C- and N-termini polypeptide modifications.
Complete genome of the cotton bacteria blight pathogen Xanthomonas citri pv. malvacearum strain MSCT

USDA-ARS's Scientific Manuscript database

Xanthomonas citri pv. malvacearum (Xcm) is a major pathogen of Gossypium hirsutum. In this study we report the complete genome of the Xcm strain MSCT assembled from long read DNA sequencing technology. The MSCT genome is the first Xcm genome that has complete coding regions for Xcm transcriptional a...
Zinc finger nuclease technology: advances and obstacles in modelling and treating genetic disorders.

PubMed

Jabalameli, Hamid Reza; Zahednasab, Hamid; Karimi-Moghaddam, Amin; Jabalameli, Mohammad Reza

2015-03-01

Zinc finger nucleases (ZFNs) are engineered restriction enzymes designed to target specific DNA sequences within the genome. Assembly of a zinc finger DNA-binding domain with a DNA-cleavage domain enables the enzyme machinery to target a unique locus in the genome and invoke endogenous DNA repair mechanisms. This machinery offers a versatile approach to allele editing and gene therapy. Here we discuss the architecture of ZFNs and strategies for generating targeted modifications within the genome. We review advances in gene therapy and disease modelling using these enzymes and, finally, discuss the practical obstacles in using this technology. Copyright © 2014 Elsevier B.V. All rights reserved.
Versatile and Programmable DNA Logic Gates on Universal and Label-Free Homogeneous Electrochemical Platform.

PubMed

Ge, Lei; Wang, Wenxiao; Sun, Ximei; Hou, Ting; Li, Feng

2016-10-04

Herein, a novel universal and label-free homogeneous electrochemical platform is demonstrated, on which a complete set of DNA-based two-input Boolean logic gates (OR, NAND, AND, NOR, INHIBIT, IMPLICATION, XOR, and XNOR) is constructed by simply and rationally deploying the designed DNA polymerization/nicking machines without complicated sequence modulation. Single-stranded DNA is employed as the proof-of-concept target/input to initiate or prevent the DNA polymerization/nicking cyclic reactions on these DNA machines to synthesize numerous intact G-quadruplex sequences or binary G-quadruplex subunits as the output. The generated output strands then self-assemble into G-quadruplexes that render remarkable decrease to the diffusion current response of methylene blue and, thus, provide the amplified homogeneous electrochemical readout signal not only for the logic gate operations but also for the ultrasensitive detection of the target/input. This system represents the first example of homogeneous electrochemical logic operation. Importantly, the proposed homogeneous electrochemical logic gates possess the input/output homogeneity and share a constant output threshold value. Moreover, the modular design of DNA polymerization/nicking machines enables the adaptation of these homogeneous electrochemical logic gates to various input and output sequences. The results of this study demonstrate the versatility and universality of the label-free homogeneous electrochemical platform in the design of biomolecular logic gates and provide a potential platform for the further development of large-scale DNA-based biocomputing circuits and advanced biosensors for multiple molecular targets.
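For readers less familiar with the eight gate names listed above, their Boolean behaviour can be tabulated directly. The sketch below is purely logical and says nothing about the DNA machinery or the electrochemical readout. It assumes the common conventions that INHIBIT means "A AND NOT B" and IMPLICATION means "A implies B"; the abstract does not state which input plays which role.

```python
from itertools import product

# Two-input gates named in the abstract, written as functions of bits a and b.
# Assumed conventions: INHIBIT = a AND (NOT b); IMPLICATION = (NOT a) OR b.
GATES = {
    "OR": lambda a, b: a or b,
    "NAND": lambda a, b: not (a and b),
    "AND": lambda a, b: a and b,
    "NOR": lambda a, b: not (a or b),
    "INHIBIT": lambda a, b: a and not b,
    "IMPLICATION": lambda a, b: (not a) or b,
    "XOR": lambda a, b: a != b,
    "XNOR": lambda a, b: a == b,
}


def truth_table(gate: str) -> dict:
    """Return {(a, b): output} for all four input combinations, as 0/1 values."""
    fn = GATES[gate]
    return {(a, b): int(bool(fn(a, b))) for a, b in product((0, 1), repeat=2)}


for name in GATES:
    outputs = truth_table(name)
    # Print outputs in the input order 00, 01, 10, 11.
    print(f"{name:12s}", [outputs[k] for k in sorted(outputs)])
```

Each gate is the complement of another in the set (OR/NOR, AND/NAND, XOR/XNOR, and INHIBIT/IMPLICATION under the conventions assumed here).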
Automated ensemble assembly and validation of microbial genomes.

PubMed

Koren, Sergey; Treangen, Todd J; Hill, Christopher M; Pop, Mihai; Phillippy, Adam M

2014-05-03

The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. 
Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs.
Nucleosome breathing and remodeling constrain CRISPR-Cas9 function

PubMed Central

Isaac, R Stefan; Jiang, Fuguo; Doudna, Jennifer A; Lim, Wendell A; Narlikar, Geeta J; Almeida, Ricardo

2016-01-01

The CRISPR-Cas9 bacterial surveillance system has become a versatile tool for genome editing and gene regulation in eukaryotic cells, yet how CRISPR-Cas9 contends with the barriers presented by eukaryotic chromatin is poorly understood. Here we investigate how the smallest unit of chromatin, a nucleosome, constrains the activity of the CRISPR-Cas9 system. We find that nucleosomes assembled on native DNA sequences are permissive to Cas9 action. However, the accessibility of nucleosomal DNA to Cas9 is variable over several orders of magnitude depending on dynamic properties of the DNA sequence and the distance of the PAM site from the nucleosome dyad. We further find that chromatin remodeling enzymes stimulate Cas9 activity on nucleosomal templates. Our findings imply that the spontaneous breathing of nucleosomal DNA together with the action of chromatin remodelers allow Cas9 to effectively act on chromatin in vivo. DOI: http://dx.doi.org/10.7554/eLife.13450.001 PMID:27130520
Genetic circuit design automation.

PubMed

Nielsen, Alec A K; Der, Bryan S; Shin, Jonghyeon; Vaidyanathan, Prashant; Paralanov, Vanya; Strychalski, Elizabeth A; Ross, David; Densmore, Douglas; Voigt, Christopher A

2016-04-01

Computation can be performed in living cells by DNA-encoded circuits that process sensory information and control biological functions. Their construction is time-intensive, requiring manual part assembly and balancing of regulator expression. We describe a design environment, Cello, in which a user writes Verilog code that is automatically transformed into a DNA sequence. Algorithms build a circuit diagram, assign and connect gates, and simulate performance. Reliable circuit design requires the insulation of gates from genetic context, so that they function identically when used in different circuits. We used Cello to design 60 circuits for Escherichia coli (880,000 base pairs of DNA), for which each DNA sequence was built as predicted by the software with no additional tuning. Of these, 45 circuits performed correctly in every output state (up to 10 regulators and 55 parts), and across all circuits 92% of the output states functioned as predicted. Design automation simplifies the incorporation of genetic circuits into biotechnology projects that require decision-making, control, sensing, or spatial organization. Copyright © 2016, American Association for the Advancement of Science.
An electrochemical impedance biosensor for Hg2+ detection based on DNA hydrogel by coupling with DNAzyme-assisted target recycling and hybridization chain reaction.

PubMed

Cai, Wei; Xie, Shunbi; Zhang, Jin; Tang, Dianyong; Tang, Ying

2017-12-15

In this work, an electrochemical impedance biosensor for highly sensitive detection of Hg2+ is presented, coupling Hg2+-induced activation of a Mg2+-specific DNAzyme (Mg2+-DNAzyme) for target recycling with a hybridization chain reaction (HCR)-assembled DNA hydrogel for signal amplification. First, we synthesized two different copolymer chains, P1 and P2, by modifying hairpin DNA H3 and H4 with acrylamide polymer, respectively. Subsequently, Hg2+ served as the trigger to activate the Mg2+-DNAzyme, which selectively cleaved the ribonucleobase-modified substrate in the presence of Mg2+. The partial substrate strand could dissociate from the DNAzyme structure and hybridize with capture probe H1 to expose its concealed sequence for further hybridization. With the help of the exposed sequence, the HCR between hairpin DNA H3 and H4 in P1 and P2 was initiated and assembled a layer of DNA cross-linked hydrogel on the electrode surface. The resulting non-conductive DNA hydrogel film greatly hindered interfacial electron transfer, which made it possible to construct a highly sensitive impedance biosensor for Hg2+ detection. Under the optimal conditions, the impedance biosensor showed excellent sensitivity and selectivity toward Hg2+ in a concentration range of 0.1 pM to 10 nM, with a detection limit of 0.042 pM. Moreover, real sample analysis revealed that the proposed biosensor is capable of discriminating Hg2+ ions in a reliable and quantitative manner, indicating that this method has promising potential for preliminary application in routine tests. Copyright © 2017 Elsevier B.V. All rights reserved.
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

PubMed Central

2011-01-01

Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives.
This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

PubMed

Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

2011-05-04

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives.
This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.
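As a rough illustration of what a "0.5× genome" means, sequencing depth can be estimated as the total number of sequenced bases divided by the genome size. The sketch below is not taken from the study: the function names and the example read count, read length, and genome size are hypothetical, and the covered-fraction estimate assumes reads land uniformly at random (a Poisson, Lander-Waterman style approximation).

```python
import math

def sequencing_depth(num_reads, read_length, genome_size):
    """Average depth (x-fold coverage): total sequenced bases / genome size."""
    return num_reads * read_length / genome_size

def expected_fraction_covered(depth):
    """Expected fraction of bases hit by at least one read, assuming
    reads are placed uniformly at random (Poisson approximation)."""
    return 1.0 - math.exp(-depth)

# Hypothetical numbers chosen only so that the depth works out to 0.5x.
depth = sequencing_depth(num_reads=2_500_000, read_length=80,
                         genome_size=400_000_000)
fraction = expected_fraction_covered(depth)
print(f"depth = {depth:.2f}x, expected fraction covered = {fraction:.3f}")
```

Under this simple model a single-copy locus is more likely to be missed than hit at 0.5×, whereas high-copy sequences such as the chloroplast genome or rDNA are sampled many times over, which is consistent with the abstract's distinction between characterizing the high copy fraction and merely exploring the low copy fraction.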
Sensitive electrochemical assaying of DNA methyltransferase activity based on mimic-hybridization chain reaction amplified strategy.

PubMed

Zhang, Linqun; Liu, Yuanjian; Li, Ying; Zhao, Yuewu; Wei, Wei; Liu, Songqin

2016-08-24

A mimic-hybridization chain reaction (mimic-HCR) amplified strategy was proposed for sensitive electrochemical detection of DNA methylation and methyltransferase (MTase) activity. In the presence of methylated DNA, DNA-gold nanoparticles (DNA-AuNPs) were captured on the electrode by sandwich-type assembly. This then triggered mimic-HCR of two hairpin probes to produce many long double-helix chains for the insertion of numerous hexaammineruthenium(III) chloride ([Ru(NH3)6]3+, RuHex). As a result, the signal for electrochemical detection of DNA MTase activity could be amplified. If the DNA was non-methylated, however, the sandwich-type assembly would not form because the short double-stranded DNAs (dsDNA) on the Au electrode could be cleaved and digested by restriction endonuclease HpaII (HapII) and exonuclease III (Exo III), resulting in a decreased signal. Based on this, an electrochemical approach for detection of M.SssI MTase activity with high sensitivity was developed. The linear range for M.SssI MTase activity was from 0.05 U mL^-1 to 10 U mL^-1, with a detection limit down to 0.03 U mL^-1. Moreover, this detection strategy holds great promise as an easy-to-use and highly sensitive method for detecting the activity and inhibition of other MTases by exchanging the corresponding DNA sequence. Copyright © 2016 Elsevier B.V. All rights reserved.
[Identification of Tibetan medicine "Dida" of Gentianaceae using DNA barcoding].

PubMed

Liu, Chuan; Zhang, Yu-Xin; Liu, Yue; Chen, Yi-Long; Fan, Gang; Xiang, Li; Xu, Jiang; Zhang, Yi

2016-02-01

The ITS2 barcode was used to identify the Tibetan medicine "Dida" and to secure its quality and safety in medication. ITS2 sequences were amplified from 151 experimental samples representing 13 species from the Tibetan Plateau, including the Gentianaceae genera Swertia, Halenia, Gentianopsis, Comastoma, and Lomatogonium, and the purified PCR products were sequenced. Sequence assembly and consensus sequence generation were performed using CodonCode Aligner V3.7.1. The Kimura 2-Parameter (K2P) distances were calculated using MEGA 6.0, and neighbor-joining (NJ) phylogenetic trees were constructed. After alignment, the ITS2 sequences were 231 bp long and comprised 31 haplotypes, with an average G+C content of 61.40%. The NJ tree strongly supported the clustering of every species into its own clade, with a high identification success rate, except that Swertia bifolia and Swertia wolfangiana could not be distinguished from each other based on sequence divergence. DNA barcoding could be used as a fast and accurate identification method to distinguish the Tibetan medicine "Dida" and ensure its safe use. Copyright© by the Chinese Pharmaceutical Association.
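The Kimura 2-Parameter distance mentioned in this record has a closed form: with P the proportion of aligned sites showing a transition and Q the proportion showing a transversion, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The following is a minimal sketch of that formula for two pre-aligned sequences; it is not MEGA's implementation, the function name `k2p_distance` is hypothetical, and skipping every site that contains a gap or ambiguous base is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are
    skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# One transition (A->G) in ten compared sites.
print(round(k2p_distance("ACGTACGTAC", "GCGTACGTAC"), 4))
```

Two species whose haplotypes are identical across the barcode region give a distance of zero, which is the situation in which a barcode cannot separate them.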
Highly conserved intragenic HSV-2 sequences: Results from next-generation sequencing of HSV-2 UL and US regions from genital swabs collected from 3 continents.

PubMed

Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M

2017-10-01

Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10^7 log10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (UL_US) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated UL_US segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.
[Identification and analysis of Corydalis boweri, Meconopsis horridula and their close related species of the same genus by using ITS2 DNA barcode].

PubMed

Dou, Rong-kun; Bi, Zhen-fei; Bai, Rui-xue; Ren, Yao-yao; Tan, Rui; Song, Liang-ke; Li, Di-qiang; Mao, Can-quan

2015-04-01

This study aimed to ensure the quality and safety of medicinal plants by using ITS2 DNA barcode technology to identify Corydalis boweri, Meconopsis horridula and their closely related species. DNA was extracted from 13 herb samples, including C. boweri and M. horridula from Lhasa, Tibet, and the ITS region was amplified by PCR and sequenced. The 5.8S and 28S regions were removed from 71 ITS2 sequences, both assembled and downloaded from the web. Multiple sequence alignment was completed and the intraspecific and interspecific genetic distances were calculated with MEGA 5.0, and neighbor-joining phylogenetic trees were constructed. We also predicted the ITS2 secondary structure of C. boweri, M. horridula and their closely related species. The results showed that ITS2 as a DNA barcode was able to identify C. boweri, M. horridula and their closely related species effectively. The established ITS2 barcode method provides a regular and safe detection technology for identifying C. boweri, M. horridula and their closely related species, adulterants and counterfeits, in order to ensure their quality control, safe medication, and reasonable development and utilization.

Minimap2: pairwise alignment for nucleotide sequences.

PubMed

Li, Heng

2018-05-10

Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs are either unable to process such data at scale or do so inefficiently, which presses for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability: https://github.com/lh3/minimap2. Contact: hengli@broadinstitute.org.
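The "concave gap cost" in this abstract can be illustrated with a two-piece affine penalty, where a gap of length k costs the minimum of two affine functions: one with a cheap opening and expensive extension, the other with an expensive opening and cheap extension. The sketch below is only an illustration of that idea, not Minimap2's code; the function names are hypothetical and the penalty values are illustrative assumptions.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Standard affine gap cost: one opening penalty plus a per-base extension."""
    return gap_open + length * gap_extend

def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Two-piece affine (concave) gap cost: the cheaper of two affine costs.

    Short gaps are priced by (open1, ext1); beyond the crossover length the
    (open2, ext2) piece takes over, so long INDELs grow more slowly in cost.
    Parameter values are illustrative assumptions.
    """
    return min(affine_gap_cost(length, open1, ext1),
               affine_gap_cost(length, open2, ext2))

for k in (1, 10, 20, 100, 1000):
    print(k, affine_gap_cost(k, 4, 2), two_piece_gap_cost(k))
```

With these example values the two pieces cross at a gap length of 20; a 1000-base gap costs 1024 instead of 2004 under the single affine piece, which is why a long insertion or deletion is less likely to break an alignment into fragments.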
Developmentally programmed DNA splicing in Paramecium reveals short-distance crosstalk between DNA cleavage sites

PubMed Central

Gratias, Ariane; Lepère, Gersende; Garnier, Olivier; Rosa, Sarah; Duharcourt, Sandra; Malinsky, Sophie; Meyer, Eric; Bétermier, Mireille

2008-01-01

Somatic genome assembly in the ciliate Paramecium involves the precise excision of thousands of short internal eliminated sequences (IESs) that are scattered throughout the germline genome and often interrupt open reading frames. Excision is initiated by double-strand breaks centered on the TA dinucleotides that are conserved at each IES boundary, but the factors that drive cleavage site recognition remain unknown. A degenerate consensus was identified previously at IES ends and genetic analyses confirmed the participation of their nucleotide sequence in efficient excision. Even for wild-type IESs, however, variant excision patterns (excised or nonexcised) may be inherited maternally through sexual events, in a homology-dependent manner. We show here that this maternal epigenetic control interferes with the targeting of DNA breaks at IES ends. Furthermore, we demonstrate that a mutation in the TA at one end of an IES impairs DNA cleavage not only at the mutant end but also at the wild-type end. We conclude that crosstalk between both ends takes place prior to their cleavage and propose that the ability of an IES to adopt an excision-prone conformation depends on the combination of its nucleotide sequence and of additional determinants. PMID:18420657
Amplified biosensing using the horseradish peroxidase-mimicking DNAzyme as an electrocatalyst.

PubMed

Pelossof, Gilad; Tel-Vered, Ran; Elbaz, Johann; Willner, Itamar

2010-06-01

The hemin/G-quadruplex horseradish peroxidase-mimicking DNAzyme is assembled on Au electrodes. It reveals bioelectrocatalytic properties and electrocatalyzes the reduction of H2O2. The bioelectrocatalytic functions of the hemin/G-quadruplex DNAzyme are used to develop electrochemical sensors that follow the activity of glucose oxidase and biosensors for the detection of DNA or low-molecular-weight substrates (adenosine monophosphate, AMP). Hairpin nucleic structures that include the G-quadruplex sequence in a caged configuration and the nucleic acid sequence complementary to the analyte DNA, or the aptamer sequence for AMP, are immobilized on Au-electrode surfaces. In the presence of the DNA analyte, or AMP, the hairpin structures are opened, and the hemin/G-quadruplex horseradish peroxidase-mimicking DNAzyme structures are generated on the electrode surfaces. The bioelectrocatalytic cathodic currents generated by the functionalized electrodes, upon the electrochemical reduction of H2O2, provide a quantitative measure for the detection of the target analytes. The DNA target was analyzed with a detection limit of 1 × 10^-12 M, while the detection limit for analyzing AMP was 1 × 10^-6 M. Methods to regenerate the sensing surfaces are presented.
The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences.

PubMed

Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

2012-01-01

Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes.
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated.
The Paramecium Germline Genome Provides a Niche for Intragenic Parasitic DNA: Evolutionary Dynamics of Internal Eliminated Sequences

PubMed Central

Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E.; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

2012-01-01

Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. 
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated. PMID:23071448
Full genome virus detection in fecal samples using sensitive nucleic acid preparation, deep sequencing, and a novel iterative sequence classification algorithm.

PubMed

Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J; Kellam, Paul; van der Hoek, Lia

2014-01-01

We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis.
Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

PubMed Central

Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

2014-01-01

We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106
Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

PubMed

Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

2017-10-17

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.
Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods

PubMed Central

Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.

2017-01-01

Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148
Multiple conformational states of DnaA protein regulate its interaction with DnaA boxes in the initiation of DNA replication.

PubMed

Patel, Meera J; Bhatia, Lavesh; Yilmaz, Gulden; Biswas-Fiss, Esther E; Biswas, Subhasis B

2017-09-01

DnaA protein is the initiator of genomic DNA replication in prokaryotes. It binds to specific DNA sequences in the origin of DNA replication and unwinds small AT-rich sequences downstream for the assembly of the replisome. The mechanism of activation of DnaA that enables it to bind and organize the origin DNA and leads to replication initiation remains unclear. In this study, we have developed double-labeled fluorescent DnaA probes to analyze conformational states of DnaA protein upon binding DNA, nucleotide, and Soj sporulation protein using Fluorescence Resonance Energy Transfer (FRET). Our studies demonstrate that DnaA protein undergoes large conformational changes upon binding to substrates and there are multiple distinct conformational states that enable it to initiate DNA replication. DnaA protein adopted a relaxed conformation by expanding ~15Å upon binding ATP and DNA to form the ATP·DnaA·DNA complex. Hydrolysis of bound ATP to ADP led to a contraction of DnaA within the complex. The relaxed conformation of DnaA is likely required for the formation of the multi-protein ATP·DnaA·DNA complex. In the initiation of sporulation, Soj binding to DnaA prevented relaxation of its conformation. Soj·ADP appeared to block the activation of DnaA, suggesting a mechanism for Soj·ADP in switching initiation of DNA replication to sporulation. Our studies demonstrate that multiple conformational states of DnaA protein regulate its binding to DNA in the initiation of DNA replication. Copyright © 2017 Elsevier B.V. All rights reserved.
Organization of 'nanocrystal molecules' using DNA

NASA Astrophysics Data System (ADS)

Alivisatos, A. Paul; Johnsson, Kai P.; Peng, Xiaogang; Wilson, Troy E.; Loweth, Colin J.; Bruchez, Marcel P.; Schultz, Peter G.

1996-08-01

Patterning matter on the nanometre scale is an important objective of current materials chemistry and physics. It is driven by both the need to further miniaturize electronic components and the fact that at the nanometre scale, materials properties are strongly size-dependent and thus can be tuned sensitively. In nanoscale crystals, quantum size effects and the large number of surface atoms influence the chemical, electronic, magnetic and optical behaviour. 'Top-down' (for example, lithographic) methods for nanoscale manipulation reach only to the upper end of the nanometre regime; but whereas 'bottom-up' wet chemical techniques allow for the preparation of mono-disperse, defect-free crystallites just 1-10 nm in size, ways to control the structure of nanocrystal assemblies are scarce. Here we describe a strategy for the synthesis of 'nanocrystal molecules', in which discrete numbers of gold nanocrystals are organized into spatially defined structures based on Watson-Crick base-pairing interactions. We attach single-stranded DNA oligonucleotides of defined length and sequence to individual nanocrystals, and these assemble into dimers and trimers on addition of a complementary single-stranded DNA template. We anticipate that this approach should allow the construction of more complex two- and three-dimensional assemblies.
Allele Identification for Transcriptome-Based Population Genomics in the Invasive Plant Centaurea solstitialis

PubMed Central

Dlugosch, Katrina M.; Lai, Zhao; Bonin, Aurélie; Hierro, José; Rieseberg, Loren H.

2013-01-01

Transcriptome sequences are becoming more broadly available for multiple individuals of the same species, providing opportunities to derive population genomic information from these datasets. Using the 454 Life Science Genome Sequencer FLX and FLX-Titanium next-generation platforms, we generated 11−430 Mbp of sequence for normalized cDNA for 40 wild genotypes of the invasive plant Centaurea solstitialis, yellow starthistle, from across its worldwide distribution. We examined the impact of sequencing effort on transcriptome recovery and overlap among individuals. To do this, we developed two novel publicly available software pipelines: SnoWhite for read cleaning before assembly, and AllelePipe for clustering of loci and allele identification in assembled datasets with or without a reference genome. AllelePipe is designed specifically for cases in which read depth information is not appropriate or available to assist with disentangling closely related paralogs from allelic variation, as in transcriptome or previously assembled libraries. We find that modest applications of sequencing effort recover most of the novel sequences present in the transcriptome of this species, including single-copy loci and a representative distribution of functional groups. In contrast, the coverage of variable sites, observation of heterozygosity, and overlap among different libraries are all highly dependent on sequencing effort. Nevertheless, the information gained from overlapping regions was informative regarding coarse population structure and variation across our small number of population samples, providing the first genetic evidence in support of hypothesized invasion scenarios. PMID:23390612
Regulating DNA Self-assembly by DNA-Surface Interactions.

PubMed

Liu, Longfei; Li, Yulin; Wang, Yong; Zheng, Jianwei; Mao, Chengde

2017-12-14

DNA self-assembly provides a powerful approach for preparation of nanostructures. It is often studied in bulk solution and involves only DNA-DNA interactions. When confined to surfaces, DNA-surface interactions become an additional, important factor in DNA self-assembly. However, the way in which DNA-surface interactions influence DNA self-assembly is not well studied. In this study, we showed that weak DNA-DNA interactions could be stabilized by DNA-surface interactions to allow large DNA nanostructures to form. In addition, the assembly can be conducted isothermally at room temperature in as little as 5 seconds. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
DNA nanotechnology based on i-motif structures.

PubMed

Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

2014-06-17

CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that occurs as four stretches of cytosine repeat sequences form C·CH(+) base pairs, and their stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to drive the bending of micrometer-sized cantilevers. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures.
In addition, because of the stability of the i-motif, this structure can serve as the stem of one-dimensional nanowires, and a four-strand stem can provide a new basis for three-dimensional DNA structures such as pillars. By sacrificing some accuracy in assembly, we used these properties to prepare the first fast-responding pure DNA supramolecular hydrogel. This hydrogel does not swell and cannot encapsulate small molecules. These unique properties could lead to new developments in smart materials based on DNA assembly and support important applications in fields such as tissue engineering. We expect that DNA nanotechnology will continue to develop rapidly. At a fundamental level, further studies should lead to greater understanding of the energy transformation and material transportation mechanisms at the nanometer scale. In terms of applications, we expect that many of these elegant molecular devices will soon be used in vivo. These further studies could demonstrate the power of DNA nanotechnology in biology, material science, chemistry, and physics.
extendFromReads

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Kelly P.

2013-10-03

This package assists in genome assembly. extendFromReads takes as input a set of Illumina (e.g., MiSeq) DNA sequencing reads, a query seed sequence and a direction to extend the seed. The algorithm collects all seed-matching reads (flipping reverse-orientation hits), trims off the seed and additional sequence in the other direction, sorts the remaining sequences alphabetically, and prints them aligned without gaps from the point of seed trimming. This produces a visual display distinguishing the flanks of multi-copy seeds. A companion script hitMates.pl collects the mates of seed-hitting reads, whose alignment reveals longer extensions from the seed. The collect/trim/sort strategy was made iterative and scaled up in the script denovo.pl, for de novo contig assembly. An index is pre-built using indexReads.pl that for each unique 21-mer found in all the reads, records its "fate" of extension (whether extendable, blocked by low coverage, or blocked by branching after a duplicated sequence) and other characteristics. Importantly, denovo.pl records all branchings that follow a branching contig endpoint, providing contig-extension information
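The collect/trim/sort strategy described above can be sketched in a few lines of Python. This is not the extendFromReads code itself (which, judging by the companion script names, is written in Perl); the function name `collect_flanks`, the exact-match search, and the toy reads are assumptions made for illustration, and only rightward extension is shown.

```python
# Sketch of the collect / trim / sort idea for rightward seed extension.
# Assumptions: exact seed matching, first occurrence only, toy reads.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def collect_flanks(reads: list[str], seed: str) -> list[str]:
    """Collect reads containing the seed (flipping reverse-orientation
    hits), trim the seed and everything to its left, and return the
    remaining right-hand flanks sorted alphabetically."""
    seed_rc = reverse_complement(seed)
    flanks = []
    for read in reads:
        if seed in read:
            oriented = read
        elif seed_rc in read:
            oriented = reverse_complement(read)  # flip reverse-strand hit
        else:
            continue  # read does not match the seed
        start = oriented.find(seed)
        flank = oriented[start + len(seed):]
        if flank:
            flanks.append(flank)
    return sorted(flanks)


reads = ["TTACGTAGGA", "ACGTAGGC", "GCCTACGT", "CCCC"]
flanks = collect_flanks(reads, "ACGTA")
for flank in flanks:
    print(flank)  # left-aligned at the point of seed trimming
```

Because every flank starts at the seed boundary, the sorted listing puts identical extensions next to each other, so a multi-copy seed shows up as distinct blocks of flanks. Leftward extension could be handled in the same way by passing the reverse complement of the seed.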
Sequence-Independent Cloning and Post-Translational Modification of Repetitive Protein Polymers through Sortase and Sfp-Mediated Enzymatic Ligation.

PubMed

Ott, Wolfgang; Nicolaus, Thomas; Gaub, Hermann E; Nash, Michael A

2016-04-11

Repetitive protein-based polymers are important for many applications in biotechnology and biomaterials development. Here we describe the sequential additive ligation of highly repetitive DNA sequences, their assembly into genes encoding protein-polymers with precisely tunable lengths and compositions, and their end-specific post-translational modification with organic dyes and fluorescent protein domains. Our new Golden Gate-based cloning approach relies on incorporation of only type IIS BsaI restriction enzyme recognition sites using PCR, which allowed us to install ybbR-peptide tags, Sortase c-tags, and cysteine residues onto either end of the repetitive gene polymers without leaving residual cloning scars. The assembled genes were expressed in Escherichia coli and purified using inverse transition cycling (ITC). Characterization by cloud point spectrophotometry, and denaturing polyacrylamide gel electrophoresis with fluorescence detection confirmed successful phosphopantetheinyl transferase (Sfp)-mediated post-translational N-terminal labeling of the protein-polymers with a coenzyme A-647 dye (CoA-647) and simultaneous sortase-mediated C-terminal labeling with a GFP domain containing an N-terminal GG-motif in a one-pot reaction. In a further demonstration, we installed an N-terminal cysteine residue into an elastin-like polypeptide (ELP) that was subsequently conjugated to a single chain poly(ethylene glycol)-maleimide (PEG-maleimide) synthetic polymer, noticeably shifting the ELP cloud point. The ability to straightforwardly assemble repetitive DNA sequences encoding ELPs of precisely tunable length and to post-translationally modify them specifically at the N- and C- termini provides a versatile platform for the design and production of multifunctional smart protein-polymeric materials.
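The scar-free property of Golden Gate cloning follows from the geometry of type IIS enzymes: BsaI recognizes GGTCTC but cuts outside that site, one nucleotide downstream on the top strand and five on the bottom strand, leaving a four-nucleotide overhang whose sequence the designer can choose. The Python sketch below locates forward-strand BsaI sites and reports those overhangs. The function name `bsai_overhangs` and the example sequence are hypothetical, and reverse-orientation sites (GAGACC) are deliberately ignored to keep the example short.

```python
# Sketch: report the 4-nt overhang generated downstream of each
# forward-strand BsaI site, GGTCTC(N1)/(N5). Illustrative only.

BSAI_SITE = "GGTCTC"


def bsai_overhangs(seq: str) -> list[tuple[int, str]]:
    """Return (site_index, overhang) for every forward-strand BsaI site
    that has a complete 4-nt overhang downstream."""
    seq = seq.upper()
    results = []
    i = seq.find(BSAI_SITE)
    while i != -1:
        # Skip the 6-nt recognition site plus the single spacer nucleotide.
        start = i + len(BSAI_SITE) + 1
        overhang = seq[start:start + 4]
        if len(overhang) == 4:
            results.append((i, overhang))
        i = seq.find(BSAI_SITE, i + 1)
    return results


# Hypothetical primer tail: site at index 2, spacer "A", overhang "GCTT".
hits = bsai_overhangs("AAGGTCTCAGCTTCCC")
print(hits)
```

Since the recognition site sits outside the overhang, digestion removes it from the insert, which is why matching overhangs on adjacent fragments can be ligated without leaving a residual site.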
Triplex-forming oligonucleotides: a third strand for DNA nanotechnology

PubMed Central

2018-01-01

DNA self-assembly has proved to be a useful bottom-up strategy for the construction of user-defined nanoscale objects, lattices and devices. The design of these structures has largely relied on exploiting simple base pairing rules and the formation of double-helical domains as secondary structural elements. However, other helical forms involving specific non-canonical base-base interactions have introduced a novel paradigm into the process of engineering with DNA. The most notable of these is a three-stranded complex generated by the binding of a third strand within the duplex major groove, generating a triple-helical (‘triplex’) structure. The sequence, structural and assembly requirements that differentiate triplexes from their duplex counterparts have allowed the design of nanostructures for both dynamic and/or structural purposes, as well as a means to target non-nucleic acid components to precise locations within a nanostructure scaffold. Here, we review the properties of triplexes that have proved useful in the engineering of DNA nanostructures, with an emphasis on applications that hitherto have not been possible by duplex formation alone. PMID:29228337
Redox polymer and probe DNA tethered to gold electrodes for enzyme-amplified amperometric detection of DNA hybridization.

PubMed

Kavanagh, Paul; Leech, Dónal

2006-04-15

The detection of nucleic acids based upon recognition surfaces formed by co-immobilization of a redox polymer mediator and DNA probe sequences on gold electrodes is described. The recognition surface consists of a redox polymer, [Os(2,2'-bipyridine)2(polyvinylimidazole)(10)Cl](+/2+), and a model single DNA strand cross-linked and tethered to a gold electrode via an anchoring self-assembled monolayer (SAM) of cysteamine. Hybridization between the immobilized probe DNA of the recognition surface and a biotin-conjugated target DNA sequence (designed from the ssrA gene of Listeria monocytogenes), followed by addition of an enzyme (glucose oxidase)-avidin conjugate, results in electrical contact between the enzyme and the mediating redox polymer. In the presence of glucose, the current generated due to the catalytic oxidation of glucose to gluconolactone is measured, and a response is obtained that is binding-dependent. The tethering of the probe DNA and redox polymer to the SAM improves the stability of the surface to assay conditions of rigorous washing and high salt concentration (1 M). These conditions eliminate nonspecific interaction of both the target DNA and the enzyme-avidin conjugate with the recognition surfaces. The sensor response increases linearly with increasing concentration of target DNA in the range of 1 x 10(-9) to 2 x 10(-6) M. The detection limit is approximately 1.4 fmol (corresponding to 0.2 nM of target DNA). Regeneration of the recognition surface is possible by treatment with 0.25 M NaOH solution. After rehybridization of the regenerated surface with the target DNA sequence, >95% of the current is recovered, indicating that the redox polymer and probe DNA are strongly bound to the surface. These results demonstrate the utility of the proposed approach.
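The two forms of the detection limit quoted above (an amount, 1.4 fmol, and a concentration, 0.2 nM) are linked by the sample volume, which the abstract does not state. The short Python calculation below works out the volume those two figures imply, together with the width of the reported linear range. The function name `implied_volume_litres` is hypothetical, and the result is an inference from the quoted numbers, not a value reported by the authors.

```python
# Sketch: relate the quoted detection limit in moles to the quoted
# concentration. The implied volume is an inference, not a reported value.

def implied_volume_litres(amount_mol: float, conc_molar: float) -> float:
    """Volume (L) in which amount_mol gives a concentration conc_molar."""
    return amount_mol / conc_molar


amount = 1.4e-15        # 1.4 fmol, from the abstract
concentration = 0.2e-9  # 0.2 nM, from the abstract

volume_l = implied_volume_litres(amount, concentration)
volume_ul = volume_l * 1e6  # litres to microlitres

# Reported linear range: 1 x 10^-9 M to 2 x 10^-6 M
fold_range = 2e-6 / 1e-9

print(f"implied sample volume: {volume_ul:.1f} uL")
print(f"linear range spans about {fold_range:.0f}-fold in concentration")
```

Under these figures the implied sample volume is on the order of a few microlitres, which is consistent with a droplet applied to an electrode surface, although the abstract itself gives no volume.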
Ultrasensitive and selective signal-on electrochemical DNA detection via exonuclease III catalysis and hybridization chain reaction amplification.

PubMed

Ren, Wang; Gao, Zhong Feng; Li, Nian Bing; Luo, Hong Qun

2015-01-15

This work reported a novel, ultrasensitive, and selective platform for electrochemical detection of DNA, employing an integration of exonuclease III (Exo-III) assisted target recycling and hybridization chain reaction (HCR) for the dual signal amplification strategy. The hairpin capture probe DNA (C-DNA) with an Exo-III 3' overhang end was self-assembled on a gold electrode. In the presence of target DNA (T-DNA), C-DNA hybridized with the T-DNA to form a duplex region, exposing its 5' complementary sequence (initiator). Exo-III was applied to selectively digest the duplex region from its 3'-hydroxyl termini until the duplex was fully consumed, leaving the remnant initiator. The intact T-DNA spontaneously dissociated from the structure and then initiated the next hybridization process as a result of catalysis of the Exo-III. The HCR event was triggered by the initiator and two hairpin helper signal probes labeled with methylene blue, facilitating the polymerization of oligonucleotides into a long nicked dsDNA molecule. The numerous exposed remnant initiators can trigger more HCR events. Because of the integration of dual signal amplification and the specific HCR process reaction, the resultant sensor showed a high sensitivity for the detection of the target DNA in a linear range from 1.0 fM to 1.0 nM, and a detection limit as low as 0.2 fM. The proposed dual signal amplification strategy provides a powerful tool for detecting different sequences of target DNA by changing the sequence of capture probe and signal probes, holding a great potential for early diagnosis in gene-related diseases. Copyright © 2014 Elsevier B.V. All rights reserved.
Role of the Escherichia coli grpE heat shock protein in the initiation of bacteriophage lambda DNA replication.

PubMed

Osipiuk, J; Zylicz, M

1991-01-01

Initiation of replication of lambda DNA requires assembly of the proper nucleoprotein complex consisting of the lambda origin of replication-lambda O-lambda P-dnaB proteins. The dnaJ, dnaK and grpE heat shock proteins destabilize the lambda P-dnaB interaction in this complex, permitting dnaB helicase to unwind lambda DNA near the ori lambda sequence. The first step of this disassembling reaction is the binding of dnaK protein to lambda P protein. In this report we examined the influence of dnaJ and grpE proteins on the stability of the lambda P-dnaK complex. Our results show that grpE alone dissociates this complex, but both grpE and dnaJ together do not. These results suggest that, in the presence of grpE protein, dnaK protein has a higher affinity for lambda P protein complexed with dnaJ protein than in the situation where grpE protein is not used.
