Sample records for highly efficient sequence-specific

  1. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

    PubMed Central

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore

    2017-01-01

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from a minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659

  2. A novel bioinformatics method for efficient knowledge discovery by BLSOM from big genomic sequence data.

    PubMed

    Bai, Yu; Iwasaki, Yuki; Kanaya, Shigehiko; Zhao, Yue; Ikemura, Toshimichi

    2014-01-01

    With the remarkable increase in genomic sequence data from a wide range of species, novel tools are needed for comprehensive analyses of big sequence data. Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data such as oligonucleotide composition on one map. By modifying the conventional SOM, we have previously developed Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, solely depending on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between the closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a "genome signature," and the regions specifically enriched in transcription-factor-binding sequences. Because the classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data).
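
    The input to a composition-based map such as the one described above is a vector of oligonucleotide frequencies per sequence window. The following is a minimal Python sketch of that feature extraction only, not of BLSOM itself. The function names and the choice to skip k-mers containing ambiguous bases are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=5):
    """Return relative frequencies of all 4**k k-mers in one sequence window."""
    seq = seq.upper()
    kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumption: k-mers containing N or other ambiguity codes are skipped.
    counts = Counter(km for km in kmers if all(base in "ACGT" for base in km))
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def window_vectors(genome, window=100_000, k=5):
    """Cut a sequence into non-overlapping windows; one frequency vector each."""
    return [
        kmer_frequencies(genome[start:start + window], k)
        for start in range(0, len(genome) - window + 1, window)
    ]
```

    With the defaults (k=5, window=100_000) each window yields a 1024-dimensional vector, which could then be passed to any SOM implementation.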

  3. BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing

    PubMed Central

    Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph

    2011-01-01

    Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
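
    As a rough illustration of how locus-specific methylation levels are derived from bisulfite reads, the Python sketch below counts C (methylated) versus T (unmethylated) at each CpG of a reference amplicon. It is not the BiQ Analyzer HT implementation. It assumes reads are already aligned to the forward strand without gaps, and the function name is hypothetical.

```python
def cpg_methylation(reference, reads):
    """Return {CpG position: methylated fraction} for gap-free aligned reads.

    After bisulfite conversion an unmethylated C reads as T, while a
    methylated C stays C. Other bases at a CpG position are ignored.
    """
    ref = reference.upper()
    cpg_positions = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    levels = {}
    for pos in cpg_positions:
        methylated = unmethylated = 0
        for read in reads:
            if len(read) > pos:
                base = read[pos].upper()
                if base == "C":
                    methylated += 1
                elif base == "T":
                    unmethylated += 1
        covered = methylated + unmethylated
        # None marks a CpG that no read covers informatively.
        levels[pos] = methylated / covered if covered else None
    return levels
```

    Because each estimate is a simple proportion, its precision grows with the number of informative reads covering the position, which is why sequencing depth matters for quantitative accuracy.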

  4. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

    PubMed

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G; Rigoutsos, Isidore; Kirino, Yohei

    2017-05-19

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from a minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

    PubMed

    Wickersheim, Michelle L; Blumenstiel, Justin P

    2013-11-01

    A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complementary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.

  6. Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH.

    PubMed

    Volk, Jochen; Herrmann, Torsten; Wüthrich, Kurt

    2008-07-01

    MATCH (Memetic Algorithm and Combinatorial Optimization Heuristics) is a new memetic algorithm for automated sequence-specific polypeptide backbone NMR assignment of proteins. MATCH employs local optimization for tracing partial sequence-specific assignments within a global, population-based search environment, where the simultaneous application of local and global optimization heuristics guarantees high efficiency and robustness. MATCH thus makes combined use of the two predominant concepts in use for automated NMR assignment of proteins. Dynamic transition and inherent mutation are new techniques that enable automatic adaptation to variable quality of the experimental input data. The concept of dynamic transition is incorporated in all major building blocks of the algorithm, where it enables switching between local and global optimization heuristics at any time during the assignment process. Inherent mutation restricts the intrinsically required randomness of the evolutionary algorithm to those regions of the conformation space that are compatible with the experimental input data. Using intact and artificially deteriorated APSY-NMR input data of proteins, MATCH performed sequence-specific resonance assignment with high efficiency and robustness.

  7. VirusDetect: An automated pipeline for efficient virus discovery using deep sequencing of small RNAs

    USDA-ARS's Scientific Manuscript database

    Accurate detection of viruses in plants and animals is critical for agriculture production and human health. Deep sequencing and assembly of virus-derived siRNAs has proven to be a highly efficient approach for virus discovery. However, to date no computational tools specifically designed for both k...

  8. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  9. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  10. An Efficient Approach for the Development of Locus Specific Primers in Bread Wheat (Triticum aestivum L.) and Its Application to Re-Sequencing of Genes Involved in Frost Tolerance

    PubMed Central

    Babben, Steve; Perovic, Dragan; Koch, Michael; Ordon, Frank

    2015-01-01

    Recent declines in costs have accelerated sequencing of many species with large genomes, including hexaploid wheat (Triticum aestivum L.). Although the draft sequence of bread wheat is known, it is still one of the major challenges to develop locus specific primers suitable to be used in marker assisted selection procedures, due to the high homology of the three genomes. In this study we describe an efficient approach for the development of locus specific primers comprising four steps, i.e. (i) identification of genomic and coding sequences (CDS) of candidate genes, (ii) intron- and exon-structure reconstruction, (iii) identification of wheat A, B and D sub-genome sequences and primer development based on sequence differences between the three sub-genomes, and (iv) testing of primers for functionality, correct size and localisation. This approach was applied to single, low and high copy genes involved in frost tolerance in wheat. In summary, for 27 of these genes for which sequences were derived from Triticum aestivum, Triticum monococcum and Hordeum vulgare, a set of 119 primer pairs was developed and after testing on Nulli-tetrasomic (NT) lines, a set of 65 primer pairs (54.6%), corresponding to 19 candidate genes, turned out to be specific. Out of these a set of 35 fragments was selected for validation via Sanger's amplicon re-sequencing. All fragments, with the exception of one, could be assigned to the original reference sequence. The approach presented here showed a much higher specificity in primer development in comparison to techniques used so far in bread wheat and can be applied to other polyploid species with a known draft sequence. PMID:26565976

  11. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple EM for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.

  12. Non-canonical mechanism for translational control in bacteria: synthesis of ribosomal protein S1

    PubMed Central

    Boni, Irina V.; Artamonova, Valentina S.; Tzareva, Nina V.; Dreyfus, Marc

    2001-01-01

    The translation initiation region (TIR) of the rpsA mRNA encoding ribosomal protein S1 is one of the most efficient in Escherichia coli despite the absence of a canonical Shine–Dalgarno element. Its high efficiency is under strong negative autogenous control, a puzzling phenomenon as S1 has no strict sequence specificity. To define sequence and structural elements responsible for translational efficiency and autoregulation of the rpsA mRNA, a series of rpsA′–′lacZ chromosomal fusions bearing various mutations in the rpsA TIR was created and tested for β-galactosidase activity in the absence and presence of excess S1. These in vivo results, as well as data obtained by in vitro techniques and phylogenetic comparison, allow us to propose a model for the structural and functional organization of the rpsA TIR specific for proteobacteria related to E. coli. According to the model, the high efficiency of translation initiation is provided by a specific fold of the rpsA leader forming a non-contiguous ribosome entry site, which is destroyed upon binding of free S1 when it acts as an autogenous repressor. PMID:11483525

  13. Nitrogen and phosphorus treatment of marine wastewater by a laboratory-scale sequencing batch reactor with eco-friendly marine high-efficiency sediment.

    PubMed

    Cho, Seonghyeon; Kim, Jinsoo; Kim, Sungchul; Lee, Sang-Seob

    2017-06-22

    We screened and identified an NH3-N-removing bacterial strain, Bacillus sp. KGN1, and a [Formula: see text] removing strain, Vibrio sp. KGP1, from 960 indigenous marine isolates from seawater and marine sediment from Tongyeong, South Korea. We developed eco-friendly high-efficiency marine sludge (eco-HEMS), and inoculated these marine bacterial strains into the marine sediment. A laboratory-scale sequencing batch reactor (SBR) system using the eco-HEMS for marine wastewater from land-based fish farms improved the treatment performance as indicated by 88.2% removal efficiency (RE) of total nitrogen (initial: 5.6 mg/L) and 90.6% RE of total phosphorus (initial: 1.2 mg/L) under the optimal operation conditions (food and microorganism (F/M) ratio, 0.35 g SCODCr/g mixed liquor volatile suspended solids (MLVSS)·d; dissolved oxygen (DO) 1.0 ± 0.2 mg/L; hydraulic retention time (HRT), 6.6 h; solids retention time (SRT), 12 d). The following kinetic parameters were obtained: cell yield (Y), 0.29 g MLVSS/g SCODCr; specific growth rate (µ), 0.06 d⁻¹; specific nitrification rate (SNR), 0.49 mg NH3-N/g MLVSS·h; specific denitrification rate (SDNR), 0.005 mg [Formula: see text]/g MLVSS·h; specific phosphorus uptake rate (SPUR), 0.12 mg [Formula: see text]/g MLVSS·h. The nitrogen- and phosphorus-removing bacterial strains had a distribution rate of 18.4% in the microbial community of eco-HEMS under the optimal operation conditions. Therefore, eco-HEMS effectively removed nitrogen and phosphorus from highly saline marine wastewater from land-based fish farms, with improved SNR, SDNR, and SPUR values in more diverse microbial communities.
DO: dissolved oxygen; Eco-HEMS: eco-friendly high efficiency marine sludge; F/M: food and microorganism ratio; HRT: hydraulic retention time; ML(V)SS: mixed liquor (volatile) suspended solids; NCBI: National Center for Biotechnology Information; ND: not determined; qPCR: quantitative real-time polymerase chain reaction; RE: removal efficiency; SBR: sequencing batch reactor; SD: standard deviation; SDNR: specific denitrification rate; SNR: specific nitrification rate; SPUR: specific phosphate uptake rate; SRT: solids retention time; T-N: total nitrogen; T-P: total phosphorus; (V)SS: (volatile) suspended solids; w.w.: wet weight.

  14. Strategies to Improve Efficiency and Specificity of Degenerate Primers in PCR.

    PubMed

    Campos, Maria Jorge; Quesada, Alberto

    2017-01-01

    PCR with degenerate primers can be used to identify the coding sequence of an unknown protein or to detect a genetic variant within a gene family. These primers, which are complex mixtures of slightly different oligonucleotide sequences, can be optimized to increase the efficiency and/or specificity of PCR in the amplification of a sequence of interest by the introduction of mismatches with the target sequence and balancing their position toward the primers' 5'- or 3'-ends. In this work, we explain in detail examples of rational design of primers in two different applications, including the use of specific determinants at the 3'-end, to: (1) improve PCR efficiency with coding sequences for members of a protein family by full degeneracy at a core box of conserved genetic information, with reduced degeneracy at the 5'-end, and (2) optimize specificity of allelic discrimination of closely related orthologs by 5'-end degenerate primers.
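
    A degenerate primer is shorthand for a mixture of concrete oligonucleotides, and its degeneracy (the number of distinct sequences in the mixture) is the product of the number of bases allowed at each position. The short Python sketch below computes that value and enumerates the mixture using the standard IUPAC nucleotide codes. The function names are illustrative and not taken from the article.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every concrete sequence in the degenerate primer mixture."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer.upper()))]
```

    For example, `degeneracy("ACGRYN")` is 2 × 2 × 4 = 16, so each individual sequence makes up only 1/16 of the primer pool. This is one reason why limiting degeneracy to the positions where it is needed can help amplification efficiency.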

  15. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  16. SPHINX--an algorithm for taxonomic binning of metagenomic sequences.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S

    2011-01-01

    Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms are significantly higher. However, being alignment-based, the latter class of algorithms requires an enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.

  17. Novel Method for High-Throughput Full-Length IGHV-D-J Sequencing of the Immune Repertoire from Bulk B-Cells with Single-Cell Resolution.

    PubMed

    Vergani, Stefano; Korsunsky, Ilya; Mazzarello, Andrea Nicola; Ferrer, Gerardo; Chiorazzi, Nicholas; Bagnara, Davide

    2017-01-01

    Efficient and accurate high-throughput DNA sequencing of the adaptive immune receptor repertoire (AIRR) is necessary to study immune diversity in healthy subjects and disease-related conditions. The high complexity and diversity of the AIRR, coupled with the limited amount of starting material, which can compromise identification of the full biological diversity, make such sequencing particularly challenging. AIRR sequencing protocols often fail to fully capture the sampled AIRR diversity, especially for samples containing restricted numbers of B lymphocytes. Here, we describe a library preparation method for immunoglobulin sequencing that results in an exhaustive full-length repertoire where virtually every sampled B-cell is sequenced. This maximizes the likelihood of identifying and quantifying the entire IGHV-D-J repertoire of a sample, including the detection of rearrangements present in only one cell in the starting population. The methodology establishes the importance of circumventing genetic material dilution in the preamplification phases and incorporates the use of certain described concepts: (1) balancing the starting material amount and depth of sequencing, (2) avoiding IGHV gene-specific amplification, and (3) using Unique Molecular Identifiers. Together, this methodology is highly efficient, in particular for detecting rare rearrangements in the sampled population and when only a limited amount of starting material is available.
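
    To illustrate the general idea behind concept (3), the Python sketch below groups reads by their Unique Molecular Identifier (UMI) and keeps the most frequent sequence per UMI, so that PCR duplicates of one starting molecule are counted once. This is a generic simplification, not the authors' pipeline. It assumes UMIs have already been extracted from the reads and ignores sequencing errors within the UMI itself.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse (umi, sequence) pairs to one majority sequence per UMI."""
    groups = defaultdict(Counter)
    for umi, seq in reads:
        groups[umi][seq] += 1
    # The most frequent sequence is taken as the representative of the molecule.
    return {umi: counts.most_common(1)[0][0] for umi, counts in groups.items()}
```

    The number of keys in the returned dictionary estimates how many original molecules were sampled, independent of how unevenly they were amplified.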

  18. CRISPR-Cas9-Edited Site Sequencing (CRES-Seq): An Efficient and High-Throughput Method for the Selection of CRISPR-Cas9-Edited Clones.

    PubMed

    Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel

    2018-01-16

    The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plant cells). Here we describe a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping of up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. Copyright © 2018 John Wiley & Sons, Inc.

  19. Atropos: specific, sensitive, and speedy trimming of sequencing reads.

    PubMed

    Didion, John P; Martin, Marcel; Collins, Francis S

    2017-01-01

    A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos makes it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos.
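
    For readers unfamiliar with adapter trimming, the Python sketch below shows the basic operation in its simplest form: removing a 3' adapter that appears either in full or as a partial prefix at the end of the read. It uses exact matching only and is not Atropos's algorithm, which also tolerates sequencing errors. The function name and the `min_overlap` parameter are assumptions for illustration.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter (exact match) from a read, if present."""
    # Case 1: the whole adapter occurs inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only the start of the adapter fits before the read ends.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter found: return the read unchanged.
    return read
```

    Lowering `min_overlap` removes more partial adapters but also trims more genuine bases by chance. That trade-off between sensitivity and specificity is the one the abstract refers to.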

  20. Atropos: specific, sensitive, and speedy trimming of sequencing reads

    PubMed Central

    Didion, John P.; Martin, Marcel; Collins, Francis S.

    2017-01-01

    A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos makes it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos. PMID:28875074

  1. Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

    PubMed Central

    Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

    2018-01-01

    Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139

  2. CRISPR-FOCUS: A web server for designing focused CRISPR screening experiments.

    PubMed

    Cao, Qingyi; Ma, Jian; Chen, Chen-Hao; Xu, Han; Chen, Zhi; Li, Wei; Liu, X Shirley

    2017-01-01

    The recently developed CRISPR screen technology, based on the CRISPR/Cas9 genome editing system, enables genome-wide interrogation of gene functions in an efficient and cost-effective manner. Although many computational algorithms and web servers have been developed to design single-guide RNAs (sgRNAs) with high specificity and efficiency, algorithms specifically designed for conducting CRISPR screens are still lacking. Here we present CRISPR-FOCUS, a web-based platform to search and prioritize sgRNAs for CRISPR screen experiments. With official gene symbols or RefSeq IDs as the only mandatory input, CRISPR-FOCUS filters and prioritizes sgRNAs based on multiple criteria, including efficiency, specificity, sequence conservation, isoform structure, as well as genomic variations including Single Nucleotide Polymorphisms and cancer somatic mutations. CRISPR-FOCUS also provides pre-defined positive and negative control sgRNAs, as well as other necessary sequences in the construct (e.g., U6 promoters to drive sgRNA transcription and RNA scaffolds of the CRISPR/Cas9). These features allow users to synthesize oligonucleotides directly based on the output of CRISPR-FOCUS. Overall, CRISPR-FOCUS provides a rational and high-throughput approach for sgRNA library design that enables users to efficiently conduct a focused screen experiment targeting up to thousands of genes. (CRISPR-FOCUS is freely available at http://cistrome.org/crispr-focus/).

  3. Site directed recombination

    DOEpatents

    Jurka, Jerzy W.

    1997-01-01

    Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.

  4. AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.

    PubMed

    Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R

    2015-04-01

    Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
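
    The grouping step described above (hash identical reads, assign each to an amplicon by its flanking sequences, drop low-abundance reads) can be sketched roughly as follows. This is a minimal illustration, not the AmpliVar implementation: the function name, the flank table, the toy reads, and the exact-match rule on read ends are all assumptions made for the example.

```python
from collections import Counter

def group_reads_by_amplicon(reads, flanks, min_count=2):
    """Collapse identical reads, assign each to an amplicon by its flanking
    sequences, and drop reads seen fewer than min_count times."""
    counts = Counter(reads)  # key = read sequence, value = number of copies
    groups = {name: {} for name in flanks}
    for seq, n in counts.items():
        if n < min_count:
            continue  # below the selectable abundance threshold
        for name, (left, right) in flanks.items():
            # Assumption: a read belongs to an amplicon if it starts and
            # ends with that amplicon's flanking sequences exactly.
            if seq.startswith(left) and seq.endswith(right):
                groups[name][seq] = n
                break
    return groups

# Hypothetical flank table and reads, for illustration only.
flanks = {"ampA": ("ACGT", "TTGA"), "ampB": ("GGCC", "AATT")}
reads = (["ACGTAAATTGA"] * 5 + ["ACGTAACTTGA"] * 3
         + ["GGCCTTTAATT"] * 4 + ["ACGTAGATTGA"])
groups = group_reads_by_amplicon(reads, flanks, min_count=2)
```

    Each surviving group holds a handful of distinct sequences with their counts, so a sensitive (and slower) aligner would only need to run once per distinct sequence rather than once per read.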

  5. Comprehensive Interrogation of Natural TALE DNA Binding Modules and Transcriptional Repressor Domains

    PubMed Central

    Cong, Le; Zhou, Ruhong; Kuo, Yu-chi; Cunniff, Margaret; Zhang, Feng

    2012-01-01

    Transcription activator-like effectors (TALE) are sequence-specific DNA binding proteins that harbor modular, repetitive DNA binding domains. TALEs have enabled the creation of customizable designer transcriptional factors and sequence-specific nucleases for genome engineering. Here we report two improvements of the TALE toolbox for achieving efficient activation and repression of endogenous gene expression in mammalian cells. We show that the naturally occurring repeat variable diresidue (RVD) Asn-His (NH) has high biological activity and specificity for guanine, a highly prevalent base in mammalian genomes. We also report an effective TALE transcriptional repressor architecture for targeted inhibition of transcription in mammalian cells. These findings will improve the precision and effectiveness of genome engineering that can be achieved using TALEs. PMID:22828628

  6. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    PubMed

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  7. New encoded single-indicator sequences based on physico-chemical parameters for efficient exon identification.

    PubMed

    Meher, J K; Meher, P K; Dash, G N; Raval, M K

    2012-01-01

    The first step in the gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce high peaks at exon locations and effectively suppress false exons (intron regions having greater peaks than exon regions), resulting in a high discriminating factor, sensitivity and specificity.
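
    The period-3 analysis can be illustrated with a small sketch: map each base to one number (a single-indicator sequence) and measure spectral power at frequency 1/3. The weights below are arbitrary placeholders, not the hydration-energy or dipole-moment values used in the paper, and the function names and toy sequences are likewise assumptions for the example.

```python
import cmath

# Placeholder per-base weights (assumption); the paper derives its values
# from physico-chemical properties, which are not reproduced here.
WEIGHTS = {"A": 0.1, "C": 0.3, "G": 0.7, "T": 0.9}

def indicator_sequence(dna, weights=WEIGHTS):
    """Convert a DNA string into a single numerical sequence."""
    return [weights[base] for base in dna]

def power_at_period(x, period=3):
    """Normalised squared DFT magnitude of x at frequency 1/period,
    computed after removing the mean."""
    n = len(x)
    mean = sum(x) / n
    s = sum((v - mean) * cmath.exp(-2j * cmath.pi * i / period)
            for i, v in enumerate(x))
    return abs(s) ** 2 / n

# Toy inputs: one sequence with an exact 3-base repeat, one with a 4-base repeat.
p_period3 = power_at_period(indicator_sequence("ATG" * 20))
p_period4 = power_at_period(indicator_sequence("ACGT" * 15))
```

    In practice the power would be computed in a sliding window along the genome, so that peaks mark candidate exon locations.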

  8. [Screening specific recognition motif of RNA-binding proteins by SELEX in combination with next-generation sequencing technique].

    PubMed

    Zhang, Lu; Xu, Jinhao; Ma, Jinbiao

    2016-07-25

    RNA-binding proteins exert important biological functions by specifically recognizing RNA motifs. SELEX (systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motifs with high affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study protein-RNA interactions in vitro. A pool of RNAs with 20 bp random sequences was transcribed from a T7 promoter, and the target protein was inserted into a plasmid containing an SBP-tag, which can be captured by streptavidin beads. Through only one cycle, the specific RNA motif can be obtained, which dramatically improved the selection efficiency. Using this method, we found that the human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this method provides a rapid and effective way to study the RNA binding specificity of proteins.

  9. Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

    PubMed Central

    Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

    2005-01-01

    Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
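
    The idea of identifying sequences by combinations of shared sub-sequences can be sketched with a simple greedy search. This is an illustrative heuristic, not the authors' method, and it does not guarantee a minimal set; the function names and the four toy sequences are assumptions for the example.

```python
from itertools import combinations

def kmers(seq, k):
    """All distinct sub-sequences of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def greedy_distinguishing_kmers(seqs, k):
    """Greedily pick k-mers whose presence/absence patterns separate
    every pair of sequences (a heuristic, not guaranteed minimal)."""
    names = list(seqs)
    present = {name: kmers(seqs[name], k) for name in names}
    candidates = sorted(set().union(*present.values()))
    unresolved = set(combinations(names, 2))
    chosen = []
    while unresolved:
        def split_count(kmer):
            # Number of still-confused pairs this k-mer would separate.
            return sum((kmer in present[a]) != (kmer in present[b])
                       for a, b in unresolved)
        best = max(candidates, key=split_count)
        if split_count(best) == 0:
            raise ValueError("some sequences share identical k-mer sets")
        chosen.append(best)
        unresolved = {(a, b) for a, b in unresolved
                      if (best in present[a]) == (best in present[b])}
    return chosen

def signature(seq, probes):
    """Presence/absence pattern of the probes in seq."""
    return tuple(p in seq for p in probes)

# Toy sequences (hypothetical), chosen so that shared k-mers bisect the set.
seqs = {"s1": "AAACCC", "s2": "AAAGGG", "s3": "TTTCCC", "s4": "TTTGGG"}
probes = greedy_distinguishing_kmers(seqs, 3)
sigs = {name: signature(s, probes) for name, s in seqs.items()}
```

    For these four toy sequences two shared probes suffice, whereas one specific probe per target would need four; this mirrors the binary-key argument made in the abstract.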

  10. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    PubMed Central

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  11. The application of the high throughput sequencing technology in the transposable elements.

    PubMed

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  12. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

    PubMed

    Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

    2018-03-01

    Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

  13. [Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].

    PubMed

    Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong

    2015-11-01

    The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations into its coding region; to knock in specific sequences (e.g. FLAG tag DNA sequence) at a targeted genomic locus via homology-directed repair; and to induce large genomic deletions through dual-guide multiplexing. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.

  14. Single-cell isolation by a modular single-cell pipette for RNA-sequencing.

    PubMed

    Zhang, Kai; Gao, Min; Chong, Zechen; Li, Ying; Han, Xin; Chen, Rui; Qin, Lidong

    2016-11-29

    Single-cell transcriptome sequencing requires a convenient and reliable method to rapidly isolate a live cell into a specific container such as a PCR tube. Here, we report a modular single-cell pipette (mSCP) consisting of three modular components, a SCP-Tip, an air-displacement pipette (ADP), and ADP-Tips, that can be easily assembled, disassembled, and reassembled. By assembling the SCP-Tip containing a hydrodynamic trap, the mSCP can isolate single cells from 5-10 cells per μL of cell suspension. The mSCP is compatible with microscopic identification of captured single cells to finally achieve 100% single-cell isolation efficiency. The isolated live single cells are in submicroliter volumes and well suited for single-cell PCR analysis and RNA-sequencing. The mSCP possesses merits of convenience, speed, and high efficiency, making it a powerful tool to isolate single cells for transcriptome analysis.

  15. Efficient and Accurate Algorithm for Cleaved Fragments Prediction (CFPA) in Protein Sequences Dataset Based on Consensus and Its Variants: A Novel Degradomics Prediction Application.

    PubMed

    El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Hajj, Hazem; Kobeissy, Firas H

    2017-01-01

    Degradomics is a novel discipline that involves determination of the proteases/substrate fragmentation profile, called the substrate degradome, and has been recently applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of the pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially if we are targeting a number of proteins at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently with validation against experimental data. This method aims at predicting the consensus sequence occurrences and their variants in a large set of experimentally detected protein sequences based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential cleaved fragments by a specific protease. This space- and time-efficient algorithm is flexible enough to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.

  16. An efficient approach to BAC based assembly of complex genomes.

    PubMed

    Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David

    2016-01-01

    There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data, which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, with variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.

  17. Self-Cloning CRISPR.

    PubMed

    Arbab, Mandana; Sherwood, Richard I

    2016-08-17

    CRISPR/Cas9-gene editing has emerged as a revolutionary technology to easily modify specific genomic loci by designing complementary sgRNA sequences and introducing these into cells along with Cas9. Self-cloning CRISPR/Cas9 (scCRISPR) uses a self-cleaving palindromic sgRNA plasmid (sgPal) that recombines with short PCR-amplified site-specific sgRNA sequences within the target cell by homologous recombination to circumvent the process of sgRNA plasmid construction. Through this mechanism, scCRISPR enables gene editing within 2 hr once sgRNA oligos are available, with high efficiency equivalent to conventional sgRNA targeting: >90% gene knockout in both mouse and human embryonic stem cells and cancer cell lines. Furthermore, using PCR-based addition of short homology arms, we achieve efficient site-specific knock-in of transgenes such as GFP without traditional plasmid cloning or genome-integrated selection cassette (2% to 4% knock-in rate). The methods in this paper describe the most rapid and efficient means of CRISPR gene editing. © 2016 by John Wiley & Sons, Inc.

  18. Sequence features associated with the cleavage efficiency of CRISPR/Cas9 system.

    PubMed

    Liu, Xiaoxi; Homma, Ayaka; Sayadi, Jamasb; Yang, Shu; Ohashi, Jun; Takumi, Toru

    2016-01-27

    The CRISPR-Cas9 system has recently emerged as a versatile tool for biological and medical research. In this system, a single guide RNA (sgRNA) directs the endonuclease Cas9 to a targeted DNA sequence for site-specific manipulation. In addition to this targeting function, the sgRNA has also been shown to play a role in activating the endonuclease activity of Cas9. This dual function of the sgRNA likely underlies observations that different sgRNAs have varying on-target activities. Currently, our understanding of the relationship between sequence features of sgRNAs and their on-target cleavage efficiencies remains limited, largely due to difficulties in assessing the cleavage capacity of a large number of sgRNAs. In this study, we evaluated the cleavage activities of 218 sgRNAs using in vitro Surveyor assays. We found that nucleotides at both PAM-distal and PAM-proximal regions of the sgRNA are significantly correlated with on-target efficiency. Furthermore, we also demonstrated that the genomic context of the targeted DNA, the GC percentage, and the secondary structure of sgRNA are critical factors contributing to cleavage efficiency. In summary, our study reveals important parameters for the design of sgRNAs with high on-target efficiencies, especially in the context of high throughput applications.

  19. Polymerase ribozyme efficiency increased by G/T-rich DNA oligonucleotides

    PubMed Central

    Yao, Chengguo; Müller, Ulrich F.

    2011-01-01

    The RNA world hypothesis states that the early evolution of life went through a stage where RNA served as genome and as catalyst. The replication of RNA world organisms would have been facilitated by ribozymes that catalyze RNA polymerization. To recapitulate an RNA world in the laboratory, a series of RNA polymerase ribozymes was developed previously. However, these ribozymes have a polymerization efficiency that is too low for self-replication, and the most efficient ribozymes prefer one specific template sequence. The limiting factor for polymerization efficiency is the weak sequence-independent binding to its primer/template substrate. Most of the known polymerase ribozymes bind an RNA heptanucleotide to form the P2 duplex on the ribozyme. By modifying this heptanucleotide, we were able to significantly increase polymerization efficiency. Truncations at the 3′-terminus of this heptanucleotide increased full-length primer extension by 10-fold, on a specific template sequence. In contrast, polymerization on several different template sequences was improved dramatically by replacing the RNA heptanucleotide with DNA oligomers containing randomized sequences of 15 nt. The presence of G and T in the random sequences was sufficient for this effect, with an optimal composition of 60% G and 40% T. Our results indicate that these DNA sequences function by establishing many weak and nonspecific base-pairing interactions to the single-stranded portion of the template. Such low-specificity interactions could have had important functions in an RNA world. PMID:21622900

  20. Development of Genome Engineering Tools from Plant-Specific PPR Proteins Using Animal Cultured Cells.

    PubMed

    Kobayashi, Takehito; Yagi, Yusuke; Nakamura, Takahiro

    2016-01-01

    The pentatricopeptide repeat (PPR) motif is a sequence-specific RNA/DNA-binding module. Elucidation of the RNA/DNA recognition mechanism has enabled engineering of PPR motifs as new RNA/DNA manipulation tools in living cells, including for genome editing. However, the biochemical characteristics of PPR proteins remain unknown, mostly due to the instability and/or unfolding propensities of PPR proteins in heterologous expression systems such as bacteria and yeast. To overcome this issue, we constructed reporter systems using animal cultured cells. The cell-based system has highly attractive features for PPR engineering: robust eukaryotic gene expression; availability of various vectors, reagents, and antibodies; highly efficient DNA delivery ratio (>80 %); and rapid, high-throughput data production. In this chapter, we introduce an example of such reporter systems: a PPR-based sequence-specific translational activation system. The cell-based reporter system can be applied to characterize plant genes of interest and to PPR engineering.

  1. Sequence and structure-specific elements of HERG mRNA determine channel synthesis and trafficking efficiency

    PubMed Central

    Sroubek, Jakub; Krishnan, Yamini; McDonald, Thomas V.

    2013-01-01

    Human ether-á-gogo-related gene (HERG) encodes a potassium channel that is highly susceptible to deleterious mutations resulting in susceptibility to fatal cardiac arrhythmias. Most mutations adversely affect HERG channel assembly and trafficking. Why the channel is so vulnerable to missense mutations is not well understood. Since nothing is known of how mRNA structural elements factor in channel processing, we synthesized a codon-modified HERG cDNA (HERG-CM) where the codons were synonymously changed to reduce GC content, secondary structure, and rare codon usage. HERG-CM produced typical IKr-like currents; however, channel synthesis and processing were markedly different. Translation efficiency was reduced for HERG-CM, as determined by heterologous expression, in vitro translation, and polysomal profiling. Trafficking efficiency to the cell surface was greatly enhanced, as assayed by immunofluorescence, subcellular fractionation, and surface labeling. Chimeras of HERG-NT/CM indicated that trafficking efficiency was largely dependent on 5′ sequences, while translation efficiency involved multiple areas. These results suggest that HERG translation and trafficking rates are independently governed by noncoding information in various regions of the mRNA molecule. Noncoding information embedded within the mRNA may play a role in the pathogenesis of hereditary arrhythmia syndromes and could provide an avenue for targeted therapeutics.—Sroubek, J., Krishnan, Y., McDonald, T V. Sequence- and structure-specific elements of HERG mRNA determine channel synthesis and trafficking efficiency. PMID:23608144

  2. Efficient introduction of specific homozygous and heterozygous mutations using CRISPR/Cas9.

    PubMed

    Paquet, Dominik; Kwart, Dylan; Chen, Antonia; Sproul, Andrew; Jacob, Samson; Teo, Shaun; Olsen, Kimberly Moore; Gregg, Andrew; Noggle, Scott; Tessier-Lavigne, Marc

    2016-05-05

    The bacterial CRISPR/Cas9 system allows sequence-specific gene editing in many organisms and holds promise as a tool to generate models of human diseases, for example, in human pluripotent stem cells. CRISPR/Cas9 introduces targeted double-stranded breaks (DSBs) with high efficiency, which are typically repaired by non-homologous end-joining (NHEJ) resulting in nonspecific insertions, deletions or other mutations (indels). DSBs may also be repaired by homology-directed repair (HDR) using a DNA repair template, such as an introduced single-stranded oligo DNA nucleotide (ssODN), allowing knock-in of specific mutations. Although CRISPR/Cas9 is used extensively to engineer gene knockouts through NHEJ, editing by HDR remains inefficient and can be corrupted by additional indels, preventing its widespread use for modelling genetic disorders through introducing disease-associated mutations. Furthermore, targeted mutational knock-in at single alleles to model diseases caused by heterozygous mutations has not been reported. Here we describe a CRISPR/Cas9-based genome-editing framework that allows selective introduction of mono- and bi-allelic sequence changes with high efficiency and accuracy. We show that HDR accuracy is increased dramatically by incorporating silent CRISPR/Cas-blocking mutations along with pathogenic mutations, and establish a method termed 'CORRECT' for scarless genome editing. By characterizing and exploiting a stereotyped inverse relationship between a mutation's incorporation rate and its distance to the DSB, we achieve predictable control of zygosity. Homozygous introduction requires a guide RNA targeting close to the intended mutation, whereas heterozygous introduction can be accomplished by distance-dependent suboptimal mutation incorporation or by use of mixed repair templates. Using this approach, we generated human induced pluripotent stem cells with heterozygous and homozygous dominant early onset Alzheimer's disease-causing mutations in amyloid precursor protein (APP(Swe)) and presenilin 1 (PSEN1(M146V)) and derived cortical neurons, which displayed genotype-dependent disease-associated phenotypes. Our findings enable efficient introduction of specific sequence changes with CRISPR/Cas9, facilitating study of human disease.

  3. Evaluation of the Bacterial Diversity in the Human Tongue Coating Based on Genus-Specific Primers for 16S rRNA Sequencing.

    PubMed

    Sun, Beili; Zhou, Dongrui; Tu, Jing; Lu, Zuhong

    2017-01-01

    The characteristics of tongue coating are very important symbols for disease diagnosis in traditional Chinese medicine (TCM) theory. As a habitat of oral microbiota, bacteria on the tongue dorsum have been proved to be the cause of many oral diseases. The high-throughput next-generation sequencing (NGS) platforms have been widely applied in the analysis of bacterial 16S rRNA gene. We developed a methodology based on genus-specific multiprimer amplification and ligation-based sequencing for microbiota analysis. In order to validate the efficiency of the approach, we thoroughly analyzed six tongue coating samples from lung cancer patients with different TCM types, and more than 600 genera of bacteria were detected by this platform. The results showed that ligation-based parallel sequencing combined with enzyme digestion and multiamplification could expand the effective length of sequencing reads and could be applied in the microbiota analysis.

  4. Linear and exponential TAIL-PCR: a method for efficient and quick amplification of flanking sequences adjacent to Tn5 transposon insertion sites.

    PubMed

    Jia, Xianbo; Lin, Xinjian; Chen, Jichen

    2017-11-02

    Current genome walking methods are very time consuming, and many produce non-specific amplification products. To amplify the flanking sequences that are adjacent to Tn5 transposon insertion sites in Serratia marcescens FZSF02, we developed a genome walking method based on TAIL-PCR. This PCR method added a 20-cycle linear amplification step before the exponential amplification step to increase the concentration of the target sequences. Products of the linear amplification and the exponential amplification were diluted 100-fold to decrease the concentration of the templates that cause non-specific amplification. Fast DNA polymerase with a high extension speed was used in this method, and an amplification program was used to rapidly amplify long specific sequences. With this linear and exponential TAIL-PCR (LETAIL-PCR), we successfully obtained products larger than 2 kb from Tn5 transposon insertion mutant strains within 3 h. This method can be widely used in genome walking studies to amplify unknown sequences that are adjacent to known sequences.

  5. Shifted Transversal Design smart-pooling for high coverage interactome mapping

    PubMed Central

    Xin, Xiaofeng; Rual, Jean-François; Hirozane-Kishikawa, Tomoko; Hill, David E.; Vidal, Marc; Boone, Charles; Thierry-Mieg, Nicolas

    2009-01-01

    “Smart-pooling,” in which test reagents are multiplexed in a highly redundant manner, is a promising strategy for achieving high efficiency, sensitivity, and specificity in systems-level projects. However, previous applications relied on low redundancy designs that do not leverage the full potential of smart-pooling, and more powerful theoretical constructions, such as the Shifted Transversal Design (STD), lack experimental validation. Here we evaluate STD smart-pooling in yeast two-hybrid (Y2H) interactome mapping. We employed two STD designs and two established methods to perform ORFeome-wide Y2H screens with 12 baits. We found that STD pooling achieves similar levels of sensitivity and specificity as one-on-one array-based Y2H, while the costs and workloads are divided by three. The screening-sequencing approach is the most cost- and labor-efficient, yet STD identifies about twofold more interactions. Screening-sequencing remains an appropriate method for quickly producing low-coverage interactomes, while STD pooling appears as the method of choice for obtaining maps with higher coverage. PMID:19447967
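
    A minimal sketch of how a shifted transversal layout can be built, assuming the usual description of STD: a prime number of pools per layer `q`, and each item assigned in layer `j` to the pool given by a polynomial in `j` over the base-`q` digits of the item index. The function names and the toy parameters (9 items, 3 pools, 3 layers) are hypothetical and are not the designs used in the study.

```python
from itertools import combinations


def std_pools(n_items, q, n_layers):
    """Return n_layers layers, each a list of q pools (lists of item indices).

    q must be prime and n_layers <= q for the co-occurrence bound to hold.
    """
    # Compression power: smallest gamma such that q**(gamma + 1) >= n_items.
    gamma = 0
    while q ** (gamma + 1) < n_items:
        gamma += 1
    layers = []
    for j in range(n_layers):
        pools = [[] for _ in range(q)]
        for k in range(n_items):
            # Pool index = sum over c of j**c * (c-th base-q digit of k), mod q.
            shift = sum((j ** c) * ((k // q ** c) % q) for c in range(gamma + 1))
            pools[shift % q].append(k)
        layers.append(pools)
    return layers


def max_shared_pools(layers, n_items):
    """Largest number of pools that any two distinct items have in common."""
    best = 0
    for a, b in combinations(range(n_items), 2):
        shared = sum(1 for layer in layers for pool in layer
                     if a in pool and b in pool)
        best = max(best, shared)
    return best


# Hypothetical toy design: 9 items, 3 pools per layer, 3 layers.
layers = std_pools(9, 3, 3)
print(layers[0])                    # layer 0 groups items by k mod 3
print(max_shared_pools(layers, 9))  # 1
```

    Because two distinct items rarely share a pool, the pattern of positive pools can identify positives even when each item is tested several times in different company, which is the redundancy the abstract refers to.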

  6. Engineering peptide ligase specificity by proteomic identification of ligation sites.

    PubMed

    Weeks, Amy M; Wells, James A

    2018-01-01

    Enzyme-catalyzed peptide ligation is a powerful tool for site-specific protein bioconjugation, but stringent enzyme-substrate specificity limits its utility. We developed an approach for comprehensively characterizing peptide ligase specificity for N termini using proteome-derived peptide libraries. We used this strategy to characterize the ligation efficiency for >25,000 enzyme-substrate pairs in the context of the engineered peptide ligase subtiligase and identified a family of 72 mutant subtiligases with activity toward N-terminal sequences that were previously recalcitrant to modification. We applied these mutants individually for site-specific bioconjugation of purified proteins, including antibodies, and in algorithmically selected combinations for sequencing of the cellular N terminome with reduced sequence bias. We also developed a web application to enable algorithmic selection of the most efficient subtiligase variant(s) for bioconjugation to user-defined sequences. Our methods provide a new toolbox of enzymes for site-specific protein modification and a general approach for rapidly defining and engineering peptide ligase specificity.

  7. An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1

    PubMed Central

    2010-01-01

    Background: HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods: A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student t-test and Bonferroni Dunn correction test were used to assess statistical significance of our findings. Results: Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading to RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related with higher silencing efficiency were hardly accomplished in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions: HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
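
    The idea of scanning an alignment for dominant and subdominant variants can be sketched as follows. This is an illustration in Python rather than the Bioperl-based algorithm the abstract describes; the function names and the toy alignment are hypothetical, and free-energy or silencing-score calculations are not modelled.

```python
from collections import Counter
from math import log2


def window_variants(aligned_seqs, start, length):
    """Count each distinct target-site variant found in one alignment window."""
    return Counter(seq[start:start + length] for seq in aligned_seqs)


def top_variant_coverage(aligned_seqs, start, length, n_variants=1):
    """Fraction of sequences matched by the n most frequent variants in the window."""
    counts = window_variants(aligned_seqs, start, length)
    covered = sum(n for _, n in counts.most_common(n_variants))
    return covered / len(aligned_seqs)


def column_entropy(aligned_seqs, column):
    """Shannon entropy (bits) of one alignment column; 0 means fully conserved."""
    counts = Counter(seq[column] for seq in aligned_seqs)
    total = sum(counts.values())
    return sum(-(n / total) * log2(n / total) for n in counts.values())


# Hypothetical toy alignment of five isolates.
alignment = ["ACGTAC", "ACGTAC", "ACGAAC", "ACGTAC", "ACCTAC"]
print(window_variants(alignment, 0, 4).most_common(1))  # dominant variant in the window
print(top_variant_coverage(alignment, 0, 4, n_variants=1))
print(top_variant_coverage(alignment, 0, 4, n_variants=2))
```

    Comparing coverage for one variant against coverage for several gives a rough sense of how much a cocktail adds over a single shRNA per region.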

  8. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  9. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Transcriptome analysis of Pseudomonas syringae identifies new genes, ncRNAs, and antisense activity

    USDA-ARS's Scientific Manuscript database

    To fully understand how bacteria respond to their environment, it is essential to assess genome-wide transcriptional activity. New high throughput sequencing technologies make it possible to query the transcriptome of an organism in an efficient unbiased manner. We applied a strand-specific method t...

  11. Allele-specific copy-number discovery from whole-genome and whole-exome sequencing

    PubMed Central

    Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J.; Szatkiewicz, Jin P.

    2015-01-01

    Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol is freely available at https://sourceforge.net/projects/asgenseng/. PMID:25883151

  12. A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library

    PubMed Central

    Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

    2013-01-01

    DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in varieties of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we described a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were performed with high successful rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
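
    The library size follows from simple counting: four possible bases at each of four positions gives 4^4 = 256 tetramers. A short sketch of the enumeration, and of splitting a target into tetramer-sized blocks, is shown below. The target sequence and function names are hypothetical, and the sketch assumes a target length that is a multiple of four.

```python
from itertools import product

BASES = "ACGT"


def tetramer_library():
    """All ordered combinations of four bases, one per library entry."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def split_into_tetramers(target):
    """Cut a target sequence into consecutive 4-base blocks."""
    if len(target) % 4 != 0:
        raise ValueError("target length must be a multiple of 4")
    return [target[i:i + 4] for i in range(0, len(target), 4)]


library = tetramer_library()
print(len(library))  # 256

# Hypothetical 12-base target: every block should be present in the library.
blocks = split_into_tetramers("ACGTTGCAACGT")
print(blocks)  # ['ACGT', 'TGCA', 'ACGT']
print(all(block in library for block in blocks))  # True
```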

  13. A simple and efficient method for assembling TALE protein based on plasmid library.

    PubMed

    Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

    2013-01-01

    DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in varieties of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we described a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were performed with high successful rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.

  14. Gene calling and bacterial genome annotation with BG7.

    PubMed

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related with biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is sequence error tolerant maintaining their capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (as those obtained through metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  15. Efficient generation of mouse models of human diseases via ABE- and BE-mediated base editing.

    PubMed

    Liu, Zhen; Lu, Zongyang; Yang, Guang; Huang, Shisheng; Li, Guanglei; Feng, Songjie; Liu, Yajing; Li, Jianan; Yu, Wenxia; Zhang, Yu; Chen, Jia; Sun, Qiang; Huang, Xingxu

    2018-06-14

    A recently developed adenine base editor (ABE) efficiently converts A to G and is potentially useful for clinical applications. However, its precision and efficiency in vivo remains to be addressed. Here we achieve A-to-G conversion in vivo at frequencies up to 100% by microinjection of ABE mRNA together with sgRNAs. We then generate mouse models harboring clinically relevant mutations at Ar and Hoxd13, which recapitulates respective clinical defects. Furthermore, we achieve both C-to-T and A-to-G base editing by using a combination of ABE and SaBE3, thus creating mouse model harboring multiple mutations. We also demonstrate the specificity of ABE by deep sequencing and whole-genome sequencing (WGS). Taken together, ABE is highly efficient and precise in vivo, making it feasible to model and potentially cure relevant genetic diseases.

  16. Bimodal imprint chips for peptide screening: integration of high-throughput sequencing by MS and affinity analyses by surface plasmon resonance imaging.

    PubMed

    Wang, Weizhi; Li, Menglin; Wei, Zewen; Wang, Zihua; Bu, Xiangli; Lai, Wenjia; Yang, Shu; Gong, He; Zheng, Hui; Wang, Yuqiao; Liu, Ying; Li, Qin; Fang, Qiaojun; Hu, Zhiyuan

    2014-04-15

    Peptide probes and drugs have widespread applications in disease diagnostics and therapy. The demand for peptide ligands with high affinity and high specificity toward various targets has surged in the biomedical field in recent years. The traditional peptide screening procedure involves selection, sequencing, and characterization steps, and each step is manual and tedious. Herein, we developed a bimodal imprint microarray system to embrace the whole peptide screening process. Silver-sputtered silicon chip fabricated with microwell array can trap and pattern the candidate peptide beads in a one-well-one-bead manner. Peptides on beads were photocleaved in situ. A portion of the peptide in each well was transferred to a gold-coated chip to print the peptide array for high-throughput affinity analyses by surface plasmon resonance imaging (SPRi), and the peptide left in the silver-sputtered chip was ready for in situ single bead sequencing by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS). Using the bimodal imprint chip system, affinity peptides toward AHA were efficiently screened out from the 7 × 10^4 peptide library. The method provides a solution for high efficiency peptide screening.

  17. Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases.

    PubMed

    Citorik, Robert J; Mimee, Mark; Lu, Timothy K

    2014-11-01

    Current antibiotics tend to be broad spectrum, leading to indiscriminate killing of commensal bacteria and accelerated evolution of drug resistance. Here, we use CRISPR-Cas technology to create antimicrobials whose spectrum of activity is chosen by design. RNA-guided nucleases (RGNs) targeting specific DNA sequences are delivered efficiently to microbial populations using bacteriophage or bacteria carrying plasmids transmissible by conjugation. The DNA targets of RGNs can be undesirable genes or polymorphisms, including antibiotic resistance and virulence determinants in carbapenem-resistant Enterobacteriaceae and enterohemorrhagic Escherichia coli. Delivery of RGNs significantly improves survival in a Galleria mellonella infection model. We also show that RGNs enable modulation of complex bacterial populations by selective knockdown of targeted strains based on genetic signatures. RGNs constitute a class of highly discriminatory, customizable antimicrobials that enact selective pressure at the DNA level to reduce the prevalence of undesired genes, minimize off-target effects and enable programmable remodeling of microbiota.

  18. Site- and strand-specific nicking of DNA by fusion proteins derived from MutH and I-SceI or TALE repeats.

    PubMed

    Gabsalilow, Lilia; Schierling, Benno; Friedhoff, Peter; Pingoud, Alfred; Wende, Wolfgang

    2013-04-01

    Targeted genome engineering requires nucleases that introduce a highly specific double-strand break in the genome that is either processed by homology-directed repair in the presence of a homologous repair template or by non-homologous end-joining (NHEJ) that usually results in insertions or deletions. The error-prone NHEJ can be efficiently suppressed by 'nickases' that produce a single-strand break rather than a double-strand break. Highly specific nickases have been produced by engineering of homing endonucleases and more recently by modifying zinc finger nucleases (ZFNs) composed of a zinc finger array and the catalytic domain of the restriction endonuclease FokI. These ZF-nickases work as heterodimers in which one subunit has a catalytically inactive FokI domain. We present two different approaches to engineer highly specific nickases; both rely on the sequence-specific nicking activity of the DNA mismatch repair endonuclease MutH which we fused to a DNA-binding module, either a catalytically inactive variant of the homing endonuclease I-SceI or the DNA-binding domain of the TALE protein AvrBs4. The fusion proteins nick strand specifically a bipartite recognition sequence consisting of the MutH and the I-SceI or TALE recognition sequences, respectively, with a more than 1000-fold preference over a stand-alone MutH site. TALE-MutH is a programmable nickase.

  19. Efficient error correction for next-generation sequencing of viral amplicons

    PubMed Central

    2012-01-01

    Background: Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. Results: In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Conclusions: Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm PMID:22759430
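
    The two general ideas named in the abstract, k-mer counting and a frequency cut-off on haplotypes, can be illustrated with a simplified sketch. This is not the published KEC or ET implementation: the function names, the fixed thresholds, and the toy reads are all hypothetical, and the sketch only flags suspect positions rather than correcting them.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmer_positions(read, counts, k, min_count):
    """Start positions in `read` whose k-mer is rarer than `min_count`."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]


def filter_haplotypes(reads, min_fraction):
    """Keep distinct reads whose relative frequency reaches `min_fraction`."""
    counts = Counter(reads)
    total = len(reads)
    return {hap: n / total for hap, n in counts.items()
            if n / total >= min_fraction}


# Hypothetical toy data: nine identical reads and one read with a substitution.
reads = ["ACGTACGT"] * 9 + ["ACGTTCGT"]
counts = kmer_counts(reads, 4)
print(weak_kmer_positions("ACGTTCGT", counts, 4, min_count=2))  # [1, 2, 3, 4]
print(filter_haplotypes(reads, min_fraction=0.2))  # {'ACGTACGT': 0.9}
```

    Rare k-mers cluster around the substituted base, which is the kind of signal a k-mer-based method can use to localise errors; the frequency filter simply discards low-abundance haplotypes.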

  20. Efficient error correction for next-generation sequencing of viral amplicons.

    PubMed

    Skums, Pavel; Dimitrova, Zoya; Campo, David S; Vaughan, Gilberto; Rossi, Livia; Forbi, Joseph C; Yokosawa, Jonny; Zelikovsky, Alex; Khudyakov, Yury

    2012-06-25

    Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm.

  1. Programmable DNA-Guided Artificial Restriction Enzymes.

    PubMed

    Enghiad, Behnam; Zhao, Huimin

    2017-05-19

    Restriction enzymes are essential tools for recombinant DNA technology that have revolutionized modern biological research. However, they have limited sequence specificity and availability. Here we report a Pyrococcus furiosus Argonaute (PfAgo) based platform for generating artificial restriction enzymes (AREs) capable of recognizing and cleaving DNA sequences at virtually any arbitrary site and generating defined sticky ends of varying length. Short DNA guides are used to direct PfAgo to target sites for cleavage at high temperatures (>87 °C) followed by reannealing of the cleaved single stranded DNAs. We used this platform to generate over 18 AREs for DNA fingerprinting and molecular cloning of PCR-amplified or genomic DNAs. These AREs work as efficiently as their naturally occurring counterparts, and some of them even do not have any naturally occurring counterparts, demonstrating easy programmability, generality, versatility, and high efficiency for this new technology.

  2. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    PubMed Central

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
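
    The reported non-redundancy is consistent with the ratio of unigenes to sequenced clones given in the abstract, assuming that is how the figure was derived:

```latex
\[
\text{non-redundancy} = \frac{\text{unigenes}}{\text{sequenced clones}}
= \frac{3320}{3875} \approx 0.857 = 85.7\%
\]
```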

  3. Large scale RNAi screen in Tribolium reveals novel target genes for pest control and the proteasome as prime target.

    PubMed

    Ulrich, Julia; Dao, Van Anh; Majumdar, Upalparna; Schmitt-Engel, Christian; Schwirz, Jonas; Schultheis, Dorothea; Ströhlein, Nadi; Troelenberg, Nicole; Grossmann, Daniela; Richter, Tobias; Dönitz, Jürgen; Gerischer, Lizzy; Leboulle, Gérard; Vilcinskas, Andreas; Stanke, Mario; Bucher, Gregor

    2015-09-03

    Insect pest control is challenged by insecticide resistance and negative impact on ecology and health. One promising pest specific alternative is the generation of transgenic plants, which express double stranded RNAs targeting essential genes of a pest species. Upon feeding, the dsRNA induces gene silencing in the pest resulting in its death. However, the identification of efficient RNAi target genes remains a major challenge as genomic tools and breeding capacity is limited in most pest insects impeding whole-animal-high-throughput-screening. We use the red flour beetle Tribolium castaneum as a screening platform in order to identify the most efficient RNAi target genes. From about 5,000 randomly screened genes of the iBeetle RNAi screen we identify 11 novel and highly efficient RNAi targets. Our data allowed us to determine GO term combinations that are predictive for efficient RNAi target genes with proteasomal genes being most predictive. Finally, we show that RNAi target genes do not appear to act synergistically and that protein sequence conservation does not correlate with the number of potential off target sites. Our results will aid the identification of RNAi target genes in many pest species by providing a manageable number of excellent candidate genes to be tested and the proteasome as prime target. Further, the identified GO term combinations will help to identify efficient target genes from organ specific transcriptomes. Our off target analysis is relevant for the sequence selection used in transgenic plants.

  4. GRIDSS: sensitive and specific genomic rearrangement detection using positional de Bruijn graph assembly

    PubMed Central

    Do, Hongdo; Molania, Ramyar

    2017-01-01

    The identification of genomic rearrangements with high sensitivity and specificity using massively parallel sequencing remains a major challenge, particularly in precision medicine and cancer research. Here, we describe a new method for detecting rearrangements, GRIDSS (Genome Rearrangement IDentification Software Suite). GRIDSS is a multithreaded structural variant (SV) caller that performs efficient genome-wide break-end assembly prior to variant calling using a novel positional de Bruijn graph-based assembler. By combining assembly, split read, and read pair evidence using a probabilistic scoring, GRIDSS achieves high sensitivity and specificity on simulated, cell line, and patient tumor data, recently winning SV subchallenge #5 of the ICGC-TCGA DREAM8.5 Somatic Mutation Calling Challenge. On human cell line data, GRIDSS halves the false discovery rate compared to other recent methods while matching or exceeding their sensitivity. GRIDSS identifies nontemplate sequence insertions, microhomologies, and large imperfect homologies, estimates a quality score for each breakpoint, stratifies calls into high or low confidence, and supports multisample analysis. PMID:29097403
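
    A rough sketch of what distinguishes a positional de Bruijn graph from an ordinary one: each node pairs a k-mer with the position at which it is expected to start, so identical k-mers from different loci stay separate. This is an illustration only and not GRIDSS code; the function name, the anchored-read representation, and the toy reads are hypothetical.

```python
from collections import defaultdict


def positional_debruijn(anchored_reads, k):
    """Build weighted edges between (k-mer, position) nodes.

    anchored_reads: iterable of (expected_start_position, sequence) pairs.
    Returns a dict mapping (source_node, target_node) to supporting read count.
    """
    edges = defaultdict(int)
    for start, seq in anchored_reads:
        for i in range(len(seq) - k):
            src = (seq[i:i + k], start + i)
            dst = (seq[i + 1:i + k + 1], start + i + 1)
            edges[(src, dst)] += 1
    return edges


# Hypothetical reads: two overlap near position 100, one repeats the same
# sequence at position 500 and must not be merged with the first locus.
reads = [(100, "ACGTA"), (101, "CGTAC"), (500, "ACGTA")]
graph = positional_debruijn(reads, 3)
print(len(graph))                               # 5 distinct edges
print(graph[(("CGT", 101), ("GTA", 102))])      # 2 reads support this edge
```

    In a plain de Bruijn graph the read at position 500 would collapse onto the same nodes as the read at position 100; keeping the position in the node avoids that.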

  5. DNA-Templated Polymerization of Side-Chain-Functionalized Peptide Nucleic Acid Aldehydes

    PubMed Central

    Kleiner, Ralph E.; Brudno, Yevgeny; Birnbaum, Michael E.; Liu, David R.

    2009-01-01

    The DNA-templated polymerization of synthetic building blocks provides a potential route to the laboratory evolution of sequence-defined polymers with structures and properties not necessarily limited to those of natural biopolymers. We previously reported the efficient and sequence-specific DNA-templated polymerization of peptide nucleic acid (PNA) aldehydes. Here, we report the enzyme-free, DNA-templated polymerization of side-chain-functionalized PNA tetramer and pentamer aldehydes. We observed that the polymerization of tetramer and pentamer PNA building blocks with a single lysine-based side chain at various positions in the building block could proceed efficiently and sequence-specifically. In addition, DNA-templated polymerization also proceeded efficiently and in a sequence-specific manner with pentamer PNA aldehydes containing two or three lysine side chains in a single building block to generate more densely functionalized polymers. To further our understanding of side-chain compatibility and expand the capabilities of this system, we also examined the polymerization efficiencies of 20 pentamer building blocks each containing one of five different side-chain groups and four different side-chain regio- and stereochemistries. Polymerization reactions were efficient for all five different side-chain groups and for three of the four combinations of side-chain regio- and stereochemistries. Differences in the efficiency and initial rate of polymerization correlate with the apparent melting temperature of each building block, which is dependent on side-chain regio- and stereochemistry, but relatively insensitive to side-chain structure among the substrates tested. 
    Our findings represent a significant step towards the evolution of sequence-defined synthetic polymers and also demonstrate that enzyme-free nucleic acid-templated polymerization can occur efficiently using substrates with a wide range of side-chain structures, functionalization positions within each building block, and functionalization densities. PMID:18341334

  6. An Efficient Method for High-Fidelity BAC/PAC Retrofitting with a Selectable Marker for Mammalian Cell Transfection

    PubMed Central

    Wang, Zunde; Engler, Peter; Longacre, Angelika; Storb, Ursula

    2001-01-01

    Large-scale genomic sequencing projects have provided DNA sequence information for many genes, but the biological functions for most of them will only be known through functional studies. Bacterial artificial chromosomes (BACs) and P1-derived artificial chromosomes (PACs) are large genomic clones stably maintained in bacteria and are very important in functional studies through transfection because of their large size and stability. Because most BAC or PAC vectors do not have a mammalian selection marker, transfecting mammalian cells with genes cloned in BACs or PACs requires the insertion into the BAC/PAC of a mammalian selectable marker. However, currently available procedures are not satisfactory in efficiency and fidelity. We describe a very simple and efficient procedure that allows one to retrofit dozens of BACs in a day with no detectable deletions or unwanted recombination. We use a BAC/PAC retrofitting vector that, on transformation into competent BAC or PAC strains, will catalyze the specific insertion of itself into BAC/PAC vectors through in vivo cre/loxP site-specific recombination. PMID:11156622

  7. High-throughput identification of antigen-specific TCRs by TCR gene capture.

    PubMed

    Linnemann, Carsten; Heemskerk, Bianca; Kvistborg, Pia; Kluin, Roelof J C; Bolotin, Dmitriy A; Chen, Xiaojing; Bresser, Kaspar; Nieuwland, Marja; Schotte, Remko; Michels, Samira; Gomez-Eerland, Raquel; Jahn, Lorenz; Hombrink, Pleun; Legrand, Nicolas; Shu, Chengyi Jenny; Mamedov, Ilgar Z; Velds, Arno; Blank, Christian U; Haanen, John B A G; Turchaninova, Maria A; Kerkhoven, Ron M; Spits, Hergen; Hadrup, Sine Reker; Heemskerk, Mirjam H M; Blankenstein, Thomas; Chudakov, Dmitriy M; Bendle, Gavin M; Schumacher, Ton N M

    2013-11-01

    The transfer of T cell receptor (TCR) genes into patient T cells is a promising approach for the treatment of both viral infections and cancer. Although efficient methods exist to identify antibodies for the treatment of these diseases, comparable strategies to identify TCRs have been lacking. We have developed a high-throughput DNA-based strategy to identify TCR sequences by the capture and sequencing of genomic DNA fragments encoding the TCR genes. We establish the value of this approach by assembling a large library of cancer germline tumor antigen-reactive TCRs. Furthermore, by exploiting the quantitative nature of TCR gene capture, we show the feasibility of identifying antigen-specific TCRs in oligoclonal T cell populations from either human material or TCR-humanized mice. Finally, we demonstrate the ability to identify tumor-reactive TCRs within intratumoral T cell subsets without knowledge of antigen specificities, which may be the first step toward the development of autologous TCR gene therapy to target patient-specific neoantigens in human cancer.

  8. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    PubMed

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files of more than 1 GB, showed that HAlign-II could save time and space and that it outperformed current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  9. Allele-specific copy-number discovery from whole-genome and whole-exome sequencing.

    PubMed

    Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J; Szatkiewicz, Jin P

    2015-08-18

    Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol is freely available at https://sourceforge.net/projects/asgenseng/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. How Changes in Anti-SD Sequences Would Affect SD Sequences in Escherichia coli and Bacillus subtilis.

    PubMed

    Abolbaghaei, Akram; Silke, Jordan R; Xia, Xuhua

    2017-05-05

    The 3' end of the small ribosomal RNAs (ssu rRNA) in bacteria is directly involved in the selection and binding of mRNA transcripts during translation initiation via well-documented interactions between a Shine-Dalgarno (SD) sequence located upstream of the initiation codon and an anti-SD (aSD) sequence at the 3' end of the ssu rRNA. Consequently, the 3' end of ssu rRNA (3'TAIL) is strongly conserved among bacterial species because a change in the region may impact the translation of many protein-coding genes. Escherichia coli and Bacillus subtilis differ in their 3' ends of ssu rRNA, being GAUC ACCUCCUUA 3' in E. coli and GAUC ACCUCCUU UCU 3' or GAUC ACCUCCUU UCUA 3' in B. subtilis. Such differences in 3'TAIL lead to species-specific SDs (designated SD Ec for E. coli and SD Bs for B. subtilis) that can form strong and well-positioned SD/aSD pairing in one species but not in the other. Selection mediated by the species-specific 3'TAIL is expected to favor SD Bs against SD Ec in B. subtilis, but favor SD Ec against SD Bs in E. coli. Among well-positioned SDs, SD Ec is used more in E. coli than in B. subtilis, and SD Bs more in B. subtilis than in E. coli. Highly expressed genes and genes of high translation efficiency tend to have longer SDs than lowly expressed genes and genes with low translation efficiency in both species, but more so in B. subtilis than in E. coli. Both species overuse SDs matching the bolded part of the 3'TAIL shown above. The 3'TAIL difference contributes to the host specificity of phages. Copyright © 2017 Abolbaghaei et al.
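
    The species-specific pairing described above can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' analysis: only Watson-Crick pairs are counted (G:U wobble and the spacing to the start codon are ignored), the two tails are the sequences quoted in the abstract with the spaces removed, and the function names and the candidate sequence are hypothetical.

```python
# Watson-Crick pairs only; G:U wobble pairing is deliberately ignored here.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 3' tails of the ssu rRNA as quoted in the abstract, with spacing removed.
TAIL_EC = "GAUCACCUCCUUA"    # E. coli
TAIL_BS = "GAUCACCUCCUUUCU"  # B. subtilis (shorter of the two variants)


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(PAIR[base] for base in reversed(rna))


def longest_sd_match(upstream, tail):
    """Longest stretch of `upstream` (mRNA, 5'->3') that can pair perfectly
    and contiguously with some part of `tail` (rRNA, 5'->3')."""
    target = reverse_complement(tail)
    best = ""
    for i in range(len(upstream)):
        # Only try substrings longer than the best match found so far.
        for j in range(i + len(best) + 1, len(upstream) + 1):
            if upstream[i:j] not in target:
                break
            best = upstream[i:j]
    return best


candidate = "UAAGGAGG"  # hypothetical upstream motif, for illustration only
print("E. coli    :", longest_sd_match(candidate, TAIL_EC))
print("B. subtilis:", longest_sd_match(candidate, TAIL_BS))
```

    For this candidate the contiguous pairing is one base longer against the E. coli tail than against the B. subtilis tail, which is the kind of difference the abstract attributes to the diverging 3' ends.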

  11. The largest subunit of RNA polymerase II as a new marker gene to study assemblages of arbuscular mycorrhizal fungi in the field.

    PubMed

    Stockinger, Herbert; Peyret-Guzzon, Marine; Koegel, Sally; Bouffaud, Marie-Lara; Redecker, Dirk

    2014-01-01

    Due to the potential of arbuscular mycorrhizal fungi (AMF, Glomeromycota) to improve plant growth and soil quality, the influence of agricultural practice on their diversity continues to be an important research question. Up to now studies of community diversity in AMF have exclusively been based on nuclear ribosomal gene regions, which in AMF show high intra-organism polymorphism, seriously complicating interpretation of these data. We designed specific PCR primers for 454 sequencing of a region of the largest subunit of RNA polymerase II gene, and established a new reference dataset comprising all major AMF lineages. This gene is known to be monomorphic within fungal isolates but shows an excellent barcode gap between species. We designed a primer set to amplify all known lineages of AMF and demonstrated its applicability in combination with high-throughput sequencing in a long-term tillage experiment. The PCR primers showed a specificity of 99.94% for glomeromycotan sequences. We found evidence of significant shifts of the AMF communities caused by soil management and showed that tillage effects on different AMF taxa are clearly more complex than previously thought. The high resolving power of high-throughput sequencing highlights the need for quantitative measurements to efficiently detect these effects.

  12. CoCoNUT: an efficient system for the comparison and analysis of genomes

    PubMed Central

    2008-01-01

    Background: Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results: Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion: CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics. PMID:19014477

  13. Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

    PubMed

    Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

    2017-01-03

    Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
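
    The role of molecular barcodes in suppressing amplification and sequencing errors can be sketched in a few lines of Python. This is a deliberately simple majority-vote consensus, not smCounter's Bayesian model; the function names, thresholds, and toy reads are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2, min_agreement=0.8):
    """Collapse reads at one locus into one consensus base per barcode.

    `reads` is an iterable of (barcode, base) pairs. Barcodes with fewer
    than `min_reads` reads, or without a clear majority base, are dropped.
    """
    groups = defaultdict(list)
    for barcode, base in reads:
        groups[barcode].append(base)

    consensus = {}
    for barcode, bases in groups.items():
        if len(bases) < min_reads:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def allele_fraction(consensus, alt):
    """Fraction of barcode families whose consensus base equals `alt`."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt) / len(consensus)


# Toy reads (hypothetical): m2 carries one sequencing error, m4 is a singleton.
reads = (
    [("m1", "A")] * 5
    + [("m2", "A")] * 9 + [("m2", "T")]
    + [("m3", "T")] * 3
    + [("m4", "T")]
)
families = consensus_by_barcode(reads)
print(families, allele_fraction(families, "T"))
```

    Counting barcode families rather than raw reads means the single erroneous T in family m2 and the unsupported singleton m4 do not inflate the estimated allele fraction.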

  14. Biotechnological applications of mobile group II introns and their reverse transcriptases: gene targeting, RNA-seq, and non-coding RNA analysis.

    PubMed

    Enyeart, Peter J; Mohr, Georg; Ellington, Andrew D; Lambowitz, Alan M

    2014-01-13

    Mobile group II introns are bacterial retrotransposons that combine the activities of an autocatalytic intron RNA (a ribozyme) and an intron-encoded reverse transcriptase to insert site-specifically into DNA. They recognize DNA target sites largely by base pairing of sequences within the intron RNA and achieve high DNA target specificity by using the ribozyme active site to couple correct base pairing to RNA-catalyzed intron integration. Algorithms have been developed to program the DNA target site specificity of several mobile group II introns, allowing them to be made into 'targetrons.' Targetrons function for gene targeting in a wide variety of bacteria and typically integrate at efficiencies high enough to be screened easily by colony PCR, without the need for selectable markers. Targetrons have found wide application in microbiological research, enabling gene targeting and genetic engineering of bacteria that had been intractable to other methods. Recently, a thermostable targetron has been developed for use in bacterial thermophiles, and new methods have been developed for using targetrons to position recombinase recognition sites, enabling large-scale genome-editing operations, such as deletions, inversions, insertions, and 'cut-and-pastes' (that is, translocation of large DNA segments), in a wide range of bacteria at high efficiency. Using targetrons in eukaryotes presents challenges due to the difficulties of nuclear localization and sub-optimal magnesium concentrations, although supplementation with magnesium can increase integration efficiency, and directed evolution is being employed to overcome these barriers. Finally, spurred by new methods for expressing group II intron reverse transcriptases that yield large amounts of highly active protein, thermostable group II intron reverse transcriptases from bacterial thermophiles are being used as research tools for a variety of applications, including qRT-PCR and next-generation RNA sequencing (RNA-seq). 
    The high processivity and fidelity of group II intron reverse transcriptases along with their novel template-switching activity, which can directly link RNA-seq adaptor sequences to cDNAs during reverse transcription, open new approaches for RNA-seq and the identification and profiling of non-coding RNAs, with potentially wide applications in research and biotechnology.

  15. Unexpected substrate specificity of T4 DNA ligase revealed by in vitro selection

    NASA Technical Reports Server (NTRS)

    Harada, Kazuo; Orgel, Leslie E.

    1993-01-01

    We have used in vitro selection techniques to characterize DNA sequences that are ligated efficiently by T4 DNA ligase. We find that the ensemble of selected sequences ligates about 50 times as efficiently as the random mixture of sequences used as the input for selection. Surprisingly, many of the selected sequences failed to produce a match at or close to the ligation junction. None of the 20 selected oligomers that we sequenced produced a match two bases upstream from the ligation junction.

  16. Use of tuf Sequences for Genus-Specific PCR Detection and Phylogenetic Analysis of 28 Streptococcal Species

    PubMed Central

    Picard, François J.; Ke, Danbing; Boudreau, Dominique K.; Boissinot, Maurice; Huletsky, Ann; Richard, Dave; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.

    2004-01-01

    A 761-bp portion of the tuf gene (encoding the elongation factor Tu) from 28 clinically relevant streptococcal species was obtained by sequencing amplicons generated using broad-range PCR primers. These tuf sequences were used to select Streptococcus-specific PCR primers and to perform phylogenetic analysis. The specificity of the PCR assay was verified using 102 different bacterial species, including the 28 streptococcal species. Genomic DNA purified from all streptococcal species was efficiently detected, whereas there was no amplification with DNA from 72 of the 74 nonstreptococcal bacterial species tested. There was cross-amplification with DNAs from Enterococcus durans and Lactococcus lactis. However, the 15 to 31% nucleotide sequence divergence in the 761-bp tuf portion of these two species compared to any streptococcal tuf sequence provides ample sequence divergence to allow the development of internal probes specific to streptococci. The Streptococcus-specific assay was highly sensitive for all 28 streptococcal species tested (i.e., detection limit of 1 to 10 genome copies per PCR). The tuf sequence data was also used to perform extensive phylogenetic analysis, which was generally in agreement with phylogeny determined on the basis of 16S rRNA gene data. However, the tuf gene provided a better discrimination at the streptococcal species level that should be particularly useful for the identification of very closely related species. In conclusion, tuf appears more suitable than the 16S ribosomal RNA gene for the development of diagnostic assays for the detection and identification of streptococcal species because of its higher level of species-specific genetic divergence. PMID:15297518

  17. Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization.

    PubMed

    Girard, Laurie D; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G

    2015-02-07

    The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization are used serially to take advantage of the high sensitivity and specificity of the former combined with high multiplexing capacities of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have high complexity and cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, microarray hybridizations have a sequence dependent efficiency not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique "target sequence-independent" 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. 
    This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets.

  18. Translation efficiency of heterologous proteins is significantly affected by the genetic context of RBS sequences in engineered cyanobacterium Synechocystis sp. PCC 6803.

    PubMed

    Thiel, Kati; Mulaku, Edita; Dandapani, Hariharan; Nagy, Csaba; Aro, Eva-Mari; Kallio, Pauli

    2018-03-02

    Photosynthetic cyanobacteria have been studied as potential host organisms for direct solar-driven production of different carbon-based chemicals from CO2 and water, as part of the development of sustainable future biotechnological applications. The engineering approaches, however, are still limited by the lack of comprehensive information on optimal expression strategies and validated species-specific genetic elements which are essential for increasing the intricacy, predictability and efficiency of the systems. This study focused on the systematic evaluation of the key translational control elements, ribosome binding sites (RBS), in the cyanobacterial host Synechocystis sp. PCC 6803, with the objective of expanding the palette of tools for more rigorous engineering approaches. An expression system was established for the comparison of 13 selected RBS sequences in Synechocystis, using several alternative reporter proteins (sYFP2, codon-optimized GFPmut3 and ethylene forming enzyme) as quantitative indicators of the relative translation efficiencies. The set-up was shown to yield highly reproducible expression patterns in independent analytical series with low variation between biological replicates, thus allowing statistical comparison of the activities of the different RBSs in vivo. While the RBSs covered a relatively broad overall expression level range, the downstream gene sequence was demonstrated in a rigorous manner to have a clear impact on the resulting translational profiles. This was expected to reflect interfering sequence-specific mRNA-level interaction between the RBS and the coding region, yet correlation between potential secondary structure formation and observed translation levels could not be resolved with existing in silico prediction tools. The study expands our current understanding on the potential and limitations associated with the regulation of protein expression at translational level in engineered cyanobacteria.
    The acquired information can be used for selecting appropriate RBSs for optimizing over-expression constructs or multicistronic pathways in Synechocystis, while underlining the complications in predicting the activity due to gene-specific interactions which may reduce the translational efficiency for a given RBS-gene combination. Ultimately, the findings emphasize the need for additional characterized insulator sequence elements to decouple the interaction between the RBS and the coding region for future engineering approaches.

  19. DAMe: a toolkit for the initial processing of datasets with PCR replicates of double-tagged amplicons for DNA metabarcoding analyses.

    PubMed

    Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P

    2016-05-03

    DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. 
    It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
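
    The sorting and filtering steps listed above can be sketched in Python as follows. This is not the DAMe code or its command-line interface: the function names, the tag-to-sample mapping, and the toy amplicons are all hypothetical, and only two of the described filters (copy number and reproducibility across PCR replicates) are shown.

```python
from collections import Counter, defaultdict


def sort_by_tags(amplicons, tag_map):
    """Group amplicons by their (forward tag, reverse tag) combination.

    `amplicons` holds (fwd_tag, rev_tag, sequence) triples and `tag_map`
    maps used tag combinations to (sample, replicate) keys. Returns the
    per-replicate copy numbers and a count of unused tag combinations.
    """
    assigned = defaultdict(Counter)
    unused = Counter()
    for fwd, rev, seq in amplicons:
        key = tag_map.get((fwd, rev))
        if key is None:
            unused[(fwd, rev)] += 1
        else:
            assigned[key][seq] += 1
    return assigned, unused


def filter_sequences(replicates, min_copies=2, min_replicates=2):
    """Keep sequences seen with at least `min_copies` copies in at least
    `min_replicates` of the PCR replicates of one sample."""
    support = Counter()
    for copies_by_seq in replicates:
        for seq, copies in copies_by_seq.items():
            if copies >= min_copies:
                support[seq] += 1
    return {seq for seq, n in support.items() if n >= min_replicates}


# Hypothetical tag design: one sample amplified in two PCR replicates.
tag_map = {("AAC", "GGT"): ("sample1", 1), ("TTG", "CCA"): ("sample1", 2)}
amplicons = (
    [("AAC", "GGT", "ACGT")] * 4 + [("AAC", "GGT", "TTTT")]
    + [("TTG", "CCA", "ACGT")] * 3 + [("TTG", "CCA", "GGGG")] * 2
    + [("AAC", "CCA", "ACGT")]  # tag combination that was never used
)
assigned, unused = sort_by_tags(amplicons, tag_map)
kept = filter_sequences(list(assigned.values()))
print(kept, dict(unused))
```

    In this toy dataset only the sequence that is reproducible across both replicates survives, and the amplicon carrying an unused tag combination is reported separately rather than assigned to a sample.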

  20. Generation of Aptamers from A Primer-Free Randomized ssDNA Library Using Magnetic-Assisted Rapid Aptamer Selection

    NASA Astrophysics Data System (ADS)

    Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih

    2017-04-01

    Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.

  1. Sequence-defined cMET/HGFR-targeted Polymers as Gene Delivery Vehicles for the Theranostic Sodium Iodide Symporter (NIS) Gene

    PubMed Central

    Urnauer, Sarah; Morys, Stephan; Krhac Levacic, Ana; Müller, Andrea M; Schug, Christina; Schmohl, Kathrin A; Schwenk, Nathalie; Zach, Christian; Carlsen, Janette; Bartenstein, Peter; Wagner, Ernst; Spitzweg, Christine

    2016-01-01

    The sodium iodide symporter (NIS) as a well-characterized theranostic gene represents an outstanding tool to target different cancer types, allowing noninvasive imaging of functional NIS expression and therapeutic radioiodide application. Based on its overexpression on the surface of most cancer types, the cMET/hepatocyte growth factor receptor serves as an ideal target for tumor-selective gene delivery. Sequence-defined polymers as nonviral gene delivery vehicles comprising polyethylene glycol (PEG) and cationic (oligoethanoamino) amide cores coupled with a cMET-binding peptide (cMBP2) were complexed with NIS-DNA and tested for receptor-specificity, transduction efficiency, and therapeutic efficacy in hepatocellular cancer cells HuH7. In vitro iodide uptake studies demonstrated high transduction efficiency and cMET-specificity of NIS-encoding polyplexes (cMBP2-PEG-Stp/NIS) compared to polyplexes without targeting ligand (Ala-PEG-Stp/NIS) and without coding DNA (cMBP2-PEG-Stp/Antisense-NIS). Tumor recruitment and vector biodistribution were investigated in vivo in a subcutaneous xenograft mouse model showing high tumor-selective iodide accumulation in cMBP2-PEG-Stp/NIS-treated mice (6.6 ± 1.6% ID/g 123I, biological half-life 3 hours) by 123I-scintigraphy. Therapy studies with three cycles of polyplexes and 131I application resulted in significant delay in tumor growth and prolonged survival. These data demonstrate the enormous potential of cMET-targeted sequence-defined polymers combined with the unique theranostic function of NIS allowing for optimized transfection efficiency while eliminating toxicity. PMID:27157666

  2. Adenine specific DNA chemical sequencing reaction.

    PubMed Central

    Iverson, B L; Dervan, P B

    1987-01-01

    Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067

  3. Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data.

    PubMed

    Althammer, Sonja; González-Vallinas, Juan; Ballaré, Cecilia; Beato, Miguel; Eyras, Eduardo

    2011-12-15

    High-throughput sequencing (HTS) has revolutionized gene regulation studies and is now fundamental for the detection of protein-DNA and protein-RNA binding, as well as for measuring RNA expression. With increasing variety and sequencing depth of HTS datasets, the need for more flexible and memory-efficient tools to analyse them is growing. We describe Pyicos, a powerful toolkit for the analysis of mapped reads from diverse HTS experiments: ChIP-Seq, either punctuated or broad signals, CLIP-Seq and RNA-Seq. We prove the effectiveness of Pyicos to select for significant signals and show that its accuracy is comparable and sometimes superior to that of methods specifically designed for each particular type of experiment. Pyicos facilitates the analysis of a variety of HTS datatypes through its flexibility and memory efficiency, providing a useful framework for data integration into models of regulatory genomics. Open-source software, with tutorials and protocol files, is available at http://regulatorygenomics.upf.edu/pyicos or as a Galaxy server at http://regulatorygenomics.upf.edu/galaxy. Contact: eduardo.eyras@upf.edu. Supplementary data are available at Bioinformatics online.

  4. EvoGraph: On-The-Fly Efficient Mining of Evolving Graphs on GPU

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sengupta, Dipanjan; Song, Shuaiwen

    With the prevalence of the World Wide Web and social networks, there has been a growing interest in high performance analytics for constantly-evolving dynamic graphs. Modern GPUs provide a massive amount of parallelism for efficient graph processing, but challenges remain due to their lack of support for the near real-time streaming nature of dynamic graphs. Specifically, due to the current high volume and velocity of graph data combined with the complexity of user queries, traditional processing methods that first store the updates and then repeatedly run static graph analytics on a sequence of versions or snapshots are deemed undesirable and computationally infeasible on GPU. We present EvoGraph, a highly efficient and scalable GPU-based dynamic graph analytics framework.

  5. Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

    PubMed

    Caruccio, Nicholas

    2011-01-01

    DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.

  6. The Automated Array Assembly Task of the Low-cost Silicon Solar Array Project, Phase 2

    NASA Technical Reports Server (NTRS)

    Coleman, M. G.; Grenon, L.; Pastirik, E. M.; Pryor, R. A.; Sparks, T. G.

    1978-01-01

    An advanced process sequence for manufacturing high efficiency solar cells and modules in a cost-effective manner is discussed. Emphasis is on process simplicity and minimizing consumed materials. The process sequence incorporates texture etching, plasma processes for damage removal and patterning, ion implantation, low pressure silicon nitride deposition, and plated metal. A reliable module design is presented. Specific process step developments are given. A detailed cost analysis was performed to indicate future areas of fruitful cost reduction effort. Recommendations for advanced investigations are included.

  7. Maximizing mutagenesis with solubilized CRISPR-Cas9 ribonucleoprotein complexes.

    PubMed

    Burger, Alexa; Lindsay, Helen; Felker, Anastasia; Hess, Christopher; Anders, Carolin; Chiavacci, Elena; Zaugg, Jonas; Weber, Lukas M; Catena, Raul; Jinek, Martin; Robinson, Mark D; Mosimann, Christian

    2016-06-01

    CRISPR-Cas9 enables efficient sequence-specific mutagenesis for creating somatic or germline mutants of model organisms. Key constraints in vivo remain the expression and delivery of active Cas9-sgRNA ribonucleoprotein complexes (RNPs) with minimal toxicity, variable mutagenesis efficiencies depending on targeting sequence, and high mutation mosaicism. Here, we apply in vitro assembled, fluorescent Cas9-sgRNA RNPs in solubilizing salt solution to achieve maximal mutagenesis efficiency in zebrafish embryos. MiSeq-based sequence analysis of targeted loci in individual embryos using CrispRVariants, a customized software tool for mutagenesis quantification and visualization, reveals efficient bi-allelic mutagenesis that reaches saturation at several tested gene loci. Such virtually complete mutagenesis exposes loss-of-function phenotypes for candidate genes in somatic mutant embryos for subsequent generation of stable germline mutants. We further show that targeting of non-coding elements in gene regulatory regions using saturating mutagenesis uncovers functional control elements in transgenic reporters and endogenous genes in injected embryos. Our results establish that optimally solubilized, in vitro assembled fluorescent Cas9-sgRNA RNPs provide a reproducible reagent for direct and scalable loss-of-function studies and applications beyond zebrafish experiments that require maximal DNA cutting efficiency in vivo. © 2016. Published by The Company of Biologists Ltd.

  8. In vitro selection of high temperature Zn(2+)-dependent DNAzymes.

    PubMed

    Nelson, Kevin E; Bruesehoff, Peter J; Lu, Yi

    2005-08-01

    In vitro selection of Zn(2+)-dependent RNA-cleaving DNAzymes with activity at 90 °C has yielded a diverse pool of selected sequences. The RNA cleavage efficiency was found in all cases to be specific for Zn(2+) over Pb(2+), Ca(2+), Cd(2+), Co(2+), Hg(2+), and Mg(2+). The Zn(2+)-dependent activity assay of the most active sequence showed that the DNAzyme possesses an apparent Zn(2+)-binding dissociation constant of 234 µM and that its activity increases with increasing temperature from 50 to 90 °C. A fit of the Arrhenius plot data gave E(a) = 15.3 kcal mol(-1). Surprisingly, the selected Zn(2+)-dependent DNAzymes showed only a modest (approximately 3-fold) activity enhancement over the background rate of cleavage of random sequences containing a single embedded ribonucleotide within an otherwise DNA oligonucleotide. The result is attributable to the ability of DNA to sustain cleavage activity at high temperature with minimal secondary structure when Zn(2+) is present. Since this effect is highly specific for Zn(2+), this metal ion may play a special role in molecular evolution of nucleic acids at high temperature.
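
To give a feel for what an activation energy of 15.3 kcal/mol implies, the sketch below uses the standard Arrhenius relation k = A·exp(-Ea/RT) to estimate the ratio of rate constants at the two ends of the reported 50-90 °C range. It assumes a temperature-independent pre-exponential factor (which cancels in the ratio); the function name `arrhenius_ratio` is illustrative and not from the record.

```python
import math

# Gas constant in kcal mol^-1 K^-1, to match the units of the reported Ea.
R_KCAL = 1.987e-3


def arrhenius_ratio(ea_kcal, t_low_c, t_high_c):
    """Return k(T_high) / k(T_low) for an activation energy in kcal/mol.

    Assumes simple Arrhenius behaviour with a constant pre-exponential
    factor, so the factor cancels and only Ea and the temperatures matter.
    """
    t_low = t_low_c + 273.15    # convert Celsius to Kelvin
    t_high = t_high_c + 273.15
    return math.exp(ea_kcal / R_KCAL * (1.0 / t_low - 1.0 / t_high))


if __name__ == "__main__":
    ratio = arrhenius_ratio(15.3, 50.0, 90.0)
    print(f"k(90 C) / k(50 C) is roughly {ratio:.1f}")
```

Under these assumptions the rate constant rises by roughly an order of magnitude across the range; the record itself does not report this ratio.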

  9. T1 weighted fat/water separated PROPELLER acquired with dual bandwidths.

    PubMed

    Rydén, Henric; Berglund, Johan; Norbeck, Ola; Avventi, Enrico; Skare, Stefan

    2018-04-24

    To describe a fat/water separated dual receiver bandwidth (rBW) spin echo PROPELLER sequence that eliminates the dead time associated with single rBW sequences. A nonuniform noise whitening by regularization of the fat/water inverse problem is proposed, to enable dual rBW reconstructions. Bipolar, flyback, and dual spin echo sequences were developed. All sequences acquire two echoes with different rBW without dead time. Chemical shift displacement was corrected by performing the fat/water separation in k-space, prior to gridding. The proposed sequences were compared to fat saturation, and single rBW sequences, in terms of SNR and CNR efficiency, using clinically relevant acquisition parameters. The impact of motion was investigated. Chemical shift correction greatly improved the image quality, especially at high resolution acquired with low rBW, and also improved motion estimates. SNR efficiency of the dual spin echo sequence was up to 20% higher than the single rBW acquisition, while CNR efficiency was 50% higher for the bipolar acquisition. Noise whitening was deemed necessary for all dual rBW acquisitions, rendering high image quality with strong and homogenous fat suppression. Dual rBW sequences eliminate the dead time present in single rBW sequences, which improves SNR efficiency. In combination with the proposed regularization, this enables highly efficient T1-weighted PROPELLER images without chemical shift displacement. © 2018 International Society for Magnetic Resonance in Medicine.

  10. Microbial identification by immunohybridization assay of artificial RNA labels

    NASA Technical Reports Server (NTRS)

    Kourentzi, Katerina D.; Fox, George E.; Willson, Richard C.

    2002-01-01

    Ribosomal RNA (rRNA) and engineered stable artificial RNAs (aRNAs) are frequently used to monitor bacteria in complex ecosystems. In this work, we describe a solid-phase immunocapture hybridization assay that can be used with low molecular weight RNA targets. A biotinylated DNA probe is efficiently hybridized in solution with the target RNA, and the DNA-RNA hybrids are captured on streptavidin-coated plates and quantified using a DNA-RNA heteroduplex-specific antibody conjugated to alkaline phosphatase. The assay was shown to be specific for both 5S rRNA and low molecular weight (LMW) artificial RNAs and highly sensitive, allowing detection of as little as 5.2 ng (0.15 pmol) in the case of 5S rRNA. Target RNAs were readily detected even in the presence of excess nontarget RNA. Detection using DNA probes as small as 17 bases targeting a repetitive artificial RNA sequence in an engineered RNA was more efficient than the detection of a unique sequence.

  11. Development of chromosome-specific markers with high polymorphism for allotetraploid cotton based on genome-wide characterization of simple sequence repeats in diploid cottons (Gossypium arboreum L. and Gossypium raimondii Ulbrich).

    PubMed

    Lu, Cairui; Zou, Changsong; Zhang, Youping; Yu, Daoqian; Cheng, Hailiang; Jiang, Pengfei; Yang, Wencui; Wang, Qiaolian; Feng, Xiaoxu; Prosper, Mtawa Andrew; Guo, Xiaoping; Song, Guoli

    2015-02-06

    Tetraploid cotton contains two sets of homologous chromosomes, the At- and Dt-subgenomes. Consequently, many markers in cotton were mapped to multiple positions during linkage genetic map construction, posing a challenge to anchoring linkage groups and mapping economically-important genes to particular chromosomes. Chromosome-specific markers could solve this problem. Recently, the genomes of two diploid species were sequenced whose progenitors were putative contributors of the At- and Dt-subgenomes to tetraploid cotton. These sequences provide a powerful tool for developing chromosome-specific markers given the high level of synteny among tetraploid and diploid cotton genomes. In this study, simple sequence repeats (SSRs) on each chromosome in the two diploid genomes were characterized. Chromosome-specific SSRs were developed by comparative analysis and proved to distinguish chromosomes. A total of 200,744 and 142,409 SSRs were detected on the 13 chromosomes of Gossypium arboreum L. and Gossypium raimondii Ulbrich, respectively. Chromosome-specific SSRs were obtained by comparing SSR flanking sequences from each chromosome with those from the other 25 chromosomes. The average was 7,996 per chromosome. To confirm their chromosome specificity, these SSRs were used to distinguish two homologous chromosomes in tetraploid cotton through linkage group construction. The chromosome-specific SSRs and previously-reported chromosome markers were grouped together, and no marker mapped to another homologous chromosome, proving that the chromosome-specific SSRs were unique and could distinguish homologous chromosomes in tetraploid cotton. Because longer dinucleotide AT-rich repeats were the most polymorphic in previous reports, the SSRs on each chromosome were sorted by motif type and repeat length for convenient selection. The primer sequences of all chromosome-specific SSRs were also made publicly available. 
    Chromosome-specific SSRs are efficient tools for chromosome identification by anchoring linkage groups to particular chromosomes during genetic mapping and are especially useful in mapping of qualitative-trait genes or quantitative trait loci with just a few markers. The SSRs reported here will facilitate a number of genetic and genomic studies in cotton, including construction of high-density genetic maps, positional gene cloning, fingerprinting, and genetic diversity and comparative evolutionary analyses among Gossypium species.
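
The first step described in this record, detecting simple sequence repeats on a chromosome, can be sketched with a regular expression. This is a minimal illustration and not the authors' pipeline: the function name `find_ssrs`, the motif length range (2-6 bp) and the minimum of five repeats are assumptions, only perfect repeats are reported, and the flanking-sequence comparison used to establish chromosome specificity is not shown.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return a list of (start, motif, repeat_count) for perfect tandem repeats.

    The shortest motif is tried first at each position, so a run of "AT"
    is reported with motif "AT" rather than "ATAT".
    """
    seq = seq.upper()
    # Group 1 is the motif; the backreference requires further tandem copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as "AAAA...", which would otherwise
            # be reported as a dinucleotide repeat of "AA".
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    example = "GGC" + "AT" * 6 + "CCG"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Sorting the resulting tuples by motif and repeat count would mirror the record's remark that SSRs were sorted by motif type and repeat length for convenient selection.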

  12. Evaluation and rational design of guide RNAs for efficient CRISPR/Cas9-mediated mutagenesis in Ciona

    PubMed Central

    Gandhi, Shashank; Haeussler, Maximilian; Razy-Krajka, Florian; Christiaen, Lionel; Stolfi, Alberto

    2017-01-01

    The CRISPR/Cas9 system has emerged as an important tool for various genome engineering applications. A current obstacle to high throughput applications of CRISPR/Cas9 is the imprecise prediction of highly active single guide RNAs (sgRNAs). We previously implemented the CRISPR/Cas9 system to induce tissue-specific mutations in the tunicate Ciona. In the present study, we designed and tested 83 single guide RNA (sgRNA) vectors targeting 23 genes expressed in the cardiopharyngeal progenitors and surrounding tissues of Ciona embryo. Using high-throughput sequencing of mutagenized alleles, we identified guide sequences that correlate with sgRNA mutagenesis activity and used this information for the rational design of all possible sgRNAs targeting the Ciona transcriptome. We also describe a one-step cloning-free protocol for the assembly of sgRNA expression cassettes. These cassettes can be directly electroporated as unpurified PCR products into Ciona embryos for sgRNA expression in vivo, resulting in high frequency of CRISPR/Cas9-mediated mutagenesis in somatic cells of electroporated embryos. We found a strong correlation between the frequency of an Ebf loss-of-function phenotype and the mutagenesis efficacies of individual Ebf-targeting sgRNAs tested using this method. We anticipate that our approach can be scaled up to systematically design and deliver highly efficient sgRNAs for the tissue-specific investigation of gene functions in Ciona. PMID:28341547

  13. Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

    PubMed Central

    Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter

    2017-01-01

    The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
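
The core idea in this record, keeping k-mers that are common among target genomes and absent from non-targets, can be illustrated with plain set operations. The sketch below is not Neptune's implementation: it uses fixed fractional thresholds in place of the probabilistic models, ignores k-mer mismatches and reverse complements, and the names `kmers`, `signature_kmers`, `min_in` and `max_out` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k, min_in=1.0, max_out=0.0):
    """Select k-mers present in at least min_in of the targets and in at
    most max_out of the non-targets (both given as fractions)."""
    target_sets = [kmers(s, k) for s in targets]
    other_sets = [kmers(s, k) for s in non_targets]
    selected = set()
    for kmer in set().union(*target_sets):
        frac_in = sum(kmer in s for s in target_sets) / len(target_sets)
        if other_sets:
            frac_out = sum(kmer in s for s in other_sets) / len(other_sets)
        else:
            frac_out = 0.0  # no non-targets supplied: nothing to exclude
        if frac_in >= min_in and frac_out <= max_out:
            selected.add(kmer)
    return selected


if __name__ == "__main__":
    found = signature_kmers(["ACGTACGTTT", "GGACGTACGA"], ["TTTTTTTTTT"], k=4)
    print(sorted(found))
```

Relaxing `min_in` below 1.0 or raising `max_out` above 0.0 loosely corresponds to the record's "sufficiently common" and "sufficiently absent" criteria, though the actual tool derives those cut-offs statistically.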

  14. Development and application of triple antibody sandwich enzyme-linked immunosorbent assays for begomovirus detection using monoclonal antibodies against Tomato yellow leaf curl Thailand virus.

    PubMed

    Seepiban, Channarong; Charoenvilaisiri, Saengsoon; Warin, Nuchnard; Bhunchoth, Anjana; Phironrit, Namthip; Phuangrat, Bencharong; Chatchawankanphanich, Orawan; Attathom, Supat; Gajanandana, Oraprapai

    2017-05-30

    Tomato yellow leaf curl Thailand virus, TYLCTHV, is a begomovirus that causes severe losses of tomato crops in Thailand as well as several countries in Southeast and East Asia. The development of monoclonal antibodies (MAbs) and serological methods for detecting TYLCTHV is essential for epidemiological studies and screening for virus-resistant cultivars. The recombinant coat protein (CP) of TYLCTHV was expressed in Escherichia coli and used to generate MAbs against TYLCTHV through hybridoma technology. The MAbs were characterized and optimized to develop triple antibody sandwich enzyme-linked immunosorbent assays (TAS-ELISAs) for begomovirus detection. The efficiency of TAS-ELISAs for begomovirus detection was evaluated with tomato, pepper, eggplant, okra and cucurbit plants collected from several provinces in Thailand. Molecular identification of begomoviruses in these samples was also performed through PCR and DNA sequence analysis of the CP gene. Two MAbs (M1 and D2) were generated and used to develop TAS-ELISAs for begomovirus detection. The results of begomovirus detection in 147 field samples indicated that MAb M1 reacted with 2 begomovirus species, TYLCTHV and Tobacco leaf curl Yunnan virus (TbLCYnV), whereas MAb D2 reacted with 4 begomovirus species, TYLCTHV, TbLCYnV, Tomato leaf curl New Delhi virus (ToLCNDV) and Squash leaf curl China virus (SLCCNV). Phylogenetic analyses of CP amino acid sequences from these begomoviruses revealed that the CP sequences of begomoviruses recognized by the narrow-spectrum MAb M1 were highly conserved, sharing 93% identity with each other but only 72-81% identity with MAb M1-negative begomoviruses. The CP sequences of begomoviruses recognized by the broad-spectrum MAb D2 demonstrated a wider range of amino acid sequence identity, sharing 78-96% identity with each other and 72-91% identity with those that were not detected by MAb D2. 
    TAS-ELISAs using the narrow-specificity MAb M1 proved highly efficient for the detection of TYLCTHV and TbLCYnV, whereas TAS-ELISAs using the broad-specificity MAb D2 were highly efficient for the detection of TYLCTHV, TbLCYnV, ToLCNDV and SLCCNV. Both newly developed assays allow for sensitive, inexpensive, high-throughput detection of begomoviruses in field plant samples, as well as screening for virus-resistant cultivars.

  15. Influence of sequence mismatches on the specificity of recombinase polymerase amplification technology.

    PubMed

    Daher, Rana K; Stewart, Gale; Boissinot, Maurice; Boudreau, Dominique K; Bergeron, Michel G

    2015-04-01

    Recombinase polymerase amplification (RPA) technology relies on three major proteins, recombinase proteins, single-strand binding proteins, and polymerases, to specifically amplify nucleic acid sequences in an isothermal format. The performance of RPA with respect to sequence mismatches of closely-related non-target molecules is not well documented and the influence of the number and distribution of mismatches in DNA sequences on RPA amplification reaction is not well understood. We investigated the specificity of RPA by testing closely-related species bearing naturally occurring mismatches for the tuf gene sequence of Pseudomonas aeruginosa and/or Mycobacterium tuberculosis and for the cfb gene sequence of Streptococcus agalactiae. In addition, the impact of the number and distribution of mismatches on RPA efficiency was assessed by synthetically generating 14 types of mismatched forward primers for detecting five bacterial species of high diagnostic relevance such as Clostridium difficile, Staphylococcus aureus, S. agalactiae, P. aeruginosa, and M. tuberculosis as well as Bacillus atrophaeus subsp. globigii for which we use the spores as internal control in diagnostic assays. A total of 87 mismatched primers were tested in this study. We observed that target specific RPA primers with mismatches (n > 1) at their 3' extremity hampered RPA reaction. In addition, 3 mismatches covering both extremities and the center of the primer sequence negatively affected RPA yield. We demonstrated that the specificity of RPA was multifactorial. Therefore its application in clinical settings must be selected and validated a priori. We recommend that the selection of a target gene must consider the presence of closely-related non-target genes. It is advisable to choose target regions with a high number of mismatches (≥36%, relative to the size of amplicon) with respect to closely-related species and the best case scenario would be by choosing a unique target gene.
    Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Polymerase Spiral Reaction (PSR): A novel isothermal nucleic acid amplification method.

    PubMed

    Liu, Wei; Dong, Derong; Yang, Zhan; Zou, Dayang; Chen, Zeliang; Yuan, Jing; Huang, Liuyu

    2015-07-29

    In this study, we report a novel isothermal nucleic acid amplification method, termed Polymerase Spiral Reaction (PSR), that requires only one pair of primers and one enzyme and offers high specificity, efficiency, and rapidity under isothermal conditions. The recombinant plasmid of blaNDM-1 was imported to Escherichia coli BL21, and selected as the microbial target. The PSR method employs a Bst DNA polymerase and a pair of primers designed to target the blaNDM-1 gene sequence. The forward and reverse Tab primer sequences are reverse to each other at their 5' end (Nr and N), whereas their 3' end sequences are complementary to their respective target nucleic acid sequences. The PSR method was performed at a constant temperature of 61 °C-65 °C, yielding a complicated spiral structure. The PSR assay was monitored continuously in a real-time turbidimeter instrument or visually detected with the aid of a fluorescent dye (SYBR Green I), and could be finished within 1 h with a high accumulation of 10^9 copies of the target and a fine sensitivity of 6 CFU per reaction. Clinical evaluation was also conducted using PSR, showing high specificity of this method. The PSR technique provides a convenient and cost-effective alternative for clinical screening, on-site diagnosis and primary quarantine purposes.

  17. Optimized guide RNA structure for genome editing via Cas9

    PubMed Central

    Xu, Jianyong; Lian, Wei; Jia, Yuning; Li, Lingyun; Huang, Zhong

    2017-01-01

    The genome editing tool Cas9-gRNA (guide RNA) has been successfully applied in different cell types and organisms with high efficiency. However, more efforts need to be made to enhance both efficiency and specificity. In the current study, we optimized the guide RNA structure of the Streptococcus pyogenes CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated) system to improve its genome editing efficiency. Compared with the original functional structure of guide RNA, which is composed of crRNA and tracrRNA, the widely used chimeric gRNA has shorter crRNA and tracrRNA sequences. The deleted RNA sequence could form an extra loop structure, which might enhance the stability of the guide RNA structure and subsequently the genome editing efficiency. Thus the genome editing efficiency of different forms of guide RNA was tested. We found that the chimeric structure of gRNA with the original full length of crRNA and tracrRNA showed higher genome editing efficiency than the conventional chimeric structure or other types of gRNA we tested. Our data therefore uncover a new type of gRNA structure with higher genome editing efficiency. PMID:29212218

  18. Sequence and structure determinants of Drosophila Hsp70 mRNA translation: 5'UTR secondary structure specifically inhibits heat shock protein mRNA translation.

    PubMed Central

    Hess, M A; Duncan, R F

    1996-01-01

    Preferential translation of Drosophila heat shock protein 70 (Hsp70) mRNA requires only the 5'-untranslated region (5'-UTR). The sequence of this region suggests that it has relatively little secondary structure, which may facilitate efficient protein synthesis initiation. To determine whether minimal 5'-UTR secondary structure is required for preferential translation during heat shock, the effect of introducing stem-loops into the Hsp70 mRNA 5'-UTR was measured. Stem-loops of -11 kcal/mol abolished translation during heat shock, but did not reduce translation in non-heat shocked cells. A -22 kcal/mol stem-loop was required to comparably inhibit translation during growth at normal temperatures. To investigate whether specific sequence elements are also required for efficient preferential translation, deletion and mutation analyses were conducted in a truncated Hsp70 5'-UTR containing only the cap-proximal and AUG-proximal segments. Linker-scanner mutations in the cap-proximal segment (+1 to +37) did not impair translation. Re-ordering the segments reduced mRNA translational efficiency by 50%. Deleting the AUG-proximal segment severely inhibited translation. A 5'-extension of the full-length leader specifically impaired heat shock translation. These results indicate that heat shock reduces the capacity to unwind 5'-UTR secondary structure, allowing only mRNAs with minimal 5'-UTR secondary structure to be efficiently translated. A function for specific sequences is also suggested. PMID:8710519

  19. Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing.

    PubMed

    Pope, Bernard J; Hammet, Fleur; Nguyen-Dumont, Tu; Park, Daniel J

    2018-01-01

    Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015)), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.

  20. Socrates: identification of genomic rearrangements in tumour genomes by re-aligning soft clipped reads

    PubMed Central

    Schröder, Jan; Hsu, Arthur; Boyle, Samantha E.; Macintyre, Geoff; Cmero, Marek; Tothill, Richard W.; Johnstone, Ricky W.; Shackleton, Mark; Papenfuss, Anthony T.

    2014-01-01

    Motivation: Methods for detecting somatic genome rearrangements in tumours using next-generation sequencing are vital in cancer genomics. Available algorithms use one or more sources of evidence, such as read depth, paired-end reads or split reads to predict structural variants. However, the problem remains challenging due to the significant computational burden and high false-positive or false-negative rates. Results: In this article, we present Socrates (SOft Clip re-alignment To idEntify Structural variants), a highly efficient and effective method for detecting genomic rearrangements in tumours that uses only split-read data. Socrates has single-nucleotide resolution, identifies micro-homologies and untemplated sequence at break points, has high sensitivity and high specificity and takes advantage of parallelism for efficient use of resources. We demonstrate using simulated and real data that Socrates performs well compared with a number of existing structural variant detection tools. Availability and implementation: Socrates is released as open source and available from http://bioinf.wehi.edu.au/socrates. Contact: papenfuss@wehi.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24389656
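
Soft-clipped reads, the input this record's method re-aligns, are marked by `S` operations in the CIGAR string of a SAM/BAM alignment. The sketch below only shows how the clipped portion of a read could be pulled out from its CIGAR string and sequence; it is not Socrates' code, the function name `soft_clips` is hypothetical, and the re-alignment and breakpoint clustering steps are not covered.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, read_seq):
    """Return a list of (side, clipped_sequence) for soft-clipped read ends.

    Hard clips (H) are dropped first because, unlike soft clips, their
    bases are not stored in the read sequence.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    core = [(length, op) for length, op in ops if op != "H"]
    clips = []
    if core and core[0][1] == "S":
        clips.append(("left", read_seq[:core[0][0]]))
    if len(core) > 1 and core[-1][1] == "S":
        clips.append(("right", read_seq[-core[-1][0]:]))
    return clips


if __name__ == "__main__":
    print(soft_clips("5S20M", "A" * 5 + "C" * 20))
    print(soft_clips("20M3S", "C" * 20 + "GGG"))
```

In a real workflow the clipped sequences would then be aligned back to the reference to locate the partner breakpoint, which is the step the record describes.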

  1. Improved PCR assay for the specific detection and quantitation of Escherichia coli serotype O157 in water.

    PubMed

    Cho, Min Seok; Joh, Kiseong; Ahn, Tae-Young; Park, Dong Suk

    2014-09-01

    Escherichia coli serotype O157 is still a major global healthcare problem. However, only limited information is now available on the molecular and serological detection of pathogenic bacteria. Therefore, the development of appropriate strategies for their rapid identification and monitoring is still needed. In general, the sequence analysis based on stx, slt, eae, hlyA, rfb, and fliCh7 genes is widely employed for the identification of E. coli serotype O157; but there have been critical defects in the diagnosis and identification of E. coli serotype O157, in that they are also present in other E. coli serogroups. In this study, NCBI-BLAST searches using the nucleotide sequences of the putative regulatory protein gene from E. coli O157:H7 str. Sakai found sequence differences at the serotype level. Specific primers from the putative regulatory protein gene were designed and investigated for their sensitivity and specificity for detecting the pathogen in environmental water samples. The specificity of the primer set was evaluated using genomic DNA from 8 isolates of E. coli serotype O157 and 32 other reference strains. In addition, the sensitivity and specificity of this assay were confirmed by successful identification of E. coli serotype O157 in environmental water samples. In conclusion, this study showed that the newly developed quantitative serotype-specific PCR method is a highly specific and efficient tool for the surveillance and rapid detection of high-risk E. coli serotype O157.

  2. A split recognition mode combined with cascade signal amplification strategy for highly specific, sensitive detection of microRNA.

    PubMed

    Wang, Rui; Wang, Lei; Zhao, Haiyan; Jiang, Wei

    2016-12-15

    MicroRNAs (miRNAs) are vital for many biological processes and have been regarded as cancer biomarkers. Specific and sensitive detection of miRNAs is essential for cancer diagnosis and therapy. Herein, a split recognition mode combined with cascade signal amplification strategy is developed for highly specific and sensitive detection of miRNA. The split recognition mode possesses two specific recognition processes, which are based on toehold-mediated strand displacement reaction (TSDR) and direct hybridization reaction. Two recognition probes, hairpin probe (HP) with overhanging toehold domain and assistant probe (AP), are specially designed. Firstly, the toehold domain of HP and AP recognize part of miRNA simultaneously, accompanied with TSDR to unfold the HP and form the stable DNA Y-shaped junction structure (YJS). Then, the AP in YJS can further act as primer to initiate strand displacement amplification, releasing numerous trigger sequences. Finally, the trigger sequences hybridize with padlock DNA to initiate circular rolling circle amplification and generate enhanced fluorescence responses. In this strategy, the dual recognition effect of split recognition mode guarantees the excellent selectivity to discriminate let-7b from high-homology sequences. Furthermore, the high amplification efficiency of cascade signal amplification guarantees a high sensitivity with a detection limit of 3.2 pM, and the concentration of let-7b in a total RNA sample extracted from HeLa cells is determined. These results indicate our strategy will be a promising miRNA detection strategy in clinical diagnosis and disease treatment. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  4. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  5. Selection of mRNA 5'-untranslated region sequence with high translation efficiency through ribosome display

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mie, Masayasu; Shimizu, Shun; Takahashi, Fumio

    2008-08-15

    The 5'-untranslated region (5'-UTR) of mRNAs functions as a translation enhancer, promoting translation efficiency. Many in vitro translation systems exhibit a reduced efficiency in protein translation due to decreased translation initiation. The use of a 5'-UTR sequence with high translation efficiency greatly enhances protein production in these systems. In this study, we have developed an in vitro selection system that favors 5'-UTRs with high translation efficiency using a ribosome display technique. A 5'-UTR random library, comprised of 5'-UTRs tagged with a His-tag and Renilla luciferase (R-luc) fusion, was in vitro translated in rabbit reticulocytes. By limiting the translation period, only mRNAs with high translation efficiency were translated. During translation, mRNA, ribosome and translated R-luc with His-tag formed ternary complexes. They were collected with translated His-tag using Ni-particles. Extracted mRNA from ternary complex was amplified using RT-PCR and sequenced. Finally, 5'-UTR with high translation efficiency was obtained from random 5'-UTR library.

  6. Recovery of soil unicellular eukaryotes: an efficiency and activity analysis on the single cell level.

    PubMed

    Lentendu, Guillaume; Hübschmann, Thomas; Müller, Susann; Dunker, Susanne; Buscot, François; Wilhelm, Christian

    2013-12-01

    Eukaryotic unicellular organisms are an important part of the soil microbial community, but they are often neglected in soil functional microbial diversity analysis, principally due to the absence of specific investigation methods in the special soil environment. In this study we used a method based on high-density centrifugation to specifically isolate intact algal and yeast cells, with the aim to analyze them with flow cytometry and sort them for further molecular analysis such as deep sequencing. Recovery efficiency was tested at low abundance levels that fit those in natural environments (10^4 to 10^6 cells per g soil). Five algae and five yeast morphospecies isolated from soil were used for the testing. Recovery efficiency was between 1.5 and 43.16% and between 2 and 30.2%, respectively, and was dependent on soil type for three of the algae. Control treatments without soil showed that the majority of cells were lost due to the method itself (58% and 55.8% respectively). However, the cell extraction technique did not much compromise cell vitality because a fluorescein di-acetate assay indicated high viability percentages (73.3% and 97.2% of cells, respectively). The low abundant algae and yeast morphospecies recovered from soil were cytometrically analyzed and sorted. Subsequently, their DNA was isolated and amplified using specific primers. The developed workflow enables isolation and enrichment of intact autotrophic and heterotrophic soil unicellular eukaryotes from natural environments for subsequent application of deep sequencing technologies. Copyright © 2013 Elsevier B.V. All rights reserved.

  7. Discovery and characterization of a highly efficient enantioselective mandelonitrile hydrolase from Burkholderia cenocepacia J2315 by phylogeny-based enzymatic substrate specificity prediction.

    PubMed

    Wang, Hualei; Sun, Huihui; Wei, Dongzhi

    2013-02-18

    A nitrilase-mediated pathway has significant advantages in the production of optically pure (R)-(-)-mandelic acid. However, unwanted byproduct, low enantioselectivity, and specific activity reduce its value in practical applications. An ideal nitrilase that can efficiently hydrolyze mandelonitrile to optically pure (R)-(-)-mandelic acid without the unwanted byproduct is needed. A novel nitrilase (BCJ2315) was discovered from Burkholderia cenocepacia J2315 through phylogeny-based enzymatic substrate specificity prediction (PESSP). This nitrilase is a mandelonitrile hydrolase that could efficiently hydrolyze mandelonitrile to (R)-(-)-mandelic acid, with a high enantiomeric excess of 98.4%. No byproduct was observed in this hydrolysis process. BCJ2315 showed the highest identity of 71% compared with other nitrilases in the amino acid sequence. BCJ2315 possessed the highest activity toward mandelonitrile and took mandelonitrile as the optimal substrate based on the analysis of substrate specificity. The kinetic parameters Vmax, Km, kcat, and kcat/Km toward mandelonitrile were 45.4 μmol/min/mg, 0.14 mM, 15.4 s^-1, and 1.1×10^5 M^-1 s^-1, respectively. The recombinant Escherichia coli M15/BCJ2315 had a strong substrate tolerance and could completely hydrolyze mandelonitrile (100 mM) with a small amount of wet cells (10 mg/ml) within 1 h. PESSP is an efficient method for discovering an ideal mandelonitrile hydrolase. BCJ2315 has high affinity and catalytic efficiency toward mandelonitrile. This nitrilase has great advantages in the production of optically pure (R)-(-)-mandelic acid because of its high activity and enantioselectivity, strong substrate tolerance, and having no unwanted byproduct. Thus, BCJ2315 has great potential in the practical production of optically pure (R)-(-)-mandelic acid in the industry.
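
    The kinetic constants quoted in this record can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes simple Michaelis-Menten kinetics (the abstract does not name the model), uses only the values given above, and the function names are hypothetical.

```python
# Values quoted in the abstract for BCJ2315 acting on mandelonitrile.
KM_MILLIMOLAR = 0.14     # Km, mM
KCAT_PER_SECOND = 15.4   # kcat, s^-1
VMAX = 45.4              # Vmax, umol/min/mg


def catalytic_efficiency(kcat_per_s: float, km_mM: float) -> float:
    """Return kcat/Km in M^-1 s^-1 (Km is converted from mM to M)."""
    return kcat_per_s / (km_mM * 1e-3)


def michaelis_menten_rate(substrate_mM: float,
                          vmax: float = VMAX,
                          km_mM: float = KM_MILLIMOLAR) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]); assumes Michaelis-Menten kinetics."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


if __name__ == "__main__":
    efficiency = catalytic_efficiency(KCAT_PER_SECOND, KM_MILLIMOLAR)
    print(f"kcat/Km = {efficiency:.2e} M^-1 s^-1")
    print(f"rate at [S] = Km: {michaelis_menten_rate(KM_MILLIMOLAR):.1f} umol/min/mg")
```

    Under these assumptions the computed kcat/Km agrees with the reported 1.1×10^5 M^-1 s^-1, and the rate at [S] = Km is half of Vmax, as expected for this model.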

  8. A branch-migration based fluorescent probe for straightforward, sensitive and specific discrimination of DNA mutations

    PubMed Central

    Xiao, Xianjin; Wu, Tongbo; Xu, Lei; Chen, Wei

    2017-01-01

    Abstract Genetic mutations are important biomarkers for cancer diagnostics and surveillance. Preferably, the methods for mutation detection should be straightforward, highly specific and sensitive to low-level mutations within various sequence contexts, fast and applicable at room-temperature. Though some of the currently available methods have shown very encouraging results, their discrimination efficiency is still very low. Herein, we demonstrate a branch-migration based fluorescent probe (BM probe) which is able to identify the presence of known or unknown single-base variations at abundances down to 0.3%-1% within 5 min, even in highly GC-rich sequence regions. The discrimination factors between the perfect-match target and single-base mismatched target are determined to be 89–311 by measurement of their respective branch-migration products via polymerase elongation reactions. The BM probe not only enabled sensitive detection of two types of EGFR-associated point mutations located in GC-rich regions, but also successfully identified the BRAF V600E mutation in the serum from a thyroid cancer patient which could not be detected by the conventional sequencing method. The new method would be an ideal choice for high-throughput in vitro diagnostics and precise clinical treatment. PMID:28201758

  9. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

    PubMed Central

    2010-01-01

    Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148

  10. Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

    PubMed

    Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

    1993-02-01

    An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls like cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all of the investigated 10 species of the family of Bovidae and giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used with which sequences differing in at least one single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions serve also for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.

  11. Efficient generation of cavitation bubbles and reactive oxygen species using triggered high-intensity focused ultrasound sequence for sonodynamic treatment

    NASA Astrophysics Data System (ADS)

    Yasuda, Jun; Yoshizawa, Shin; Umemura, Shin-ichiro

    2016-07-01

    Sonodynamic treatment is a method of treating cancer using reactive oxygen species (ROS) generated by cavitation bubbles in collaboration with a sonosensitizer at a target tissue. In this treatment method, both localized ROS generation and ROS generation with high efficiency are important. In this study, a triggered high-intensity focused ultrasound (HIFU) sequence, which consists of a short, extremely high intensity pulse immediately followed by a long, moderate-intensity burst, was employed for the efficient generation of ROS. In experiments, a solution sealed in a chamber was exposed to a triggered HIFU sequence. Then, the distribution of generated ROS was observed by the luminol reaction, and the amount of generated ROS was quantified using the KI method. As a result, localized ROS generation was demonstrated by light emission from the luminol reaction. Moreover, both the KI method and the luminol reaction emission demonstrated that the triggered HIFU sequence has a higher efficiency of ROS generation.

  12. Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization

    PubMed Central

    Girard, Laurie D.; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G.

    2014-01-01

    The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization are used serially to take advantage of the high sensitivity and specificity of the former combined with high multiplexing capacities of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have a high complexity cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, microarray hybridizations have a sequence dependent efficiency not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique “target sequence-independent” 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets. PMID:25489607

  13. A tag-based approach for high-throughput analysis of CCWGG methylation.

    PubMed

    Denisova, Oksana V; Chernov, Andrei V; Koledachkina, Tatyana Y; Matvienko, Nicholas I

    2007-10-15

    Non-CpG methylation occurring in the context of CNG sequences is found in plants at a large number of genomic loci. However, there is still little information available about non-CpG methylation in mammals. Efficient methods that would allow detection of scarcely localized methylated sites in small quantities of DNA are required to elucidate the biological role of non-CpG methylation in both plants and animals. In this study, we tested a new whole genome approach to identify sites of CCWGG methylation (W is A or T), a particular case of CNG methylation, in genomic DNA. This technique is based on digestion of DNAs with methylation-sensitive restriction endonucleases EcoRII-C and AjnI. Short DNAs flanking methylated CCWGG sites (tags) are selectively purified and assembled in tandem arrays of up to nine tags. This allows high-throughput sequencing of tags, identification of flanking regions, and their exact positions in the genome. In this study, we tested specificity and efficiency of the approach.
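
    The CCWGG motif described in this record (W is A or T) is easy to locate in a sequence. The following is a minimal sketch of the motif definition only, not of the authors' tag-based protocol; the function name and the example sequence are hypothetical.

```python
import re

# CCWGG with W = A or T, as defined in the abstract.
CCWGG_PATTERN = re.compile(r"CC[AT]GG")


def find_ccwgg_sites(sequence: str) -> list[int]:
    """Return the 0-based start position of every CCWGG site in the sequence."""
    return [match.start() for match in CCWGG_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence: CCAGG at 2, CCTGG at 9, CCGGG (not a match) at 15.
    print(find_ccwgg_sites("AACCAGGTTCCTGGACCGGG"))
```

    Because the reverse complement of CCAGG is CCTGG, scanning one strand is enough to find the sites on both strands.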

  14. Kernel based machine learning algorithm for the efficient prediction of type III polyketide synthase family of proteins.

    PubMed

    Mallika, V; Sivakumar, K C; Jaichand, S; Soniya, E V

    2010-07-13

    Type III polyketide synthases (PKS) are a family of proteins considered to have significant roles in the biosynthesis of various polyketides in plants, fungi and bacteria. As these proteins show positive effects on human health, research on this protein family is increasing. Developing a tool to estimate the probability that a sequence is a type III polyketide synthase will reduce the time and manpower required. In this approach, we have designed and implemented PKSIIIpred, a high performance prediction server for type III PKS in which the classifier is a Support Vector Machine (SVM). Based on the limited training dataset, the tool efficiently predicts the type III PKS superfamily of proteins with high sensitivity and specificity. PKSIIIpred is available at http://type3pks.in/prediction/. We expect that this tool may serve as a useful resource for type III PKS researchers. Work is currently in progress to further improve prediction accuracy by including more sequence features in the training dataset.

  15. Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data

    PubMed Central

    Althammer, Sonja; González-Vallinas, Juan; Ballaré, Cecilia; Beato, Miguel; Eyras, Eduardo

    2011-01-01

    Motivation: High-throughput sequencing (HTS) has revolutionized gene regulation studies and is now fundamental for the detection of protein–DNA and protein–RNA binding, as well as for measuring RNA expression. With increasing variety and sequencing depth of HTS datasets, the need for more flexible and memory-efficient tools to analyse them is growing. Results: We describe Pyicos, a powerful toolkit for the analysis of mapped reads from diverse HTS experiments: ChIP-Seq, either punctuated or broad signals, CLIP-Seq and RNA-Seq. We prove the effectiveness of Pyicos to select for significant signals and show that its accuracy is comparable and sometimes superior to that of methods specifically designed for each particular type of experiment. Pyicos facilitates the analysis of a variety of HTS datatypes through its flexibility and memory efficiency, providing a useful framework for data integration into models of regulatory genomics. Availability: Open-source software, with tutorials and protocol files, is available at http://regulatorygenomics.upf.edu/pyicos or as a Galaxy server at http://regulatorygenomics.upf.edu/galaxy Contact: eduardo.eyras@upf.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21994224

  16. Ruminal Bacterial Community Composition in Dairy Cows Is Dynamic over the Course of Two Lactations and Correlates with Feed Efficiency

    PubMed Central

    Jewell, Kelsea A.; McCormick, Caroline A.; Odt, Christine L.; Weimer, Paul J.

    2015-01-01

    Fourteen Holstein cows of similar ages were monitored through their first two lactation cycles, during which ruminal solids and liquids, milk samples, production data, and feed consumption data were collected for each cow during early (76 to 82 days in milk [DIM]), middle (151 to 157 DIM), and late (251 to 257 DIM) lactation periods. The bacterial community of each ruminal sample was determined by sequencing the region from V6 to V8 of the 16S rRNA gene using 454 pyrosequencing. Gross feed efficiency (GFE) for each cow was calculated by dividing her energy-corrected milk by dry matter intake (ECM/DMI) for each period of both lactation cycles. Four pairs of cows were identified that differed in milk production efficiency, as defined by residual feed intake (RFI), at the same level of ECM production. The most abundant phyla detected for all cows were Bacteroidetes (49.42%), Firmicutes (39.32%), Proteobacteria (5.67%), and Tenericutes (2.17%), and the most abundant genera included Prevotella (40.15%), Butyrivibrio (2.38%), Ruminococcus (2.35%), Coprococcus (2.29%), and Succiniclasticum (2.28%). The bacterial microbiota between the first and second lactation cycles were highly similar, but with a significant correlation between total community composition by ruminal phase and specific bacteria whose relative sequence abundances displayed significant positive or negative correlation with GFE or RFI. These data suggest that the ruminal bacterial community is dynamic in terms of membership and diversity and that specific members are associated with high and low milk production efficiency over two lactation cycles. PMID:25934629
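
    The gross feed efficiency defined in this record is a simple ratio (ECM/DMI). A minimal sketch of that calculation follows; the function name and the example values are hypothetical and are not taken from the study.

```python
def gross_feed_efficiency(ecm: float, dmi: float) -> float:
    """GFE = energy-corrected milk / dry matter intake (both in the same units)."""
    if dmi <= 0:
        raise ValueError("dry matter intake must be positive")
    return ecm / dmi


if __name__ == "__main__":
    # Hypothetical cow: 36 kg/day ECM on 24 kg/day DMI.
    print(gross_feed_efficiency(36.0, 24.0))
```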

  17. Engineering of a target site-specific recombinase by a combined evolution- and structure-guided approach

    PubMed Central

    Abi-Ghanem, Josephine; Chusainow, Janet; Karimova, Madina; Spiegel, Christopher; Hofmann-Sieber, Helga; Hauber, Joachim; Buchholz, Frank; Pisabarro, M. Teresa

    2013-01-01

    Site-specific recombinases (SSRs) can perform DNA rearrangements, including deletions, inversions and translocations when their naive target sequences are placed strategically into the genome of an organism. Hence, in order to employ SSRs in heterologous hosts, their target sites have to be introduced into the genome of an organism before the enzyme can be practically employed. Engineered SSRs hold great promise for biotechnology and advanced biomedical applications, as they promise to extend the usefulness of SSRs to allow efficient and specific recombination of pre-existing, natural genomic sequences. However, the generation of enzymes with desired properties remains challenging. Here, we use substrate-linked directed evolution in combination with molecular modeling to rationally engineer an efficient and specific recombinase (sTre) that readily and specifically recombines a sequence present in the HIV-1 genome. We elucidate the role of key residues implicated in the molecular recognition mechanism and we present a rationale for sTre’s enhanced specificity. Combining evolutionary and rational approaches should help in accelerating the generation of enzymes with desired properties for use in biotechnology and biomedicine. PMID:23275541

  18. Efficient Processing of the Immunodominant, HLA-A*0201-Restricted Human Immunodeficiency Virus Type 1 Cytotoxic T-Lymphocyte Epitope despite Multiple Variations in the Epitope Flanking Sequences

    PubMed Central

    Brander, Christian; Yang, Otto O.; Jones, Norman G.; Lee, Yun; Goulder, Philip; Johnson, R. Paul; Trocha, Alicja; Colbert, David; Hay, Christine; Buchbinder, Susan; Bergmann, Cornelia C.; Zweerink, Hans J.; Wolinsky, Steven; Blattner, William A.; Kalams, Spyros A.; Walker, Bruce D.

    1999-01-01

    Immune escape from cytotoxic T-lymphocyte (CTL) responses has been shown to occur not only by changes within the targeted epitope but also by changes in the flanking sequences which interfere with the processing of the immunogenic peptide. However, the frequency of such an escape mechanism has not been determined. To investigate whether naturally occurring variations in the flanking sequences of an immunodominant human immunodeficiency virus type 1 (HIV-1) Gag CTL epitope prevent antigen processing, cells infected with HIV-1 or vaccinia virus constructs encoding different patient-derived Gag sequences were tested for recognition by HLA-A*0201-restricted, p17-specific CTL. We found that the immunodominant p17 epitope (SL9) and its variants were efficiently processed from minigene expressing vectors and from six HIV-1 Gag variants expressed by recombinant vaccinia virus constructs. Furthermore, SL9-specific CTL clones derived from multiple donors efficiently inhibited virus replication when added to HLA-A*0201-bearing cells infected with primary or laboratory-adapted strains of virus, despite the variability in the SL9 flanking sequences. These data suggest that escape from this immunodominant CTL response is not frequently accomplished by changes in the epitope flanking sequences. PMID:10559335

  19. Evaluating multiplexed next-generation sequencing as a method in palynology for mixed pollen samples.

    PubMed

    Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I

    2015-03-01

    The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  20. Evaluation and rational design of guide RNAs for efficient CRISPR/Cas9-mediated mutagenesis in Ciona.

    PubMed

    Gandhi, Shashank; Haeussler, Maximilian; Razy-Krajka, Florian; Christiaen, Lionel; Stolfi, Alberto

    2017-05-01

    The CRISPR/Cas9 system has emerged as an important tool for various genome engineering applications. A current obstacle to high throughput applications of CRISPR/Cas9 is the imprecise prediction of highly active single guide RNAs (sgRNAs). We previously implemented the CRISPR/Cas9 system to induce tissue-specific mutations in the tunicate Ciona. In the present study, we designed and tested 83 single guide RNA (sgRNA) vectors targeting 23 genes expressed in the cardiopharyngeal progenitors and surrounding tissues of Ciona embryo. Using high-throughput sequencing of mutagenized alleles, we identified guide sequences that correlate with sgRNA mutagenesis activity and used this information for the rational design of all possible sgRNAs targeting the Ciona transcriptome. We also describe a one-step cloning-free protocol for the assembly of sgRNA expression cassettes. These cassettes can be directly electroporated as unpurified PCR products into Ciona embryos for sgRNA expression in vivo, resulting in high frequency of CRISPR/Cas9-mediated mutagenesis in somatic cells of electroporated embryos. We found a strong correlation between the frequency of an Ebf loss-of-function phenotype and the mutagenesis efficacies of individual Ebf-targeting sgRNAs tested using this method. We anticipate that our approach can be scaled up to systematically design and deliver highly efficient sgRNAs for the tissue-specific investigation of gene functions in Ciona. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Pre-capture multiplexing improves efficiency and cost-effectiveness of targeted genomic enrichment.

    PubMed

    Shearer, A Eliot; Hildebrand, Michael S; Ravi, Harini; Joshi, Swati; Guiffre, Angelica C; Novak, Barbara; Happe, Scott; LeProust, Emily M; Smith, Richard J H

    2012-11-14

    Targeted genomic enrichment (TGE) is a widely used method for isolating and enriching specific genomic regions prior to massively parallel sequencing. To make effective use of sequencer output, barcoding and sample pooling (multiplexing) after TGE and prior to sequencing (post-capture multiplexing) has become routine. While previous reports have indicated that multiplexing prior to capture (pre-capture multiplexing) is feasible, no thorough examination of the effect of this method has been completed on a large number of samples. Here we compare standard post-capture TGE to two levels of pre-capture multiplexing: 12 or 16 samples per pool. We evaluated these methods using standard TGE metrics and determined the ability to identify several classes of genetic mutations in three sets of 96 samples, including 48 controls. Our overall goal was to maximize cost reduction and minimize experimental time while maintaining a high percentage of reads on target and a high depth of coverage at thresholds required for variant detection. We adapted the standard post-capture TGE method for pre-capture TGE with several protocol modifications, including redesign of blocking oligonucleotides and optimization of enzymatic and amplification steps. Pre-capture multiplexing reduced costs for TGE by at least 38% and significantly reduced hands-on time during the TGE protocol. We found that pre-capture multiplexing reduced capture efficiency by 23 or 31% for pre-capture pools of 12 and 16, respectively. However, efficiency losses at this step can be compensated by reducing the number of simultaneously sequenced samples. Pre-capture multiplexing and post-capture TGE performed similarly with respect to variant detection of positive control mutations. In addition, we detected no instances of sample switching due to aberrant barcode identification. Pre-capture multiplexing improves efficiency of TGE experiments with respect to hands-on time and reagent use compared to standard post-capture TGE. A decrease in capture efficiency is observed when using pre-capture multiplexing; however, it does not negatively impact variant detection and can be accommodated by the experimental design.
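
    The trade-off noted in this record, that lower capture efficiency can be offset by sequencing fewer samples at once, can be illustrated with a rough calculation. This sketch assumes that on-target reads per sample scale linearly with capture efficiency and inversely with the number of samples per run; that model, the function name, and the baseline of 96 samples per run are assumptions for illustration, not figures from the study.

```python
import math


def samples_per_run(baseline_samples: int, efficiency_loss: float) -> int:
    """Samples that fit in one run while keeping per-sample on-target reads constant.

    Assumes on-target yield per sample is proportional to capture efficiency
    and inversely proportional to the number of pooled samples.
    """
    if not 0.0 <= efficiency_loss < 1.0:
        raise ValueError("efficiency_loss must be in [0, 1)")
    return math.floor(baseline_samples * (1.0 - efficiency_loss))


if __name__ == "__main__":
    # Efficiency losses of 23% and 31% are the values reported in the abstract.
    for pool_size, loss in ((12, 0.23), (16, 0.31)):
        print(f"pre-capture pool of {pool_size}: {samples_per_run(96, loss)} samples per run")
```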

  2. Sequenced sorghum mutant library- an efficient platform for discovery of causal gene mutations

    USDA-ARS's Scientific Manuscript database

    Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. We applied whole-genome sequencing to 256 phenotyped mutant lines of sorghum (Sorghum bicolor L. Moench) to 16x coverage. Comparisons with the reference sequence revealed >1.8 million canonical EMS-induced G/C to A...

  3. High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology

    PubMed Central

    Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M

    2007-01-01

    Background Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
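
    The nucleotide diversity (π) reported in the abstract above is conventionally defined as the average number of pairwise differences per site among aligned sequences. The sketch below illustrates that general definition only; the function name and the toy alignment are hypothetical, and it is not the authors' pipeline or data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment (4 sequences, 10 sites), for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

    In this toy case the six sequence pairs differ at 7 sites in total over 10 aligned positions, so π = 7/60. Real analyses would also need to handle gaps, missing data and heterozygous calls, which this sketch ignores.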

  4. Evidence for Context-Dependent Complementarity of Non-Shine-Dalgarno Ribosome Binding Sites to Escherichia coli rRNA

    PubMed Central

    Barendt, Pamela A.; Shah, Najaf A.; Barendt, Gregory A.; Kothari, Parth A.; Sarkar, Casim A.

    2013-01-01

    While the ribosome has evolved to function in complex intracellular environments, these contexts do not easily allow for the study of its inherent capabilities. We have used a synthetic, well-defined, Escherichia coli (E. coli)-based translation system in conjunction with ribosome display, a powerful in vitro selection method, to identify ribosome binding sites (RBSs) that can promote the efficient translation of messenger RNAs (mRNAs) with a leader length representative of natural E. coli mRNAs. In previous work, we used a longer leader sequence and unexpectedly recovered highly efficient cytosine-rich sequences with complementarity to the 16S ribosomal RNA (rRNA) and similarity to eukaryotic RBSs. In the current study, Shine-Dalgarno (SD) sequences were prevalent but non-SD sequences were also heavily enriched and were dominated by novel guanine- and uracil-rich motifs which showed statistically significant complementarity to the 16S rRNA. Additionally, only SD motifs exhibited position-dependent decreases in sequence entropy, indicating that non-SD motifs likely operate by increasing the local concentration of ribosomes in the vicinity of the start codon, rather than by a position-dependent mechanism. These results further support the putative generality of mRNA-rRNA complementarity in facilitating mRNA translation, but also suggest that context (e.g., leader length and composition) dictates the specific subset of possible RBSs that are used for efficient translation of a given transcript. PMID:23427812
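
    The position-dependent sequence entropy mentioned in the abstract above can be illustrated with the standard Shannon entropy of each alignment column. This is a minimal sketch under stated assumptions: the function name and the toy sites are hypothetical, and the authors' actual entropy analysis may differ in its details.

```python
import math
from collections import Counter


def positional_entropy(seqs):
    """Shannon entropy in bits of the base distribution at each alignment column."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")
    entropies = []
    for column in zip(*seqs):
        n = len(column)
        counts = Counter(column)
        h = -sum((c / n) * math.log2(c / n) for c in counts.values())
        entropies.append(max(0.0, h))  # clamp to avoid printing -0.0
    return entropies


# Hypothetical toy sites: five conserved positions and one fully variable position.
toy_sites = ["AGGAGG", "AGGAGA", "AGGAGC", "AGGAGU"]
entropy = positional_entropy(toy_sites)
print(entropy)
```

    A conserved column gives 0 bits and a column with four equally frequent bases gives 2 bits, so a local drop in entropy marks a position where the selected sequences agree.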

  5. PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples

    PubMed Central

    2014-01-01

    Background Recent innovations in sequencing technologies have provided researchers with the ability to rapidly characterize the microbial content of an environmental or clinical sample with unprecedented resolution. These approaches are producing a wealth of information that is providing novel insights into the microbial ecology of the environment and human health. However, these sequencing-based approaches produce large and complex datasets that require efficient and sensitive computational analysis workflows. Many recent tools for analyzing metagenomic-sequencing data have emerged; however, these approaches often suffer from issues of specificity and efficiency, and typically do not include a complete metagenomic analysis framework. Results We present PathoScope 2.0, a complete bioinformatics framework for rapidly and accurately quantifying the proportions of reads from individual microbial strains present in metagenomic sequencing data from environmental or clinical samples. The pipeline performs all necessary computational analysis steps, including reference genome library extraction and indexing, read quality control and alignment, strain identification, and summarization and annotation of results. We rigorously evaluated PathoScope 2.0 using simulated data and data from the 2011 outbreak of Shiga-toxigenic Escherichia coli O104:H4. Conclusions The results show that PathoScope 2.0 is a complete, highly sensitive, and efficient approach for metagenomic analysis that outperforms alternative approaches in scope, speed, and accuracy. The PathoScope 2.0 pipeline software is freely available for download at: http://sourceforge.net/projects/pathoscope/. PMID:25225611

  6. 70% efficiency of bistate molecular machines explained by information theory, high dimensional geometry and evolutionary convergence.

    PubMed

    Schneider, Thomas D

    2010-10-01

    The relationship between information and energy is key to understanding biological systems. We can display the information in DNA sequences specifically bound by proteins by using sequence logos, and we can measure the corresponding binding energy. These can be compared by noting that one of the forms of the second law of thermodynamics defines the minimum energy dissipation required to gain one bit of information. Under the isothermal conditions in which molecular machines function, this is kB T ln 2 joules per bit (kB is Boltzmann's constant and T is the absolute temperature). Then an efficiency of binding can be computed by dividing the information in a logo by the free energy of binding after it has been converted to bits. The isothermal efficiencies of not only genetic control systems, but also visual pigments are near 70%. From information and coding theory, the theoretical efficiency limit for bistate molecular machines is ln 2 = 0.6931. Evolutionary convergence to maximum efficiency is limited by the constraint that molecular states must be distinct from each other. The result indicates that natural molecular machines operate close to their information processing maximum (the channel capacity), and implies that nanotechnology can attain this goal.

  7. 70% efficiency of bistate molecular machines explained by information theory, high dimensional geometry and evolutionary convergence

    PubMed Central

    Schneider, Thomas D.

    2010-01-01

    The relationship between information and energy is key to understanding biological systems. We can display the information in DNA sequences specifically bound by proteins by using sequence logos, and we can measure the corresponding binding energy. These can be compared by noting that one of the forms of the second law of thermodynamics defines the minimum energy dissipation required to gain one bit of information. Under the isothermal conditions in which molecular machines function, this is kB T ln 2 joules per bit (kB is Boltzmann's constant and T is the absolute temperature). Then an efficiency of binding can be computed by dividing the information in a logo by the free energy of binding after it has been converted to bits. The isothermal efficiencies of not only genetic control systems, but also visual pigments are near 70%. From information and coding theory, the theoretical efficiency limit for bistate molecular machines is ln 2 = 0.6931. Evolutionary convergence to maximum efficiency is limited by the constraint that molecular states must be distinct from each other. The result indicates that natural molecular machines operate close to their information processing maximum (the channel capacity), and implies that nanotechnology can attain this goal. PMID:20562221
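
    The efficiency calculation described in the abstract above (information in a logo divided by the binding free energy expressed in bits) can be sketched as follows. The minimum dissipation per bit is taken as the Boltzmann constant times T times ln 2. The function names and the input values (24 bits, -59 kJ/mol, 298 K) are hypothetical placeholders chosen for illustration and are not measurements from the paper.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # SI-defined value of the Boltzmann constant
AVOGADRO_PER_MOL = 6.02214076e23   # SI-defined value of the Avogadro constant


def min_energy_per_bit(temperature_k):
    """Minimum isothermal energy dissipation per bit, in joules."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


def binding_efficiency(information_bits, delta_g_j_per_mol, temperature_k):
    """Information gained divided by the free energy of binding converted to bits."""
    energy_per_molecule = -delta_g_j_per_mol / AVOGADRO_PER_MOL  # joules per binding event
    energy_bits = energy_per_molecule / min_energy_per_bit(temperature_k)
    return information_bits / energy_bits


# Hypothetical inputs, for illustration only.
eff = binding_efficiency(information_bits=24.0, delta_g_j_per_mol=-59_000.0, temperature_k=298.0)
print(f"energy per bit at 298 K: {min_energy_per_bit(298.0):.3e} J")
print(f"efficiency: {eff:.3f}")
```

    With these placeholder inputs the free energy corresponds to roughly 34 bits, giving an efficiency close to 0.70, which is near the ln 2 limit quoted in the abstract.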

  8. Improving transmission efficiency of large sequence alignment/map (SAM) files.

    PubMed

    Sakib, Muhammad Nazmus; Tang, Jijun; Zheng, W Jim; Huang, Chin-Tser

    2011-01-01

    Research in bioinformatics primarily involves collection and analysis of a large volume of genomic data. Naturally, it demands efficient storage and transfer of this huge amount of data. In recent years, some research has been done to find efficient compression algorithms to reduce the size of various sequencing data. One way to improve the transmission time of large files is to apply a maximum lossless compression on them. In this paper, we present SAMZIP, a specialized encoding scheme, for sequence alignment data in SAM (Sequence Alignment/Map) format, which improves the compression ratio of existing compression tools available. In order to achieve this, we exploit the prior knowledge of the file format and specifications. Our experimental results show that our encoding scheme improves compression ratio, thereby reducing overall transmission time significantly.

  9. Influence of quasi-specific sites on kinetics of target DNA search by a sequence-specific DNA-binding protein.

    PubMed

    Kemme, Catherine A; Esadze, Alexandre; Iwahara, Junji

    2015-11-10

    Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such "quasi-specific" sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1's association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins.

  10. When seconds count: A study of communication variables in the opening segment of emergency calls.

    PubMed

    Penn, Claire; Koole, Tom; Nattrass, Rhona

    2017-09-01

    The opening sequence of an emergency call influences the efficiency of the ambulance dispatch time. The greeting sequences in 105 calls to a South African emergency service were analysed. Initial results suggested the advantage of a specific two-part opening sequence. An on-site experiment aimed at improving call efficiency was conducted during one shift (1100 calls). Results indicated reduced conversational repairs and a significant reduction of 4 seconds in mean call length. Implications for systems and training are derived.

  11. Pulseq: A rapid and hardware-independent pulse sequence prototyping framework.

    PubMed

    Layton, Kelvin J; Kroboth, Stefan; Jia, Feng; Littin, Sebastian; Yu, Huijun; Leupold, Jochen; Nielsen, Jon-Fredrik; Stöcker, Tony; Zaitsev, Maxim

    2017-04-01

    Implementing new magnetic resonance experiments, or sequences, often involves extensive programming on vendor-specific platforms, which can be time consuming and costly. This situation is exacerbated when research sequences need to be implemented on several platforms simultaneously, for example, at different field strengths. This work presents an alternative programming environment that is hardware-independent, open-source, and promotes rapid sequence prototyping. A novel file format is described to efficiently store the hardware events and timing information required for an MR pulse sequence. Platform-dependent interpreter modules convert the file to appropriate instructions to run the sequence on MR hardware. Sequences can be designed in high-level languages, such as MATLAB, or with a graphical interface. Spin physics simulation tools are incorporated into the framework, allowing for comparison between real and virtual experiments. Minimal effort is required to implement relatively advanced sequences using the tools provided. Sequences are executed on three different MR platforms, demonstrating the flexibility of the approach. A high-level, flexible and hardware-independent approach to sequence programming is ideal for the rapid development of new sequences. The framework is currently not suitable for large patient studies or routine scanning although this would be possible with deeper integration into existing workflows. Magn Reson Med 77:1544-1552, 2017. © 2016 International Society for Magnetic Resonance in Medicine.

  12. Simultaneous flow of water and solutes through geological membranes-I. Experimental investigation

    USGS Publications Warehouse

    Kharaka, Y.K.; Berry, F.A.P.

    1973-01-01

    The relative retardation by geological membranes of cations and anions generally present in subsurface waters was investigated using a high pressure and high temperature 'filtration cell'. The solutions were forced through different clays and a disaggregated shale subjected to compaction pressures up to 9500 psi and to temperatures from 20 to 70 °C. The overall efficiencies measured increased with increase of exchange capacity of the material used and with decrease in concentration of the input solution. The efficiency of a given membrane increased with increasing compaction pressure but decreased slightly at higher temperatures for solutions of the same ionic concentration. The results further show that geological membranes are specific for different dissolved species. The retardation sequences varied depending on the material used and on experimental conditions. The sequences for monovalent and divalent cations at laboratory temperatures were generally as follows: Li < Na < NH3 < K < Rb < Cs and Mg < Ca < Sr < Ba, respectively. The sequences for anions at room temperature were variable, but at 70 °C, the sequence was: HCO3 < I < B < SO4 < Cl < Br. Monovalent cations, contrary to some field data, were generally retarded with respect to divalent cations. The differences in the filtration ratios among the divalent cations were smaller than those between the monovalent cations. The passage rate of B, HCO3, I and NH3 was greatly increased at 70 °C. © 1973.

  13. Metagenomic survey of bacterial diversity in the atmosphere of Mexico City using different sampling methods.

    PubMed

    Serrano-Silva, N; Calderón-Ezquerro, M C

    2018-04-01

    The identification of airborne bacteria has traditionally been performed by retrieval in culture media, but the bacterial diversity in the air is underestimated using this method because many bacteria are not readily cultured. Advances in DNA sequencing technology have produced a broad knowledge of genomics and metagenomics, which can greatly improve our ability to identify and study the diversity of airborne bacteria. However, researchers are facing several challenges, particularly the efficient retrieval of low-density microorganisms from the air and the lack of standardized protocols for sample collection and processing. In this study, we tested three methods for sampling bioaerosols - a Durham-type spore trap (Durham), a seven-day recording volumetric spore trap (HST), and a high-throughput 'Jet' spore and particle sampler (Jet) - and recovered metagenomic DNA for 16S rDNA sequencing. Samples were simultaneously collected with the three devices during one week, and the sequencing libraries were analyzed. A simple and efficient method for collecting bioaerosols and extracting good quality DNA for high-throughput sequencing was standardized. The Durham sampler collected preferentially Cyanobacteria, the HST Actinobacteria, Proteobacteria and Firmicutes, and the Jet mainly Proteobacteria and Firmicutes. The HST sampler collected the largest amount of airborne bacterial diversity. More experiments are necessary to select the right sampler, depending on study objectives, which may require monitoring and collecting specific airborne bacteria. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Simultaneous non-contiguous deletions using large synthetic DNA and site-specific recombinases

    PubMed Central

    Krishnakumar, Radha; Grose, Carissa; Haft, Daniel H.; Zaveri, Jayshree; Alperovich, Nina; Gibson, Daniel G.; Merryman, Chuck; Glass, John I.

    2014-01-01

    Toward achieving rapid and large scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high efficiency of the Cre-lox system, this method can be used in any organism where this system is functional as well as adapted to use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analyses data, transcriptomics, synthetic biology and site-specific recombination. PMID:24914053

  15. Sequence-specific bias correction for RNA-seq data using recurrent neural networks.

    PubMed

    Zhang, Yao-Zhong; Yamaguchi, Rui; Imoto, Seiya; Miyano, Satoru

    2017-01-25

    The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by RNNs, and used in the estimation of gene abundance. We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures.

  16. Influence of Quasi-Specific Sites on Kinetics of Target DNA Search by a Sequence-Specific DNA-Binding Protein

    PubMed Central

    2015-01-01

    Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such “quasi-specific” sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1’s association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins. PMID:26502071

  17. Resistance gene homologues in Theobroma cacao as useful genetic markers.

    PubMed

    Kuhn, D N; Heath, M; Wisser, R J; Meerow, A; Brown, J S; Lopes, U; Schnell, R J

    2003-07-01

    Resistance gene homologue (RGH) sequences have been developed into useful genetic markers for marker-assisted selection (MAS) of disease resistant Theobroma cacao. A plasmid library of amplified fragments was created from seven different cultivars of cacao. Over 600 cloned recombinant amplicons were evaluated. From these, 74 unique RGHs were identified that could be placed into 11 categories based on sequence analysis. Primers specific to each category were designed. The primers specific for a single RGH category amplified fragments of equal length from the seven different cultivars used to create the library. However, these fragments exhibited single-strand conformational polymorphism (SSCP), which allowed us to map six of the RGH categories in an F(2) population of T. cacao. RGHs 1, 4 and 5 were in the same linkage group, with RGH 4 and 5 separated by less than 4 cM. As SSCP can be efficiently performed on our automated sequencer, we have developed a convenient and rapid high throughput assay for RGH alleles.

  18. DROMPA: easy-to-handle peak calling and visualization software for the computational analysis and validation of ChIP-seq data.

    PubMed

    Nakato, Ryuichiro; Itoh, Tahehiko; Shirahige, Katsuhiko

    2013-07-01

    Chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) can identify genomic regions that bind proteins involved in various chromosomal functions. Although the development of next-generation sequencers offers the technology needed to identify these protein-binding sites, the analysis can be computationally challenging because sequencing data sometimes consist of >100 million reads/sample. Herein, we describe a cost-effective and time-efficient protocol that is generally applicable to ChIP-seq analysis; this protocol uses a novel peak-calling program termed DROMPA to identify peaks and an additional program, parse2wig, to preprocess read-map files. This two-step procedure drastically reduces computational time and memory requirements compared with other programs. DROMPA enables the identification of protein localization sites in repetitive sequences and efficiently identifies both broad and sharp protein localization peaks. Specifically, DROMPA outputs a protein-binding profile map in pdf or png format, which can be easily manipulated by users who have a limited background in bioinformatics. © 2013 The Authors Genes to Cells © 2013 by the Molecular Biology Society of Japan and Wiley Publishing Asia Pty Ltd.

  19. Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12.

    PubMed

    Hayashi, T; Makino, K; Ohnishi, M; Kurokawa, K; Ishii, K; Yokoyama, K; Han, C G; Ohtsubo, E; Nakayama, K; Murata, T; Tanaka, M; Tobe, T; Iida, T; Takami, H; Honda, T; Sasakawa, C; Ogasawara, N; Yasunaga, T; Kuhara, S; Shiba, T; Hattori, M; Shinagawa, H

    2001-02-28

    Escherichia coli O157:H7 is a major food-borne infectious pathogen that causes diarrhea, hemorrhagic colitis, and hemolytic uremic syndrome. Here we report the complete chromosome sequence of an O157:H7 strain isolated from the Sakai outbreak, and the results of genomic comparison with a benign laboratory strain, K-12 MG1655. The chromosome is 5.5 Mb in size, 859 kb larger than that of K-12. We identified a 4.1-Mb sequence highly conserved between the two strains, which may represent the fundamental backbone of the E. coli chromosome. The remaining 1.4-Mb sequence comprises O157:H7-specific sequences, most of which are horizontally transferred foreign DNAs. The predominant role of bacteriophages in the emergence of O157:H7 is evident from the presence of 24 prophages and prophage-like elements that occupy more than half of the O157:H7-specific sequences. The O157:H7 chromosome encodes 1632 proteins and 20 tRNAs that are not present in K-12. Among these, at least 131 proteins are assumed to have virulence-related functions. Genome-wide codon usage analysis suggested that the O157:H7-specific tRNAs are involved in the efficient expression of the strain-specific genes. The complete set of genes specific to O157:H7 presented here sheds new light on the pathogenicity and the physiology of O157:H7, and will open a way to fully understand the molecular mechanisms underlying the O157:H7 infection.

  20. A Single Transcriptome of a Green Toad (Bufo viridis) Yields Candidate Genes for Sex Determination and -Differentiation and Non-Anonymous Population Genetic Markers

    PubMed Central

    Gerchen, Jörn F.; Reichert, Samuel J.; Röhr, Johannes T.; Dieterich, Christoph; Kloas, Werner

    2016-01-01

    Large genome size, including immense repetitive and non-coding fractions, still presents challenges for capacity, bioinformatics and thus affordability of whole genome sequencing in most amphibians. Here, we test the performance of a single transcriptome to understand whether it can provide a cost-efficient resource for species with large unknown genomes. Using RNA from six different tissues from a single Palearctic green toad (Bufo viridis) specimen and Hiseq2000, we obtained 22.5 million reads and publish >100,000 unigene sequences. To evaluate efficacy and quality, we first use this data to identify green toad-specific candidate genes, known from other vertebrates for their role in sex determination and differentiation. Of a list of 37 genes, the transcriptome yielded 32 (87%), many of which provide the first such data for this non-model anuran species. However, for many of these genes, only fragments could be retrieved. In order to also allow applications to population genetics, we further used the transcriptome for the targeted development of 21 non-anonymous microsatellites and tested them in genetic families and backcrosses. Eleven markers were specifically developed to be located on the B. viridis sex chromosomes; for eight markers we can indeed demonstrate sex-specific transmission in genetic families. Depending on phylogenetic distance, several markers, which are sex-linked in green toads, show high cross-amplification success across the anuran phylogeny, involving nine systematic anuran families. Our data support the view that single transcriptome sequencing (based on multiple tissues) provides a reliable genomic resource and cost-efficient method for non-model amphibian species with large genome size and, despite limitations, should be considered as long as genome sequencing remains unaffordable for most species. PMID:27232626

  1. Using Next Generation Sequencing for Multiplexed Trait-Linked Markers in Wheat

    PubMed Central

    Bernardo, Amy; Wang, Shan; St. Amand, Paul; Bai, Guihua

    2015-01-01

    With the advent of next generation sequencing (NGS) technologies, single nucleotide polymorphisms (SNPs) have become the major type of marker for genotyping in many crops. However, the availability of SNP markers for important traits of bread wheat (Triticum aestivum L.) that can be effectively used in marker-assisted selection (MAS) is still limited and SNP assays for MAS are usually uniplex. A shift from uniplex to multiplex assays will allow the simultaneous analysis of multiple markers and increase MAS efficiency. We designed 33 locus-specific markers from SNP or indel-based marker sequences linked to 20 different quantitative trait loci (QTL) or genes of agronomic importance in wheat and analyzed the amplicon sequences using an Ion Torrent Proton Sequencer and a custom allele detection pipeline to determine the genotypes of 24 selected germplasm accessions. Among the 33 markers, 27 were successfully multiplexed and 23 had 100% SNP call rates. Results from analysis of "kompetitive allele-specific PCR" (KASP) and sequence tagged site (STS) markers developed from the same loci fully verified the genotype calls of 23 markers. The NGS-based multiplexed assay developed in this study is suitable for rapid and high-throughput screening of SNPs and some indel-based markers in wheat. PMID:26625271

  2. Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes).

    PubMed

    Kukekova, Anna V; Johnson, Jennifer L; Teiling, Clotilde; Li, Lewyn; Oskina, Irina N; Kharlamova, Anastasiya V; Gulevich, Rimma G; Padte, Ravee; Dubreuil, Michael M; Vladimirova, Anastasiya V; Shepeleva, Darya V; Shikhevich, Svetlana G; Sun, Qi; Ponnala, Lalit; Temnykh, Svetlana V; Trut, Lyudmila N; Acland, Gregory M

    2011-10-03

    Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information.

  3. Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes)

    PubMed Central

    2011-01-01

    Background Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. Results cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Conclusions Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information. PMID:21967120

  4. Read count-based method for high-throughput allelic genotyping of transposable elements and structural variants.

    PubMed

    Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F

    2015-07-08

    Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
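    As a rough illustration of what a read count-based genotype call could look like for a biallelic insertion site, the sketch below classifies a site from the number of reads supporting the insertion allele versus the empty-site allele. The function name, the minimum read depth, and the allele-fraction cut-offs are hypothetical values chosen for illustration; the abstract does not state the authors' actual calling rule.

```python
def call_genotype(insertion_reads, empty_site_reads,
                  min_reads=20, low=0.2, high=0.8):
    """Call a biallelic insertion genotype from breakpoint-PCR read counts.

    All thresholds here (min_reads, low, high) are hypothetical
    illustration values, not parameters from the published method.
    """
    total = insertion_reads + empty_site_reads
    if total < min_reads:
        return "no_call"          # too few reads to call the site
    fraction = insertion_reads / total
    if fraction < low:
        return "0/0"              # homozygous for the empty site
    if fraction > high:
        return "1/1"              # homozygous for the insertion
    return "0/1"                  # heterozygous


if __name__ == "__main__":
    for counts in [(0, 150), (70, 80), (200, 3), (4, 5)]:
        print(counts, call_genotype(*counts))
```

    A real pipeline would likely also need to account for amplification bias between the two alleles, which this sketch ignores.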

  5. Identifying risk factors for exposure to culturable allergenic moulds in energy efficient homes by using highly specific monoclonal antibodies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharpe, Richard A.; Cocq, Kate Le; Nikolaou, Vasilis

    The aim of this study was to determine the accuracy of monoclonal antibodies (mAbs) in identifying culturable allergenic fungi present in visible mould growth in energy efficient homes, and to identify risk factors for exposure to these known allergenic fungi. Swabs were taken from fungal contaminated surfaces and culturable yeasts and moulds isolated by using mycological culture. Soluble antigens from cultures were tested by ELISA using mAbs specific to the culturable allergenic fungi Aspergillus and Penicillium spp., Ulocladium, Alternaria, and Epicoccum spp., Cladosporium spp., Fusarium spp., and Trichoderma spp. Diagnostic accuracies of the ELISA tests were determined by sequencing of the internally transcribed spacer 1 (ITS1)-5.8S-ITS2-encoding regions of recovered fungi following ELISA. There was 100% concordance between the two methods, with ELISAs providing genus-level identity and ITS sequencing providing species-level identities (210 out of 210 tested). Species of Aspergillus/Penicillium, Cladosporium, Ulocladium/Alternaria/Epicoccum, Fusarium and Trichoderma were detected in 82% of the samples. The presence of condensation was associated with an increased risk of surfaces being contaminated by Aspergillus/Penicillium spp. and Cladosporium spp., whereas moisture within the building fabric (water ingress/rising damp) was only associated with increased risk of Aspergillus/Penicillium spp. Property type and energy efficiency levels were found to moderate the risk of indoor surfaces becoming contaminated with Aspergillus/Penicillium and Cladosporium which in turn was modified by the presence of condensation, water ingress and rising damp, consistent with previous literature. Highlights: • Monoclonal antibodies were used to track culturable allergenic moulds in homes. • Allergenic moulds were recovered from 82% of swabs from contaminated surfaces. • The mAbs were highly specific with 100% agreement to PCR of recovered fungi. • Improvements to energy efficiency lowered risk of exposure to allergenic fungi.

  6. Development of ITS sequence based molecular marker to distinguish, Tribulus terrestris L. (Zygophyllaceae) from its adulterants.

    PubMed

    Balasubramani, Subramani Paranthaman; Murugan, Ramar; Ravikumar, Kaliamoorthy; Venkatasubramanian, Padma

    2010-09-01

    Tribulus terrestris L. (Zygophyllaceae) is one of the highly traded raw drugs and is also used as a stimulative food additive in Europe and the USA. While the Ayurvedic Pharmacopoeia of India recognizes T. terrestris as Goksura, Tribulus lanuginosus and T. subramanyamii are also traded under the same name, raising issues of quality control. The nuclear ribosomal RNA genes and ITS (internal transcribed spacer) sequence were used to develop species-specific DNA markers. The species-specific markers efficiently amplified 295 bp for T. terrestris (TT1F and TT1R), 300 bp for T. lanuginosus (TL1F and TL1R) and 214 bp for T. subramanyamii (TS1F and TS1R). These DNA markers can be used to distinguish T. terrestris from its adulterants. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  7. Molecular Identification of Sex in Phoenix dactylifera Using Inter Simple Sequence Repeat Markers.

    PubMed

    Al-Ameri, Abdulhafed A; Al-Qurainy, Fahad; Gaafar, Abdel-Rhman Z; Khan, Salim; Nadeem, M

    2016-01-01

    Early sex identification of Date Palm (Phoenix dactylifera L.) at the seedling stage is an economically desirable objective, which will significantly increase the profits of seed-based cultivation. The utilization of molecular markers at this stage for early and rapid identification of sex is important due to the lack of morphological markers. In this study, a total of two hundred Inter Simple Sequence Repeat (ISSR) primers were screened among male and female Date Palm plants to identify putative sex-specific markers, out of which only two primers (IS_A02 and IS_A71) were found to be associated with sex. The primer IS_A02 produced a unique band of size 390 bp that was clearly present in all female plants and absent in all male plants. In contrast, the primer IS_A71 produced a unique band of size 380 bp that was clearly present in all male plants and absent in all female plants. Subsequently, these specific fragments were excised, purified, and sequenced for the future development of sequence-specific markers for sex determination in dioecious Date Palm. These markers are efficient, highly reliable, and reproducible for sex identification at the early seedling stage.

  8. Specific beta1-adrenergic receptor silencing with small interfering RNA lowers high blood pressure and improves cardiac function in myocardial ischemia.

    PubMed

    Arnold, Anne-Sophie; Tang, Yao Liang; Qian, Keping; Shen, Leping; Valencia, Valery; Phillips, Michael Ian; Zhang, Yuan Clare

    2007-01-01

    Beta-blockers are widely used and effective for treating hypertension, acute myocardial infarction (MI) and heart failure, but they present side-effects mainly due to antagonism of beta2-adrenergic receptor (AR). Currently available beta-blockers are at best selective but not specific for beta1 or beta2-AR. To specifically inhibit the expression of the beta1-AR, we developed a small interfering RNA (siRNA) targeted to beta1-AR. Three different sequences of beta1 siRNA were delivered into C6-2B cells with 90% efficiency. One of the three sequences reduced the level of beta1-AR mRNA by 70%. The siRNA was highly specific for beta1-AR inhibition with no overlap with beta2-AR. To test this in vivo, systemic injection of beta1 siRNA complexed with liposomes resulted in efficient delivery into the heart, lung, kidney and liver, and effectively reduced beta1-AR expression in the heart without altering beta2-AR. beta1 siRNA significantly lowered blood pressure of spontaneously hypertensive rats (SHR) for at least 12 days and reduced cardiac hypertrophy following a single injection. Pretreatment with beta1 siRNA 3 days before induction of MI in Wistar rats significantly improved cardiac function, as demonstrated by dP/dt and electrocardiogram following the MI. The protective mechanism involved reduction of cardiomyocyte apoptosis in the beta1 siRNA-treated hearts. The present study demonstrates the possibility of using siRNA for treating cardiovascular diseases and may represent a novel beta-blocker specific for beta1-AR.

  9. Integration of promoters, inverted repeat sequences and proteomic data into a model for high silencing efficiency of coeliac disease related gliadins in bread wheat

    PubMed Central

    2013-01-01

    Background Wheat gluten has unique nutritional and technological characteristics, but is also a major trigger of allergies and intolerances. One of the most severe diseases caused by gluten is coeliac disease. The peptides produced in the digestive tract by the incomplete digestion of gluten proteins trigger the disease. The majority of the epitopes responsible reside in the gliadin fraction of gluten. The location of the multiple gliadin genes in blocks has to date complicated their elimination by classical breeding techniques or by the use of biotechnological tools. As an approach to silence multiple gliadin genes we have produced 38 transgenic lines of bread wheat containing combinations of two endosperm-specific promoters and three different inverted repeat sequences to silence three fractions of gliadins by RNA interference. Results The effects of the RNA interference constructs on the content of the gluten proteins, total protein and starch, thousand seed weights and SDSS quality tests of flour were analyzed in these transgenic lines in two consecutive years. The characteristics of the inverted repeat sequences were the main factor that determined the efficiency of silencing. The promoter used had less influence on silencing, although a synergy in silencing efficiency was observed when the two promoters were used simultaneously. Genotype and the environment also influenced silencing efficiency. Conclusions We conclude that to obtain wheat lines with an optimum reduction of toxic gluten epitopes one needs to take into account the factors of inverted repeat sequences design, promoter choice and also the wheat background used. PMID:24044767

  10. Cross-species bacterial artificial chromosome (BAC) library screening via overgo-based hybridization and BAC-contig mapping of a yield enhancement quantitative trait locus (QTL) yld1.1 in the Malaysian wild rice Oryza rufipogon.

    PubMed

    Song, Beng-Kah; Nadarajah, Kalaivani; Romanov, Michael N; Ratnam, Wickneswari

    2005-01-01

    The construction of BAC-contig physical maps is an important step towards a partial or ultimate genome sequence analysis. Here, we describe our initial efforts to apply an overgo approach to screen a BAC library of the Malaysian wild rice species, Oryza rufipogon. Overgo design is based on repetitive element masking and sequence uniqueness, and uses short probes (approximately 40 bp), making this method highly efficient and specific. Pairs of 24-bp oligos that contain an 8-bp overlap were developed from the publicly available genomic sequences of the cultivated rice, O. sativa, to generate 20 overgo probes for a 1-Mb region that encompasses a yield enhancement QTL yld1.1 in O. rufipogon. The advantages of a high similarity in melting temperature, hybridization kinetics and specific activities of overgos further enabled a pooling strategy for library screening by filter hybridization. Two pools of ten overgos each were hybridized to high-density filters representing the O. rufipogon genomic BAC library. These screening tests succeeded in providing 69 PCR-verified positive hits from a total of 23,040 BAC clones of the entire O. rufipogon library. A minimal tiling path of clones was generated to contribute to a fully covered BAC-contig map of the targeted 1-Mb region. The developed protocol for overgo design based on O. sativa sequences as a comparative genomic framework, and the pooled overgo hybridization screening technique are suitable means for high-resolution physical mapping and the identification of BAC candidates for sequencing.
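    The probe geometry described above (two 24-bp oligos sharing an 8-bp overlap, giving a probe of roughly 40 bp, since 24 + 24 - 8 = 40) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the repeat masking and uniqueness screening mentioned in the abstract are assumed to have been done before a 40-bp target is passed in.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_overgo_pair(target, oligo_len=24, overlap=8):
    """Split a target into two oligos that anneal over `overlap` bases.

    The forward oligo is the first `oligo_len` bases of the target; the
    reverse oligo is the reverse complement of the last `oligo_len` bases.
    Their 3' ends are complementary over `overlap` bases, so each can
    prime fill-in synthesis across the other.
    """
    target = target.upper()
    expected = 2 * oligo_len - overlap
    if len(target) != expected:
        raise ValueError(f"target must be {expected} bp, got {len(target)}")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    forward = target[:oligo_len]
    reverse = reverse_complement(target[-oligo_len:])
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = design_overgo_pair("A" * 16 + "C" * 8 + "G" * 16)
    print(fwd)
    print(rev)
```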

  11. Efficient modification of CCR5 in primary human hematopoietic cells using a megaTAL nuclease and AAV donor template.

    PubMed

    Sather, Blythe D; Romano Ibarra, Guillermo S; Sommer, Karen; Curinga, Gabrielle; Hale, Malika; Khan, Iram F; Singh, Swati; Song, Yumei; Gwiazda, Kamila; Sahni, Jaya; Jarjour, Jordan; Astrakhan, Alexander; Wagner, Thor A; Scharenberg, Andrew M; Rawlings, David J

    2015-09-30

    Genetic mutations or engineered nucleases that disrupt the HIV co-receptor CCR5 block HIV infection of CD4(+) T cells. These findings have motivated the engineering of CCR5-specific nucleases for application as HIV therapies. The efficacy of this approach relies on efficient biallelic disruption of CCR5, and the ability to efficiently target sequences that confer HIV resistance to the CCR5 locus has the potential to further improve clinical outcomes. We used RNA-based nuclease expression paired with adeno-associated virus (AAV)-mediated delivery of a CCR5-targeting donor template to achieve highly efficient targeted recombination in primary human T cells. This method consistently achieved 8 to 60% rates of homology-directed recombination into the CCR5 locus in T cells, with over 80% of cells modified with an MND-GFP expression cassette exhibiting biallelic modification. MND-GFP-modified T cells maintained a diverse repertoire and engrafted in immune-deficient mice as efficiently as unmodified cells. Using this method, we integrated sequences coding chimeric antigen receptors (CARs) into the CCR5 locus, and the resulting targeted CAR T cells exhibited antitumor or anti-HIV activity. Alternatively, we introduced the C46 HIV fusion inhibitor, generating T cell populations with high rates of biallelic CCR5 disruption paired with potential protection from HIV with CXCR4 co-receptor tropism. Finally, this protocol was applied to adult human mobilized CD34(+) cells, resulting in 15 to 20% homologous gene targeting. Our results demonstrate that high-efficiency targeted integration is feasible in primary human hematopoietic cells and highlight the potential of gene editing to engineer T cell products with myriad functional properties. Copyright © 2015, American Association for the Advancement of Science.

  12. Single-Nucleotide-Specific Targeting of the Tf1 Retrotransposon Promoted by the DNA-Binding Protein Sap1 of Schizosaccharomyces pombe.

    PubMed

    Hickey, Anthony; Esnault, Caroline; Majumdar, Anasuya; Chatterjee, Atreyi Ghatak; Iben, James R; McQueen, Philip G; Yang, Andrew X; Mizuguchi, Takeshi; Grewal, Shiv I S; Levin, Henry L

    2015-11-01

    Transposable elements (TEs) constitute a substantial fraction of the eukaryotic genome and, as a result, have a complex relationship with their host that is both adversarial and dependent. To minimize damage to cellular genes, TEs possess mechanisms that target integration to sequences of low importance. However, the retrotransposon Tf1 of Schizosaccharomyces pombe integrates with a surprising bias for promoter sequences of stress-response genes. The clustering of integration in specific promoters suggests that Tf1 possesses a targeting mechanism that is important for evolutionary adaptation to changes in environment. We report here that Sap1, an essential DNA-binding protein, plays an important role in Tf1 integration. A mutation in Sap1 resulted in a 10-fold drop in Tf1 transposition, and measures of transposon intermediates support the argument that the defect occurred in the process of integration. Published ChIP-Seq data on Sap1 binding combined with high-density maps of Tf1 integration that measure independent insertions at single-nucleotide positions show that 73.4% of all integration occurs at genomic sequences bound by Sap1. This represents high selectivity because Sap1 binds just 6.8% of the genome. A genome-wide analysis of promoter sequences revealed that Sap1 binding and amounts of integration correlate strongly. More important, an alignment of the DNA-binding motif of Sap1 revealed integration clustered on both sides of the motif and showed high levels specifically at positions +19 and -9. These data indicate that Sap1 contributes to the efficiency and position of Tf1 integration. Copyright © 2015 by the Genetics Society of America.
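    The selectivity claim above can be checked with simple arithmetic: if 73.4% of integration events fall in the 6.8% of the genome bound by Sap1, integration in bound sequence is roughly 10.8 times more frequent than a uniform distribution would predict. The sketch below works through this; the function names are illustrative, and the per-base density ratio is an additional derived quantity that is not reported in the abstract.

```python
def fold_enrichment(event_fraction, genome_fraction):
    """Observed share of events in a region divided by the share
    expected if events were spread uniformly over the genome."""
    return event_fraction / genome_fraction


def density_ratio(event_fraction, genome_fraction):
    """Per-base event density inside the region relative to outside it."""
    inside = event_fraction / genome_fraction
    outside = (1 - event_fraction) / (1 - genome_fraction)
    return inside / outside


if __name__ == "__main__":
    # Figures taken from the abstract: 73.4% of integration, 6.8% of genome.
    print(round(fold_enrichment(0.734, 0.068), 1))  # 10.8
    print(density_ratio(0.734, 0.068))
```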

  13. Single-Nucleotide-Specific Targeting of the Tf1 Retrotransposon Promoted by the DNA-Binding Protein Sap1 of Schizosaccharomyces pombe

    PubMed Central

    Hickey, Anthony; Esnault, Caroline; Majumdar, Anasuya; Chatterjee, Atreyi Ghatak; Iben, James R.; McQueen, Philip G.; Yang, Andrew X.; Mizuguchi, Takeshi; Grewal, Shiv I. S.; Levin, Henry L.

    2015-01-01

    Transposable elements (TEs) constitute a substantial fraction of the eukaryotic genome and, as a result, have a complex relationship with their host that is both adversarial and dependent. To minimize damage to cellular genes, TEs possess mechanisms that target integration to sequences of low importance. However, the retrotransposon Tf1 of Schizosaccharomyces pombe integrates with a surprising bias for promoter sequences of stress-response genes. The clustering of integration in specific promoters suggests that Tf1 possesses a targeting mechanism that is important for evolutionary adaptation to changes in environment. We report here that Sap1, an essential DNA-binding protein, plays an important role in Tf1 integration. A mutation in Sap1 resulted in a 10-fold drop in Tf1 transposition, and measures of transposon intermediates support the argument that the defect occurred in the process of integration. Published ChIP-Seq data on Sap1 binding combined with high-density maps of Tf1 integration that measure independent insertions at single-nucleotide positions show that 73.4% of all integration occurs at genomic sequences bound by Sap1. This represents high selectivity because Sap1 binds just 6.8% of the genome. A genome-wide analysis of promoter sequences revealed that Sap1 binding and amounts of integration correlate strongly. More important, an alignment of the DNA-binding motif of Sap1 revealed integration clustered on both sides of the motif and showed high levels specifically at positions +19 and −9. These data indicate that Sap1 contributes to the efficiency and position of Tf1 integration. PMID:26358720

  14. Deciphering the molecular and functional basis of Dbl family proteins: a novel systematic approach toward classification of selective activation of the Rho family proteins.

    PubMed

    Jaiswal, Mamta; Dvorsky, Radovan; Ahmadian, Mohammad Reza

    2013-02-08

    The diffuse B-cell lymphoma (Dbl) family of the guanine nucleotide exchange factors is a direct activator of the Rho family proteins. The Rho family proteins are involved in almost every cellular process that ranges from fundamental (e.g. the establishment of cell polarity) to highly specialized processes (e.g. the contraction of vascular smooth muscle cells). Abnormal activation of the Rho proteins is known to play a crucial role in cancer, infectious and cognitive disorders, and cardiovascular diseases. However, the existence of 74 Dbl proteins and 25 Rho-related proteins in humans, which are largely uncharacterized, has led to increasing complexity in identifying specific upstream pathways. Thus, we comprehensively investigated sequence-structure-function-property relationships of 21 representatives of the Dbl protein family regarding their specificities and activities toward 12 Rho family proteins. The meta-analysis approach provides an unprecedented opportunity to broadly profile functional properties of Dbl family proteins, including catalytic efficiency, substrate selectivity, and signaling specificity. Our analysis has provided novel insights into the following: (i) understanding of the relative differences of various Rho protein members in nucleotide exchange; (ii) comparing and defining individual and overall guanine nucleotide exchange factor activities of a large representative set of the Dbl proteins toward 12 Rho proteins; (iii) grouping the Dbl family into functionally distinct categories based on both their catalytic efficiencies and their sequence-structural relationships; (iv) identifying conserved amino acids as fingerprints of the Dbl and Rho protein interaction; and (v) defining amino acid sequences conserved within, but not between, Dbl subfamilies. Therefore, the characteristics of such specificity-determining residues identified the regions or clusters conserved within the Dbl subfamilies.

  15. Effective DNA Inhibitors of Cathepsin G by In Vitro Selection

    PubMed Central

    Gatto, Barbara; Vianini, Elena; Lucatello, Lorena; Sissi, Claudia; Moltrasio, Danilo; Pescador, Rodolfo; Porta, Roberto; Palumbo, Manlio

    2008-01-01

    Cathepsin G (CatG) is a chymotrypsin-like protease released upon degranulation of neutrophils. In several inflammatory and ischaemic diseases the impaired balance between CatG and its physiological inhibitors leads to tissue destruction and platelet aggregation. Inhibitors of CatG are suitable for the treatment of inflammatory diseases and procoagulant conditions. DNA released upon the death of neutrophils at injury sites binds CatG. Moreover, short DNA fragments are more inhibitory than genomic DNA. Defibrotide, a single stranded polydeoxyribonucleotide with antithrombotic effect is also a potent CatG inhibitor. Given the above experimental evidences we employed a selection protocol to assess whether DNA inhibition of CatG may be ascribed to specific sequences present in defibrotide DNA. A Selex protocol was applied to identify the single-stranded DNA sequences exhibiting the highest affinity for CatG, the diversity of a combinatorial pool of oligodeoxyribonucleotides being a good representation of the complexity found in defibrotide. Biophysical and biochemical studies confirmed that the selected sequences bind tightly to the target enzyme and also efficiently inhibit its catalytic activity. Sequence analysis carried out to unveil a motif responsible for CatG recognition showed a recurrence of alternating TG repeats in the selected CatG binders, adopting an extended conformation that grants maximal interaction with the highly charged protein surface. This unprecedented finding is validated by our results showing high affinity and inhibition of CatG by specific DNA sequences of variable length designed to maximally reduce pairing/folding interactions. PMID:19325843

  16. Streaming fragment assignment for real-time analysis of sequencing experiments

    PubMed Central

    Roberts, Adam; Pachter, Lior

    2013-01-01

    We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods. PMID:23160280
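    To give a feel for how a single-pass, streaming assignment of ambiguously mapping fragments can work, the sketch below processes fragments one at a time and splits each one among its candidate targets in proportion to the abundance estimated so far. It is a deliberately simplified illustration and not the eXpress algorithm itself: the function name and the uniform prior are assumptions, and any modelling of fragment length, sequence bias, or alignment error is omitted. Memory use depends on the number of targets, not on the number of fragments.

```python
from collections import defaultdict


def stream_assign(fragments, prior=1.0):
    """Fractionally assign fragments to targets in one pass.

    `fragments` is an iterable of lists, each holding the names of the
    distinct targets that one fragment maps to. `prior` is a hypothetical
    pseudo-count that gives every target a non-zero starting weight.
    Returns a dict of estimated fragment counts per target.
    """
    weights = defaultdict(lambda: prior)
    for candidates in fragments:
        # Posterior over candidates, using only the estimates seen so far.
        total = sum(weights[t] for t in candidates)
        posterior = {t: weights[t] / total for t in candidates}
        # Each fragment contributes one unit of mass, split by posterior.
        for t, p in posterior.items():
            weights[t] += p
    return {t: w - prior for t, w in weights.items()}


if __name__ == "__main__":
    reads = [["A"], ["A"], ["A", "B"]]
    print(stream_assign(reads))
```

    Because each fragment is seen once and then discarded, estimates for early fragments are never revisited; this is the trade-off a streaming design makes against batch methods that iterate over all data.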

  17. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

    PubMed

    Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

    2010-07-02

    An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, helps users interpret INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating the reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The output contains the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
    TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether for a single-read or a paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
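    The core idea of tag-counting, as described above, is to count how many sequenced reads fall within the genomic range of each functional annotation. A minimal sketch of that idea follows. It is written in Python purely for illustration (TASE itself is described as a Java application with a SQL Server backend), and the function name, the tuple layouts, and the use of inclusive coordinates are all assumptions.

```python
from collections import Counter


def count_tags(read_positions, annotations):
    """Count reads that fall inside each annotated genomic range.

    `read_positions` is an iterable of (chromosome, position) pairs.
    `annotations` is a list of (chromosome, start, end, gene) tuples
    with inclusive start and end coordinates (an assumed convention).
    """
    counts = Counter()
    for chrom, pos in read_positions:
        for a_chrom, start, end, gene in annotations:
            if chrom == a_chrom and start <= pos <= end:
                counts[gene] += 1
    return counts


if __name__ == "__main__":
    genes = [("chr1", 100, 200, "geneA"), ("chr1", 300, 400, "geneB")]
    reads = [("chr1", 150), ("chr1", 200), ("chr1", 250),
             ("chr1", 350), ("chr2", 150)]
    print(count_tags(reads, genes))
```

    The nested loop is quadratic in the worst case; for real datasets an interval index or a database range query would be the more plausible choice.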

  18. Accurate Sample Assignment in a Multiplexed, Ultrasensitive, High-Throughput Sequencing Assay for Minimal Residual Disease.

    PubMed

    Bartram, Jack; Mountjoy, Edward; Brooks, Tony; Hancock, Jeremy; Williamson, Helen; Wright, Gary; Moppett, John; Goulden, Nick; Hubank, Mike

    2016-07-01

    High-throughput sequencing (HTS) (next-generation sequencing) of the rearranged Ig and T-cell receptor genes promises to be less expensive and more sensitive than current methods of monitoring minimal residual disease (MRD) in patients with acute lymphoblastic leukemia. However, the adoption of new approaches by clinical laboratories requires careful evaluation of all potential sources of error and the development of strategies to ensure the highest accuracy. Timely and efficient clinical use of HTS platforms will depend on combining multiple samples (multiplexing) in each sequencing run. Here we examine the Ig heavy-chain gene HTS on the Illumina MiSeq platform for MRD. We identify errors associated with multiplexing that could potentially impact the accuracy of MRD analysis. We optimize a strategy that combines high-purity, sequence-optimized oligonucleotides, dual indexing, and an error-aware demultiplexing approach to minimize errors and maximize sensitivity. We present a probability-based, demultiplexing pipeline Error-Aware Demultiplexer that is suitable for all MiSeq strategies and accurately assigns samples to the correct identifier without excessive loss of data. Finally, using controls quantified by digital PCR, we show that HTS-MRD can accurately detect as few as 1 in 10(6) copies of specific leukemic MRD. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
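    To illustrate why dual indexing helps with sample assignment, the sketch below assigns a read to a sample only when both of its index reads match that sample's expected index pair within a mismatch tolerance, and rejects reads that match no sample or more than one. This is a simple distance-based stand-in, not the probability-based Error-Aware Demultiplexer described in the abstract; the function names, the sample sheet layout, and the one-mismatch tolerance are assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(i7, i5, sample_sheet, max_mismatches=1):
    """Return the sample whose index pair matches, or None.

    `sample_sheet` maps sample name -> (expected_i7, expected_i5).
    A read is assigned only if exactly one sample matches on BOTH
    indexes; unmatched or ambiguous reads are discarded (None).
    """
    hits = [
        name
        for name, (exp7, exp5) in sample_sheet.items()
        if hamming(i7, exp7) <= max_mismatches
        and hamming(i5, exp5) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    sheet = {"S1": ("AAAAAAAA", "CCCCCCCC"), "S2": ("GGGGGGGG", "TTTTTTTT")}
    print(assign_sample("AAAAAAAT", "CCCCCCCC", sheet))
    print(assign_sample("AAAAAAAA", "TTTTTTTT", sheet))
```

    The second call shows a read carrying one index from each sample: with single indexing it could be misassigned, whereas requiring both indexes to agree lets it be rejected.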

  19. SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets.

    PubMed

    Mao, Hongliang; Wang, Hao

    2017-03-01

    Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimated abundance of SINEs in these genomes. The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan , implemented in PERL and supported on Linux. Contact: wangh8@fudan.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  20. SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets

    PubMed Central

    Mao, Hongliang

    2017-01-01

    Abstract Motivation: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. Results: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimated abundance of SINEs in these genomes. Availability and Implementation: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in PERL and supported on Linux. Contact: wangh8@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28062442

  1. Enzyme-free detection and quantification of double-stranded nucleic acids.

    PubMed

    Feuillie, Cécile; Merheb, Maxime Mohamad; Gillet, Benjamin; Montagnac, Gilles; Hänni, Catherine; Daniel, Isabelle

    2012-08-01

    We have developed a fully enzyme-free SERRS hybridization assay for specific detection of double-stranded DNA sequences. Although all DNA detection methods ranging from PCR to high-throughput sequencing rely on enzymes, this method is unique for being totally non-enzymatic. The efficiency of enzymatic processes is affected by alterations, modifications, and/or quality of DNA. For instance, a limitation of most DNA polymerases is their inability to process DNA damaged by blocking lesions. As a result, enzymatic amplification and sequencing of degraded DNA often fail. In this study we succeeded in detecting and quantifying, within a mixture, relative amounts of closely related double-stranded DNA sequences from Rupicapra rupicapra (chamois) and Capra hircus (goat). The non-enzymatic SERRS assay presented here is the cornerstone of a promising approach to overcome the failure of DNA polymerase when DNA is too degraded or when the concentration of polymerase inhibitors is too high. It is the first time double-stranded DNA has been detected with a truly non-enzymatic SERRS-based method. This non-enzymatic, inexpensive, rapid assay is therefore a breakthrough in nucleic acid detection.

  2. Learning Behavior Characterization with Multi-Feature, Hierarchical Activity Sequences

    ERIC Educational Resources Information Center

    Ye, Cheng; Segedy, James R.; Kinnebrew, John S.; Biswas, Gautam

    2015-01-01

    This paper discusses Multi-Feature Hierarchical Sequential Pattern Mining, MFH-SPAM, a novel algorithm that efficiently extracts patterns from students' learning activity sequences. This algorithm extends an existing sequential pattern mining algorithm by dynamically selecting the level of specificity for hierarchically-defined features…

  3. DNA Photo Lithography with Cinnamate-based Photo-Bio-Nano-Glue

    NASA Astrophysics Data System (ADS)

    Feng, Lang; Li, Minfeng; Romulus, Joy; Sha, Ruojie; Royer, John; Wu, Kun-Ta; Xu, Qin; Seeman, Nadrian; Weck, Marcus; Chaikin, Paul

    2013-03-01

    We present a technique to make patterned functional surfaces, using a cinnamate photo cross-linker and photolithography. We have designed and modified a complementary set of single DNA strands to incorporate a pair of opposing cinnamate molecules. On exposure to 360nm UV, the cinnamate makes a highly specific covalent bond permanently linking only the complementary strands containing the cinnamates. We have studied this specific and efficient crosslinking with cinnamate-containing DNA in solution and on particles. UV addressability allows us to pattern surfaces functionally. The entire surface is coated with a DNA sequence A incorporating cinnamate. DNA strands A'B with one end containing a complementary cinnamated sequence A' attached to another sequence B, are then hybridized to the surface. UV photolithography is used to bind the A'B strand in a specific pattern. The system is heated and the unbound DNA is washed away. The pattern is then observed by thermo-reversibly hybridizing either fluorescently dyed B' strands complementary to B, or colloids coated with B' strands. Our techniques can be used to reversibly and/or permanently bind, via DNA linkers, an assortment of molecules, proteins and nanostructures. Potential applications range from advanced self-assembly, such as templated self-replication schemes recently reported, to designed physical and chemical patterns, to high-resolution multi-functional DNA surfaces for genetic detection or DNA computing.

  4. Properties of a U1 RNA enhancer-like sequence.

    PubMed Central

    Ciliberto, G; Palla, F; Tebb, G; Mattaj, I W; Philipson, L

    1987-01-01

    The properties of a X. laevis U1B snRNA gene enhancer have been studied by microinjection in Xenopus oocytes. The enhancer-like sequence, defined as a short DNA stretch that is able to activate transcription in an orientation-independent manner, is interchangeable between different U snRNA genes. The enhancer sequence alone does not, however, efficiently activate transcription from an SV40 pol II promoter but regains its activity when combined with the U-gene specific proximal sequence element. DNase I protection experiments show that the X. laevis U1B enhancer can interact specifically with a nuclear factor present in mammalian cells. PMID:3031597

  5. Single reaction, real time RT-PCR detection of all known avian and human metapneumoviruses.

    PubMed

    Lemaitre, E; Allée, C; Vabret, A; Eterradossi, N; Brown, P A

    2018-01-01

    Current molecular methods for the detection of avian and human metapneumovirus (AMPV, HMPV) are specifically targeted towards each virus species or individual subgroups of these. Here a broad range SYBR Green I real time RT-PCR was developed which amplified a highly conserved fragment of sequence in the N open reading frame. This method was sufficiently efficient and specific in detecting all MPVs. Its validation according to the NF U47-600 norm for the four AMPV subgroups estimated low limits of detection between 1000 and 10 copies/μL, similar to detection levels described previously for real time RT-PCRs targeting specific subgroups. RNA viruses present a challenge for the design of durable molecular diagnostic tests due to the rate of change in their genome sequences, which can vary substantially in different areas and over time. The fact that the regions of sequence for primer hybridization in the described method have remained sufficiently conserved since the AMPV and HMPV diverged should give the best chance of continued detection of current subgroups and of potential unknown or future emerging MPV strains. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Crosslinking transcription factors to their recognition sequences with PtII complexes

    NASA Technical Reports Server (NTRS)

    Chu, B. C.; Orgel, L. E.

    1992-01-01

    We have prepared phosphorothioate-containing cyclic oligodeoxynucleotides that fold into 'dumbbells' containing CRE and TRE sequences, the binding sequences for the CREB and JUN proteins, respectively. Six phosphorothioate residues were introduced into each of the recognition sequences. K2PtCl4 crosslinks CRE to CREB and TRE to JUN. The extent of crosslinking is about eight times greater than that observed with standard oligodeoxynucleotides and amounts to 30-50% of the efficiency of non-covalent association as estimated by gel-shift assays. Crosslinking is reversed by incubation with NaCN. The crosslinking reaction is specific--a dumbbell oligonucleotide with six phosphorothioate groups introduced into the Sp1 recognition sequence could not be crosslinked efficiently to CREB or JUN proteins with K2PtCl4. The binding of TRE to CREB is not strong enough for effective detection by gel-shift assays, but the TRE-CREB complex is crosslinked efficiently by K2PtCl4 and can then readily be detected.

  7. AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

    PubMed

    Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

    2016-03-01

    Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies. © 2015 John Wiley & Sons Ltd.
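
    The three-step pipeline named in this abstract can be illustrated with a minimal Python sketch. It is not the AMPLISAS implementation: the function names, the exact-match collapsing used for step (ii), and the simple per-sample frequency threshold used for step (iii) are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode (hypothetical scheme)."""
    by_sample = defaultdict(list)
    for read in reads:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode so only the amplicon sequence remains.
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def cluster_unique(sequences):
    """Step (ii): collapse identical sequences and record their read depth."""
    return Counter(sequences)


def filter_low_frequency(counts, min_fraction):
    """Step (iii): drop variants whose per-sample frequency is below the threshold."""
    total = sum(counts.values())
    return {seq: n for seq, n in counts.items() if n / total >= min_fraction}


def genotype(reads, barcodes, min_fraction=0.2):
    """Run the three steps in order and return retained variants per sample."""
    return {
        sample: filter_low_frequency(cluster_unique(seqs), min_fraction)
        for sample, seqs in demultiplex(reads, barcodes).items()
    }


if __name__ == "__main__":
    reads = ["AAGATTACA"] * 9 + ["AAGATTACC"] + ["CCTTTT"] * 5
    print(genotype(reads, {"s1": "AA", "s2": "CC"}))
```

    In this toy input, the single read carrying a one-base difference is treated as a likely error because it makes up only 10% of its sample's reads.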

  8. Flow cytometric purification of Colletotrichum higginsianum biotrophic hyphae from Arabidopsis leaves for stage-specific transcriptome analysis.

    PubMed

    Takahara, Hiroyuki; Dolf, Andreas; Endl, Elmar; O'Connell, Richard

    2009-08-01

    Generation of stage-specific cDNA libraries is a powerful approach to identify pathogen genes that are differentially expressed during plant infection. Biotrophic pathogens develop specialized infection structures inside living plant cells, but sampling the transcriptome of these structures is problematic due to the low ratio of fungal to plant RNA, and the lack of efficient methods to isolate them from infected plants. Here we established a method, based on fluorescence-activated cell sorting (FACS), to purify the intracellular biotrophic hyphae of Colletotrichum higginsianum from homogenates of infected Arabidopsis leaves. Specific selection of viable hyphae using a fluorescent vital marker provided intact RNA for cDNA library construction. Pilot-scale sequencing showed that the library was enriched with plant-induced and pathogenicity-related fungal genes, including some encoding small, soluble secreted proteins that represent candidate fungal effectors. The high purity of the hyphae (94%) prevented contamination of the library by sequences derived from host cells or other fungal cell types. RT-PCR confirmed that genes identified in the FACS-purified hyphae were also expressed in planta. The method has wide applicability for isolating the infection structures of other plant pathogens, and will facilitate cell-specific transcriptome analysis via deep sequencing and microarray hybridization, as well as proteomic analyses.

  9. Highly Efficient CRISPR/Cas9-Mediated Cloning and Functional Characterization of Gastric Cancer-Derived Epstein-Barr Virus Strains.

    PubMed

    Kanda, Teru; Furuse, Yuki; Oshitani, Hitoshi; Kiyono, Tohru

    2016-05-01

    The Epstein-Barr virus (EBV) is etiologically linked to approximately 10% of gastric cancers, in which viral genomes are maintained as multicopy episomes. EBV-positive gastric cancer cells are incompetent for progeny virus production, making viral DNA cloning extremely difficult. Here we describe a highly efficient strategy for obtaining bacterial artificial chromosome (BAC) clones of EBV episomes by utilizing a CRISPR/Cas9-mediated strand break of the viral genome and subsequent homology-directed repair. EBV strains maintained in two gastric cancer cell lines (SNU719 and YCCEL1) were cloned, and their complete viral genome sequences were determined. Infectious viruses of gastric cancer cell-derived EBVs were reconstituted, and the viruses established stable latent infections in immortalized keratinocytes. While Ras oncoprotein overexpression caused massive vacuolar degeneration and cell death in control keratinocytes, EBV-infected keratinocytes survived in the presence of Ras expression. These results implicate EBV infection in predisposing epithelial cells to malignant transformation by inducing resistance to oncogene-induced cell death. Recent progress in DNA-sequencing technology has accelerated EBV whole-genome sequencing, and the repertoire of sequenced EBV genomes is increasing progressively. Accordingly, the presence of EBV variant strains that may be relevant to EBV-associated diseases has begun to attract interest. Clearly, the determination of additional disease-associated viral genome sequences will facilitate the identification of any disease-specific EBV variants. We found that CRISPR/Cas9-mediated cleavage of EBV episomal DNA enabled the cloning of disease-associated viral strains with unprecedented efficiency. As a proof of concept, two gastric cancer cell-derived EBV strains were cloned, and the infection of epithelial cells with reconstituted viruses provided important clues about the mechanism of EBV-mediated epithelial carcinogenesis. 
This experimental system should contribute to establishing the relationship between viral genome variation and EBV-associated diseases. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  10. Characterization of Hungarian isolates of zucchini yellow mosaic virus (ZYMV, potyvirus) transmitted by seeds of Cucurbita pepo var Styriaca.

    PubMed

    Tóbiás, István; Palkovics, László

    2003-04-01

    Zucchini yellow mosaic virus (ZYMV) has emerged as an important pathogen of cucurbits within the last few years in Hungary. The Hungarian isolates show a high biological variability, have specific nucleotide and amino acid sequences in the N-terminal region of coat protein and form a distinct branch in the phylogenetic tree. The virus is spread very efficiently in the field by several aphid species in a non-persistent manner. It can be transmitted by seed in hull-less seeded oil pumpkin (Cucurbita pepo (L) var Styriaca), although at a very low rate. Three isolates from seed transmission assay experiments were chosen and the nucleotide sequences of their coat proteins (CP) were compared with the available CP sequences of ZYMV. According to the sequence analysis, the Hungarian isolates belong to the Central European branch in the phylogenetic tree and, together with the ZYMV isolates from Austria and Slovenia, share specific amino acids at positions 16, 17, 27 and 37 which are characteristic only of these isolates. The phylogenetic tree suggests the common origin of distantly distributed isolates, which can be attributed to widespread seed transmission.

  11. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

    PubMed Central

    2013-01-01

    Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200

  12. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    PubMed

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
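
    The idea of comparing segments by p-mer frequency counts instead of aligning them can be sketched in a few lines of Python. The function names and the choice of Manhattan distance between count profiles are assumptions made for illustration; the abstract does not state which distance measure the authors use.

```python
from collections import Counter


def pmer_counts(segment, p=3):
    """Count every overlapping substring of length p in the segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(a, b, p=3):
    """Manhattan distance between the p-mer count profiles of two segments.

    Assumed stand-in for an alignment-free approximation of edit distance:
    a single substitution changes at most p p-mers in each segment, so it
    moves this distance by at most 2 * p.
    """
    ca, cb = pmer_counts(a, p), pmer_counts(b, p)
    return sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())


if __name__ == "__main__":
    print(pmer_distance("ACGTACGT", "ACGTACGT"))  # identical segments
    print(pmer_distance("ACGTACGT", "ACGTTCGT"))  # one substitution
```

    Because each profile is built in a single pass over the segment, the cost of the comparison grows linearly with segment length, which is consistent with the linear scaling the abstract reports.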

  13. Identification of Immunogenic Hot Spots within Plum Pox Potyvirus Capsid Protein for Efficient Antigen Presentation

    PubMed Central

    Fernández-Fernández, M. Rosario; Martínez-Torrecuadrada, Jorge L.; Roncal, Fernando; Domínguez, Elvira; García, Juan Antonio

    2002-01-01

    PEPSCAN analysis has been used to characterize the immunogenic regions of the capsid protein (CP) in virions of plum pox potyvirus (PPV). In addition to the well-known highly immunogenic N- and C-terminal domains of CP, regions within the core domain of the protein have also shown high immunogenicity. Moreover, the N terminus of CP is not homogeneously immunogenic, alternatively showing regions frequently recognized by antibodies and others that are not recognized at all. These results have helped us to design efficient antigen presentation vectors based on PPV. As predicted by PEPSCAN analysis, a small displacement of the insertion site in a previously constructed vector, PPV-γ, turned the derived chimeras into efficient immunogens. Vectors expressing foreign peptides at different positions within a highly immunogenic region (amino acids 43 to 52) in the N-terminal domain of CP were the most effective at inducing specific antibody responses against the foreign sequence. PMID:12438590

  14. Incorporating Writing in an Integrated Calculus, Linear Algebra, and Differential Equations Sequence.

    ERIC Educational Resources Information Center

    Kelly, Susan E.; LeDocq, Rebecca Lewin

    2001-01-01

    Describes the specific courses in a sequence along with how the writing has been implemented in each course. Provides ideas for how to efficiently handle the additional paper load so students receive the necessary feedback while keeping the grading time reasonable. (Author/ASK)

  15. New Uses for Sensitivity Analysis: How Different Movement Tasks Effect Limb Model Parameter Sensitivity

    NASA Technical Reports Server (NTRS)

    Winters, J. M.; Stark, L.

    1984-01-01

    Original results for a newly developed eighth-order nonlinear limb antagonistic muscle model of elbow flexion and extension are presented. A wider variety of sensitivity analysis techniques are used and a systematic protocol is established that shows how the different methods can be used efficiently to complement one another for maximum insight into model sensitivity. It is explicitly shown how the sensitivity of output behaviors to model parameters is a function of the controller input sequence, i.e., of the movement task. When the task is changed (for instance, from an input sequence that results in the usual fast movement task to a slower movement that may also involve external loading, etc.) the set of parameters with high sensitivity will in general also change. Such task-specific use of sensitivity analysis techniques identifies the set of parameters most important for a given task, and even suggests task-specific model reduction possibilities.

  16. Cis-acting elements in the promoter region of the human aldolase C gene.

    PubMed

    Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

    1993-08-16

    We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.

  17. Extended phase graphs with anisotropic diffusion

    NASA Astrophysics Data System (ADS)

    Weigel, M.; Schwenk, S.; Kiselev, V. G.; Scheffler, K.; Hennig, J.

    2010-08-01

    The extended phase graph (EPG) calculus gives an elegant pictorial description of magnetization response in multi-pulse MR sequences. The use of the EPG calculus enables a high computational efficiency for the quantitation of echo intensities even for complex sequences with multiple refocusing pulses with arbitrary flip angles. In this work, the EPG concept dealing with RF pulses with arbitrary flip angles and phases is extended to account for anisotropic diffusion in the presence of arbitrary varying gradients. The diffusion effect can be expressed by specific diffusion weightings of individual magnetization pathways. This can be represented as an action of a linear operator on the magnetization state. The algorithm allows easy integration of diffusion anisotropy effects. The formalism is validated on known examples from literature and used to calculate the effective diffusion weighting in multi-echo sequences with arbitrary refocusing flip angles.

  18. The Sequences of 1504 Mutants in the Model Rice Variety Kitaake Facilitate Rapid Functional Genomic Studies

    PubMed Central

    Pham, Nikki T.; Wei, Tong; Schackwitz, Wendy S.; Lipzen, Anna M.; Duong, Phat Q.; Jones, Kyle C.; Ruan, Deling; Bauer, Diane; Peng, Yi; Schmutz, Jeremy

    2017-01-01

    The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations. PMID:28576844
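
    The summary figures in this abstract can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers quoted above; the variable names are illustrative, and the implied total gene count is an estimate derived from the rounded 58% figure, not a value stated by the authors.

```python
# Figures quoted in the abstract.
mutations = 91_513
mutant_lines = 1_504
affected_genes = 32_307
affected_fraction = 0.58  # "58% of all rice genes" (rounded in the source)

# Average number of mutations per sequenced line.
mean_per_line = mutations / mutant_lines

# Total gene count implied by the rounded percentage (an approximation only).
implied_total_genes = affected_genes / affected_fraction

print(f"mean mutations per line: {mean_per_line:.2f}")
print(f"implied total genes: {implied_total_genes:.0f}")
```

    The mean comes out just under 61, which matches the reported average of 61 mutations per line.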

  19. An efficient and comprehensive strategy for genetic diagnostics of polycystic kidney disease.

    PubMed

    Eisenberger, Tobias; Decker, Christian; Hiersche, Milan; Hamann, Ruben C; Decker, Eva; Neuber, Steffen; Frank, Valeska; Bolz, Hanno J; Fehrenbach, Henry; Pape, Lars; Toenshoff, Burkhard; Mache, Christoph; Latta, Kay; Bergmann, Carsten

    2015-01-01

    Renal cysts are clinically and genetically heterogeneous conditions. Autosomal dominant polycystic kidney disease (ADPKD) is the most frequent life-threatening genetic disease and mainly caused by mutations in PKD1. The presence of six PKD1 pseudogenes and tremendous allelic heterogeneity make molecular genetic testing challenging requiring laborious locus-specific amplification. Increasing evidence suggests a major role for PKD1 in early and severe cases of ADPKD and some patients with a recessive form. Furthermore it is becoming obvious that clinical manifestations can be mimicked by mutations in a number of other genes with the necessity for broader genetic testing. We established and validated a sequence capture based NGS testing approach for all genes known for cystic and polycystic kidney disease including PKD1. Thereby, we demonstrate that the applied standard mapping algorithm specifically aligns reads to the PKD1 locus and overcomes the complication of unspecific capture of pseudogenes. Employing careful and experienced assessment of NGS data, the method is shown to be very specific and equally sensitive as established methods. An additional advantage over conventional Sanger sequencing is the detection of copy number variations (CNVs). Sophisticated bioinformatic read simulation increased the high analytical depth of the validation study and further demonstrated the strength of the approach. We further raise some awareness of limitations and pitfalls of common NGS workflows when applied in complex regions like PKD1 demonstrating that quality of NGS needs more than high coverage of the target region. By this, we propose a time- and cost-efficient diagnostic strategy for comprehensive molecular genetic testing of polycystic kidney disease which is highly automatable and will be of particular value when therapeutic options for PKD emerge and genetic testing is needed for larger numbers of patients.

  20. FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery

    PubMed Central

    Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

    2012-01-01

    Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases our tool was invariably able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2–ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data. PMID:22570408

  1. FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery.

    PubMed

    Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

    2012-09-01

    Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases our tool was invariably able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2-ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data.

  2. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel.

    PubMed

    Mitt, Mario; Kals, Mart; Pärn, Kalle; Gabriel, Stacey B; Lander, Eric S; Palotie, Aarno; Ripatti, Samuli; Morris, Andrew P; Metspalu, Andres; Esko, Tõnu; Mägi, Reedik; Palta, Priit

    2017-06-01

    Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF)≥5% and low-frequency variants (0.5≤MAF<5%) across diverse populations, but the imputation of rare variation (MAF<0.5%) is still rather limited. In the current study, we evaluate imputation accuracy achieved with reference panels from diverse populations and with a population-specific high-coverage (30×) whole-genome sequencing (WGS) based reference panel comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies.
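
    The three frequency classes used in this abstract translate directly into threshold checks. The following Python sketch is illustrative only: the function names are hypothetical, and the example allele count is invented to show how a variant would be classified in a panel of 2244 diploid individuals (4488 alleles).

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Frequency of the less common allele at a biallelic site."""
    freq = alt_count / total_alleles
    return min(freq, 1.0 - freq)


def maf_class(maf):
    """Classify a variant using the thresholds quoted in the abstract.

    common: MAF >= 5%; low-frequency: 0.5% <= MAF < 5%; rare: MAF < 0.5%.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    if maf >= 0.05:
        return "common"
    if maf >= 0.005:
        return "low-frequency"
    return "rare"


if __name__ == "__main__":
    # Hypothetical variant seen on 9 of the 4488 alleles in the panel.
    maf = minor_allele_frequency(9, 2 * 2244)
    print(f"MAF = {maf:.4%} -> {maf_class(maf)}")
```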

  3. Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel

    PubMed Central

    Mitt, Mario; Kals, Mart; Pärn, Kalle; Gabriel, Stacey B; Lander, Eric S; Palotie, Aarno; Ripatti, Samuli; Morris, Andrew P; Metspalu, Andres; Esko, Tõnu; Mägi, Reedik; Palta, Priit

    2017-01-01

    Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF) ≥5% and low-frequency variants (0.5≤MAF<5%) across diverse populations, but the imputation of rare variation (MAF<0.5%) is still rather limited. In the current study, we evaluate imputation accuracy achieved with reference panels from diverse populations against that achieved with a population-specific high-coverage (30×) whole-genome sequencing (WGS) based reference panel comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies. PMID:28401899

  4. Sox2 regulatory region 2 sequence works as a DNA nuclear targeting sequence enhancing the efficiency of an exogenous gene expression in ES cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Funabashi, Hisakage; Takatsu, Makoto; Saito, Mikako

    2010-10-01

    Research highlights: SV40-DTS worked as a DTS in ES cells as well as in other cell types. Sox2 regulatory region 2 worked as a DTS in ES cells and was therefore termed SRR2-DTS. SRR2-DTS was suggested to be an ES cell-specific DTS. Abstract: In this report, the effects of two DNA nuclear targeting sequence (DTS) candidates on the gene expression efficiency in ES cells were investigated. Reporter plasmids containing the simian virus 40 (SV40) promoter/enhancer sequence (SV40-DTS), a DTS for various cell types that has not yet been reported for ES cells, and the 81 base pairs of Sox2 regulatory region 2 (SRR2), where two ES cell transcription factors, Oct3/4 and Sox2, bind (SRR2-DTS), were introduced into the cytoplasm of living cells by femtoinjection. The gene expression efficiencies of each plasmid in mouse insulinoma cell line MIN6 cells and mouse ES cells were then evaluated. Plasmids including SV40-DTS and SRR2-DTS exhibited higher gene expression efficiency compared with plasmids without these DTSs, and thus it was concluded that both sequences work as a DTS in ES cells. In addition, it was suggested that SRR2-DTS works as an ES cell-specific DTS. To the best of our knowledge, this is the first report to confirm the function of DTSs in ES cells.

  5. Investigation of timing effects in modified composite quadrupolar echo pulse sequences by mean of average Hamiltonian theory

    NASA Astrophysics Data System (ADS)

    Mananga, Eugene Stephane

    2018-01-01

    The utility of the average Hamiltonian theory and its antecedent the Magnus expansion is presented. We assessed the concept of convergence of the Magnus expansion in quadrupolar spectroscopy of spin-1 via the square of the magnitude of the average Hamiltonian. We investigated this approach for two specific modified composite pulse sequences: COM-Im and COM-IVm. It is demonstrated that the size of the square of the magnitude of the zero-order average Hamiltonian obtained on the appropriate basis is a viable approach to study the convergence of the Magnus expansion. The approach turns out to be efficient in studying pulse sequences in general and can be very useful to investigate coherent averaging in the development of high-resolution NMR techniques in solids. This approach allows the two modified composite pulse sequences COM-Im and COM-IVm to be compared theoretically. We also compare theoretically the current modified composite sequences (COM-Im and COM-IVm) to the recently published modified composite pulse sequences (MCOM-I, MCOM-IV, MCOM-I_d, MCOM-IV_d).

  6. VariantBam: filtering and profiling of next-generational sequencing data using region-specific rules.

    PubMed

    Wala, Jeremiah; Zhang, Cheng-Zhong; Meyerson, Matthew; Beroukhim, Rameen

    2016-07-01

    We developed VariantBam, a C++ read filtering and profiling tool for use with BAM, CRAM and SAM sequencing files. VariantBam provides a flexible framework for extracting sequencing reads or read-pairs that satisfy combinations of rules, defined by any number of genomic intervals or variant sites. We have implemented filters based on alignment data, sequence motifs, regional coverage and base quality. For example, VariantBam achieved a median size reduction ratio of 3.1:1 when applied to 10 lung cancer whole genome BAMs by removing large tags and selecting for only high-quality variant-supporting reads and reads matching a large dictionary of sequence motifs. Thus VariantBam enables efficient storage of sequencing data while preserving the most relevant information for downstream analysis. VariantBam and full documentation are available at github.com/jwalabroad/VariantBam. Contact: rameen@broadinstitute.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
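
    The idea of region-specific rules described in this abstract can be illustrated with a toy filter. This is a sketch only: `Read`, `Rule` and `filter_reads` are hypothetical names invented here, the reads are plain in-memory records rather than BAM entries, and nothing below reflects VariantBam's actual rule syntax or C++ implementation.

```python
from dataclasses import dataclass


@dataclass
class Read:
    chrom: str
    pos: int   # 0-based leftmost aligned position
    mapq: int  # mapping quality
    seq: str   # read bases


@dataclass
class Rule:
    chrom: str
    start: int
    end: int          # half-open interval [start, end)
    min_mapq: int = 0
    motif: str = ""   # empty string means "no motif requirement"

    def accepts(self, read: Read) -> bool:
        in_region = (read.chrom == self.chrom
                     and self.start <= read.pos < self.end)
        return (in_region
                and read.mapq >= self.min_mapq
                and self.motif in read.seq)


def filter_reads(reads, rules):
    """Keep a read if it satisfies at least one rule."""
    return [r for r in reads if any(rule.accepts(r) for rule in rules)]


reads = [
    Read("chr1", 100, 60, "ACGTTAGGG"),
    Read("chr1", 100, 5, "ACGT"),      # fails the quality threshold
    Read("chr2", 50, 60, "TTAGGG"),
]
rules = [
    Rule("chr1", 0, 1000, min_mapq=30),
    Rule("chr2", 0, 100, motif="TTAGGG"),
]
print(len(filter_reads(reads, rules)))  # 2
```

    Each rule pairs a genomic interval with its own thresholds, so different regions can be filtered more or less strictly, which is the property the abstract credits for the reduction in file size.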

  7. Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice.

    PubMed

    Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun

    2015-12-11

    As an important part of synthetic biology, synthetic promoters have gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated that 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with a synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of the GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice.
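
    The scan-and-count step mentioned in this abstract can be sketched as a motif counter over promoter sequences. The helper names and the example motifs below are illustrative assumptions: the code does not query PLACE or CREP, it scans only the given strand, and it is not the authors' pipeline.

```python
import re


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, including overlapping ones."""
    if not motif:
        raise ValueError("motif must not be empty")
    # A zero-width lookahead lets overlapping matches be counted.
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return sum(1 for _ in pattern.finditer(sequence.upper()))


def motif_table(promoters: dict, motifs: list) -> dict:
    """Return {promoter name: {motif: count}} for every promoter."""
    return {
        name: {m: count_motif(seq, m) for m in motifs}
        for name, seq in promoters.items()
    }


promoters = {"promoterA": "ttGATAGATAcc", "promoterB": "ccggccgg"}
print(motif_table(promoters, ["GATA", "CCGG"]))
# {'promoterA': {'GATA': 2, 'CCGG': 0}, 'promoterB': {'GATA': 0, 'CCGG': 2}}
```

    Comparing such count tables between green tissue-specific promoters and other promoters is one simple way to shortlist candidate elements for experimental testing.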

  8. The analysis of novel microRNA mimic sequences in cancer cells reveals lack of specificity in stem-loop RT-qPCR-based microRNA detection.

    PubMed

    Winata, Patrick; Williams, Marissa; McGowan, Eileen; Nassif, Najah; van Zandwijk, Nico; Reid, Glen

    2017-11-17

    MicroRNAs are frequently downregulated in cancer, and restoring expression has tumour suppressive activity in tumour cells. Our recent phase I clinical trial investigated microRNA-based therapy in patients with malignant pleural mesothelioma. Treatment with TargomiRs, microRNA mimics with novel sequence packaged in EGFR antibody-targeted bacterial minicells, revealed clear signs of clinical activity. In order to detect delivery of microRNA mimics to tumour cells in future clinical trials, we tested hydrolysis probe-based assays specific for the sequence of the novel mimics in transfected mesothelioma cell lines using RT-qPCR. The custom assays efficiently and specifically amplified the consensus mimics. However, we found that these assays gave a signal when total RNA from untransfected and control mimic-transfected cells were used as templates. Further investigation revealed that the reverse transcription step using stem-loop primers appeared to introduce substantial non-specific amplification with either total RNA or synthetic RNA templates. This suggests that reverse transcription using stem-loop primers suffers from an intrinsic lack of specificity for the detection of highly similar microRNAs in the same family, especially when analysing total RNA. These results suggest that RT-qPCR is unlikely to be an effective means to detect delivery of microRNA mimic-based drugs to tumour cells in patients.

  9. Investigating the structure preserving encryption of high efficiency video coding (HEVC)

    NASA Astrophysics Data System (ADS)

    Shahid, Zafar; Puech, William

    2013-02-01

    This paper presents a novel method for the real-time protection of the emerging High Efficiency Video Coding (HEVC) standard. Structure-preserving selective encryption is performed in the CABAC entropy coding module of HEVC, which is significantly different from the CABAC entropy coding of H.264/AVC. In CABAC of HEVC, exponential Golomb coding is replaced by truncated Rice (TR) up to a specific value for binarization of transform coefficients. Selective encryption is performed using the AES cipher in cipher feedback mode on a plaintext of binstrings in a context-aware manner. The encrypted bitstream has exactly the same bit-rate and is format compliant. Experimental evaluation and security analysis of the proposed algorithm are performed on several benchmark video sequences containing different combinations of motion, texture and objects.

  10. An Efficient Strategy Combining SSR Markers- and Advanced QTL-seq-driven QTL Mapping Unravels Candidate Genes Regulating Grain Weight in Rice

    PubMed Central

    Daware, Anurag; Das, Sweta; Srivastava, Rishi; Badoni, Saurabh; Singh, Ashok K.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    Development and use of genome-wide informative simple sequence repeat (SSR) markers and novel integrated genomic strategies are vital to drive genomics-assisted breeding applications and for efficient dissection of quantitative trait loci (QTLs) underlying complex traits in rice. The present study developed 6244 genome-wide informative SSR markers exhibiting in silico fragment length polymorphism based on repeat-unit variations among genomic sequences of 11 indica, japonica, aus, and wild rice accessions. These markers were mapped on diverse coding and non-coding sequence components of known cloned/candidate genes annotated from 12 chromosomes and revealed a much higher amplification (97%) and polymorphic potential (88%) along with wider genetic/functional diversity level (16–74% with a mean 53%) especially among accessions belonging to indica cultivar group, suggesting their utility in large-scale genomics-assisted breeding applications in rice. A high-density 3791 SSR markers-anchored genetic linkage map (IR 64 × Sonasal) spanning 2060 cM total map-length with an average inter-marker distance of 0.54 cM was generated. This reference genetic map identified six major genomic regions harboring robust QTLs (31% combined phenotypic variation explained with a 5.7–8.7 LOD) governing grain weight on six rice chromosomes. One strong grain weight major QTL region (OsqGW5.1) was narrowed-down by integrating traditional QTL mapping with high-resolution QTL region-specific integrated SSR and single nucleotide polymorphism markers-based QTL-seq analysis and differential expression profiling. This led us to delineate two natural allelic variants in two known cis-regulatory elements (RAV1AAT and CARGCW8GAT) of glycosyl hydrolase and serine carboxypeptidase genes exhibiting pronounced seed-specific differential regulation in low (Sonasal) and high (IR 64) grain weight mapping parental accessions. 
Our genome-wide SSR marker resource (polymorphic within/between diverse cultivar groups) and integrated genomic strategy can efficiently scan functionally relevant potential molecular tags (markers, candidate genes and alleles) regulating complex agronomic traits (grain weight) and expedite marker-assisted genetic enhancement in rice. PMID:27833617

  11. Engineering RNA phage MS2 virus-like particles for peptide display

    NASA Astrophysics Data System (ADS)

    Jordan, Sheldon Keith

    Phage display is a powerful and versatile technology that enables the selection of novel binding functions from large populations of randomly generated peptide sequences. Random sequences are genetically fused to a viral structural protein to produce complex peptide libraries. From a sufficiently complex library, phage bearing peptides with practically any desired binding activity can be physically isolated by affinity selection, and, since each particle carries in its genome the genetic information for its own replication, the selectants can be amplified by infection of bacteria. For certain applications however, existing phage display platforms have limitations. One such area is in the field of vaccine development, where the goal is to identify relevant epitopes by affinity-selection against an antibody target, and then to utilize them as immunogens to elicit a desired antibody response. Today, affinity selection is usually conducted using display on filamentous phages like M13. This technology provides an efficient means for epitope identification, but, because filamentous phages do not display peptides in the high-density, multivalent arrays the immune system prefers to recognize, they generally make poor immunogens and are typically useless as vaccines. This makes it necessary to confer immunogenicity by conjugating synthetic versions of the peptides to more immunogenic carriers. Unfortunately, when introduced into these new structural environments, the epitopes often fail to elicit relevant antibody responses. Thus, it would be advantageous to combine the epitope selection and immunogen functions into a single platform where the structural constraints present during affinity selection can be preserved during immunization. This dissertation describes efforts to develop a peptide display system based on the virus-like particles (VLPs) of bacteriophage MS2. 
Phage display technologies rely on (1) the identification of a site in a viral structural protein that is present on the surface of the virus particle and can accept foreign sequence insertions without disruption of protein folding and viral particle assembly, and (2) on the encapsidation of nucleic acid sequences encoding both the VLP and the peptide it displays. The experiments described here are aimed at satisfying the first of these two requirements by engineering efficient peptide display at two different sites in MS2 coat protein. First, we evaluated the suitability of the N-terminus of MS2 coat for peptide insertions. It was observed that random N-terminal 10-mer fusions generally disrupted protein folding and VLP assembly, but by bracketing the foreign sequences with certain specific dipeptides, these defects could be suppressed. Next, the suitability of a coat protein surface loop for foreign sequence insertion was tested. Specifically, random sequence peptides were inserted into the N-terminal-most AB-loop of a coat protein single-chain dimer. Again we found that efficient display required the presence of appropriate dipeptides bracketing the peptide insertion. Finally, it was shown that an N-terminal fusion that tended to interfere specifically with capsid assembly could be efficiently incorporated into mosaic particles when co-expressed with wild-type coat protein.

  12. The effectiveness of three regions in mitochondrial genome for aphid DNA barcoding: a case in Lachininae.

    PubMed

    Chen, Rui; Jiang, Li-Yun; Qiao, Ge-Xia

    2012-01-01

    The mitochondrial gene COI has been widely used by taxonomists as a standard DNA barcode sequence for the identification of many animal species. However, the COI region is of limited use for identifying certain species and is not efficiently amplified by PCR in all animal taxa. To evaluate the utility of COI as a DNA barcode and to identify other barcode genes, we chose the aphid subfamily Lachninae (Hemiptera: Aphididae) as the focus of our study. We compared the results obtained using COI with two other mitochondrial genes, COII and Cytb. In addition, we propose a new method to improve the efficiency of species identification using DNA barcoding. Three mitochondrial genes (COI, COII and Cytb) were sequenced and were used in the identification of over 80 species of Lachninae. The COI and COII genes demonstrated a greater PCR amplification efficiency than Cytb. Species identification using COII sequences had a higher frequency of success (96.9% in "best match" and 90.8% in "best close match") and yielded lower intra- and higher interspecific genetic divergence values than the other two markers. The use of "tag barcodes" is a new approach that involves attaching a species-specific tag to the standard DNA barcode. With this method, the "barcoding overlap" can be nearly eliminated. As a result, we were able to increase the identification success rate from 83.9% to 95.2% by using COI and the "best close match" technique. A COII-based identification system should be more effective in identifying lachnine species than COI or Cytb. However, the Cytb gene is an effective marker for the study of aphid population genetics due to its high sequence diversity. Furthermore, the use of "tag barcodes" can improve the accuracy of DNA barcoding identification by reducing or removing the overlap between intra- and inter-specific genetic divergence values.

  13. Regions flanking ori sequences affect the replication efficiency of the mitochondrial genome of ori+ petite mutants from yeast.

    PubMed

    Rayko, E; Goursot, R; Cherif-Zahar, B; Melis, R; Bernardi, G

    1988-03-31

    The mitochondrial genomes of progenies from 26 crosses between 17 cytoplasmic, spontaneous, suppressive, ori+ petite mutants of Saccharomyces cerevisiae have been studied by electrophoresis of restriction fragments. Only parental genomes (or occasionally, genomes derived from them by secondary excisions) were found in the progenies of the almost 500 diploids investigated; no evidence for illegitimate, site-specific mitochondrial recombination was detected. One of the parental genomes was always found to predominate over the other one, although to different extents in different crosses. This predominance appears to be due to a higher replication efficiency, which is correlated with a greater density of ori sequences on the mitochondrial genome (and with a shorter repeat unit size of the latter). Exceptions to the 'repeat-unit-size rule' were found, however, even when the parental mitochondrial genomes carried the same ori sequence. This indicates that noncoding, intergenic sequences outside ori sequences also play a role in modulating replication efficiency. Since in different petites such sequences differ in primary structure, size, and position relative to ori sequences, this modulation is likely to take place through an indirect effect on DNA and nucleoid structure.

  14. snoSeeker: an advanced computational package for screening of guide and orphan snoRNA genes in the human genome.

    PubMed

    Yang, Jian-Hua; Zhang, Xiao-Chen; Huang, Zhan-Peng; Zhou, Hui; Huang, Mian-Bo; Zhang, Shu; Chen, Yue-Qin; Qu, Liang-Hu

    2006-01-01

    Small nucleolar RNAs (snoRNAs) represent an abundant group of non-coding RNAs in eukaryotes. They can be divided into guide and orphan snoRNAs according to the presence or absence of antisense sequence to rRNAs or snRNAs. Current snoRNA-searching programs, which are essentially based on sequence complementarity to rRNAs or snRNAs, exist only for the screening of guide snoRNAs. In this study, we have developed an advanced computational package, snoSeeker, which includes the CDseeker and ACAseeker programs, for the highly efficient and specific screening of both guide and orphan snoRNA genes in mammalian genomes. By using these programs, we have systematically scanned four human-mammal whole-genome alignment (WGA) sequences and identified 54 novel candidates, including 26 orphan candidates, as well as 266 known snoRNA genes. Eighteen novel snoRNAs were further experimentally confirmed, with four snoRNAs exhibiting a tissue-specific or restricted expression pattern. The results of this study provide the most comprehensive listing of two families of snoRNA genes in the human genome to date.

  15. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools.

    PubMed

    Deshmukh, Rupesh K; Sonah, Humira; Bélanger, Richard R

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. 
Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research.

  16. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools

    PubMed Central

    Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. 
Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research. PMID:28066459

  17. Isolation and functional characterization of TIF-IB, a factor that confers promoter specificity to mouse RNA polymerase I.

    PubMed

    Schnapp, A; Clos, J; Hädelt, W; Schreck, R; Cvekl, A; Grummt, I

    1990-03-25

    The murine ribosomal gene promoter contains two cis-acting control elements which operate in concert to promote efficient and accurate transcription initiation by RNA polymerase I. The start site proximal core element which is indispensable for promoter recognition by RNA polymerase I (pol I) encompasses sequences from position -39 to -1. An upstream control element (UCE) which is located between nucleotides -142 and -112 stimulates the efficiency of transcription initiation both in vivo and in vitro. Here we report the isolation and functional characterization of a specific rDNA binding protein, the transcription initiation factor TIF-IB, which specifically interacts with the core region of the mouse ribosomal RNA gene promoter. Highly purified TIF-IB complements transcriptional activity in the presence of two other essential initiation factors TIF-IA and TIF-IC. We demonstrate that the binding efficiency of purified TIF-IB to the core promoter is strongly enhanced by the presence in cis of the UCE. This positive effect of upstream sequences on TIF-IB binding is observed throughout the purification procedure suggesting that the synergistic action of the two distant promoter elements is not mediated by a protein different from TIF-IB. Increasing the distance between both control elements still facilitates stable factor binding but eliminates transcriptional activation. The results demonstrate that TIF-IB binding to the rDNA promoter is an essential early step in the assembly of a functional transcription initiation complex. The subsequent interaction of TIF-IB with other auxiliary transcription initiation factors, however, requires the correct spacing between the UCE and the core promoter element.

  18. Rapid and highly efficient construction of TALE-based transcriptional regulators and nucleases for genome modification.

    PubMed

    Li, Lixin; Piatek, Marek J; Atef, Ahmed; Piatek, Agnieszka; Wibowo, Anjar; Fang, Xiaoyun; Sabir, J S M; Zhu, Jian-Kang; Mahfouz, Magdy M

    2012-03-01

    Transcription activator-like effectors (TALEs) can be used as DNA-targeting modules by engineering their repeat domains to dictate user-selected sequence specificity. TALEs have been shown to function as site-specific transcriptional activators in a variety of cell types and organisms. TALE nucleases (TALENs), generated by fusing the FokI cleavage domain to TALE, have been used to create genomic double-strand breaks. The identity of the TALE repeat variable di-residues, their number, and their order dictate the DNA sequence specificity. Because TALE repeats are nearly identical, their assembly by cloning or even by synthesis is challenging and time consuming. Here, we report the development and use of a rapid and straightforward approach for the construction of designer TALE (dTALE) activators and nucleases with user-selected DNA target specificity. Using our plasmid set of 100 repeat modules, researchers can assemble repeat domains for any 14-nucleotide target sequence in one sequential restriction-ligation cloning step and in only 24 h. We generated several custom dTALEs and dTALENs with new target sequence specificities and validated their function by transient expression in tobacco leaves and in vitro DNA cleavage assays, respectively. Moreover, we developed a web tool, called idTALE, to facilitate the design of dTALENs and the identification of their genomic targets and potential off-targets in the genomes of several model species. Our dTALE repeat assembly approach along with the web tool idTALE will expedite genome-engineering applications in a variety of cell types and organisms including plants.
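
    The statement that the identity, number and order of the repeat variable di-residues (RVDs) dictate the target sequence can be illustrated with the commonly cited one-RVD-per-base code (NI for A, HD for C, NN for G, NG for T). The sketch below is an assumption-laden illustration: the function name is hypothetical, NN is known to tolerate A as well as G, and the abstract does not say which RVDs the authors' 100-module plasmid set uses.

```python
# Commonly cited RVD-to-base preferences (an assumption here, not taken
# from the abstract): NI -> A, HD -> C, NN -> G (also tolerates A), NG -> T.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str) -> list:
    """Return the ordered list of RVDs for a DNA target, one per base."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


# A hypothetical 14-nucleotide target, matching the length in the abstract.
repeats = rvds_for_target("ACGTACGTACGTAC")
print(len(repeats))   # 14
print(repeats[:4])    # ['NI', 'HD', 'NN', 'NG']
```

    Because the mapping is positional, changing the target only changes which repeat module goes in which slot, which is why a fixed module library can cover any target of the supported length.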

  19. Genome-wide identification of allele-specific expression (ASE) in response to Marek's disease virus infection using next generation sequencing.

    PubMed

    Maceachern, Sean; Muir, William M; Crosby, Seth; Cheng, Hans H

    2011-06-03

    Marek's disease (MD), a T cell lymphoma induced by the highly oncogenic α-herpesvirus Marek's disease virus (MDV), is the main chronic infectious disease concern threatening the poultry industry. Enhancing genetic resistance to MD in commercial poultry is an attractive method to augment MD vaccines, which are currently the control method of choice. In order to optimally implement this control strategy through marker-assisted selection (MAS) and to gain biological information, it is necessary to identify specific genes that influence MD incidence. A genome-wide screen for allele-specific expression (ASE) in response to MDV infection was conducted. The highly inbred ADOL chicken lines 6 (MD resistant) and 7 (MD susceptible) were inter-mated in reciprocal crosses and half of the progeny challenged with MDV. Splenic RNA pools at a single time point after infection for each treatment group were generated, sequenced using a next generation sequencer, then analyzed for allele-specific expression (ASE). To validate and extend the results, Illumina GoldenGate assays for selected cSNPs were developed and used on all RNA samples from all 6 time points following MDV challenge. RNA sequencing resulted in 11-13+ million mappable reads per treatment group, 1.7+ Gb total sequence, and 22,655 high-confidence cSNPs. Analysis of these cSNPs revealed that 5360 cSNPs in 3773 genes exhibited statistically significant allelic imbalance. Of the 1536 GoldenGate assays, 1465 were successfully scored with all but 19 exhibiting evidence for allelic imbalance. ASE is an efficient method to identify potentially all or most of the genes influencing this complex trait. The identified cSNPs can be further evaluated in resource populations to determine their allelic direction and size of effect on genetic resistance to MD as well as being directly implemented in genomic selection programs. 
The described method, although demonstrated in inbred chicken lines, is applicable to all traits in any diploid species, and should prove to be a simple method to identify the majority of genes controlling any complex trait.
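
    Allelic imbalance at a cSNP is often assessed by asking whether the read counts for the two alleles depart from the 50:50 split expected without ASE. The abstract does not state which test the authors used, so the exact two-sided binomial test below is an assumption, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials, computed in log
    space so that large read counts do not overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def allelic_imbalance_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed allele counts."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("at least one read is required")
    observed = binom_pmf(ref_reads, n, p)
    # Small relative tolerance so symmetric outcomes count as ties.
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1))
                if q <= observed * (1 + 1e-7))
    return min(1.0, total)


print(allelic_imbalance_p(5, 5) > 0.99)     # True: balanced counts
print(allelic_imbalance_p(0, 10) < 0.01)    # True: all reads from one allele
```

    With thousands of cSNPs tested, as in this study, the resulting p-values would normally also be corrected for multiple testing before calling a site imbalanced.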

  20. Typing of canine parvovirus isolates using mini-sequencing based single nucleotide polymorphism analysis.

    PubMed

    Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A

    2012-05-01

    The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by few nucleotide substitutions. PCR based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can efficiently detect nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all the four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. The CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched completely with those of sequencing. Typing could be achieved with less than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections.

  1. High-Throughput Next-Generation Sequencing of Polioviruses

    PubMed Central

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929

  2. Simian virus 40 major late promoter: an upstream DNA sequence required for efficient in vitro transcription.

    PubMed Central

    Brady, J; Radonovich, M; Thoren, M; Das, G; Salzman, N P

    1984-01-01

    We have previously identified an 11-base DNA sequence, 5'-G-G-T-A-C-C-T-A-A-C-C-3' (simian virus 40 [SV40] map position 294 to 304), which is important in the control of SV40 late RNA expression in vitro and in vivo (Brady et al., Cell 31:625-633, 1982). We report here the identification of another domain of the SV40 late promoter. A series of mutants with deletions extending from SV40 map position 0 to 300 was prepared by nuclease BAL 31 treatment. The cloned templates were then analyzed for efficiency and accuracy of late SV40 RNA expression in the Manley in vitro transcription system. Our studies showed that, in addition to the promoter domain near map position 300, there are essential DNA sequences between nucleotide positions 74 and 95 that are required for efficient expression of late SV40 RNA. Included in this SV40 DNA sequence were two of the six GGGCGG SV40 repeat sequences and an 11-nucleotide segment which showed strong homology with the upstream sequences required for the efficient in vitro and in vivo expression of the histone H2A gene. This upstream promoter sequence supported transcription with the same efficiency even when it was moved 72 nucleotides closer to the major late cap site. In vitro promoter competition analysis demonstrated that the upstream promoter sequence, independent of the 294 to 304 promoter element, is capable of binding polymerase-transcription factors required for SV40 late gene transcription. Finally, we show that DNA sequences which control the specificity of RNA initiation at nucleotide 325 lie downstream of map position 294. PMID:6321950
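
    Short elements such as the GGGCGG repeat can be located in a sequence with a plain string scan of both strands. The sketch below is only an illustration of that idea: the helper names are hypothetical, and the toy sequence is invented for the example and is not SV40 sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str) -> list[int]:
    """0-based start positions of motif in seq; overlapping hits are kept."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def find_motif_both_strands(seq: str, motif: str) -> dict[str, list[int]]:
    """Scan the given strand ('+') and, via the reverse-complemented motif, the other strand ('-')."""
    rc = reverse_complement(motif)
    # A palindromic motif would otherwise be reported twice.
    minus_hits = [] if rc == motif.upper() else find_motif(seq, rc)
    return {"+": find_motif(seq, motif), "-": minus_hits}


# Invented toy sequence, not taken from the SV40 genome.
toy = "AAGGGCGGTTCCGCCCAA"
print(find_motif_both_strands(toy, "GGGCGG"))
```

    Positions on the '-' strand are reported in the coordinates of the given strand, which keeps the two hit lists directly comparable.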

  3. Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation.

    PubMed

    Ralph, Duncan K; Matsen, Frederick A

    2016-01-01

    VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
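
    Annotating each base with a hidden state, as described above, is typically done by Viterbi decoding of the HMM. The following is a minimal, generic Viterbi sketch and not the partis implementation: the two states ("V" for a templated segment, "N" for a non-templated insertion) and all probabilities are invented toy values, whereas the model in the paper uses per-allele, per-position parameters.

```python
from math import log


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state path for an observation sequence (log-space)."""
    # Log scores avoid numerical underflow on long reads.
    score = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        score.append({})
        back.append({})
        for s in states:
            # Best predecessor state for reaching s at position t.
            prev = max(states, key=lambda r: score[t - 1][r] + log(trans_p[r][s]))
            score[t][s] = score[t - 1][prev] + log(trans_p[prev][s]) + log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy parameters (assumptions for illustration only).
states = ["V", "N"]
start_p = {"V": 0.9, "N": 0.1}
trans_p = {"V": {"V": 0.8, "N": 0.2}, "N": {"V": 0.1, "N": 0.9}}
emit_p = {
    "V": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "N": {"A": 0.1, "C": 0.3, "G": 0.5, "T": 0.1},
}
print(viterbi("AAGG", states, start_p, trans_p, emit_p))  # ['V', 'V', 'N', 'N']
```

    The categorical transition and emission tables here are plain dictionaries; a parameter-rich model of the kind the abstract describes would replace them with tables estimated per allele and per position.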

  4. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication

    PubMed Central

    2014-01-01

    Background: Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. Results: We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. Conclusions: The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel. PMID:24669946

  5. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication.

    PubMed

    Kai, Wataru; Nomura, Kazuharu; Fujiwara, Atushi; Nakamura, Yoji; Yasuike, Motoshige; Ojima, Nobuhiko; Masaoka, Tetsuji; Ozaki, Akiyuki; Kazeto, Yukinori; Gen, Koichiro; Nagao, Jiro; Tanaka, Hideki; Kobayashi, Takanori; Ototake, Mitsuru

    2014-03-26

    Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel.

  6. 'Cold shock' increases the frequency of homology directed repair gene editing in induced pluripotent stem cells.

    PubMed

    Guo, Q; Mintier, G; Ma-Edmonds, M; Storton, D; Wang, X; Xiao, X; Kienzle, B; Zhao, D; Feder, John N

    2018-02-01

    Using CRISPR/Cas9 delivered as an RNA modality in conjunction with a lipid specifically formulated for large RNA molecules, we demonstrate that homology directed repair (HDR) rates between 20-40% can be achieved in induced pluripotent stem cells (iPSC). Furthermore, low HDR rates (between 1-20%) can be enhanced two- to ten-fold in both iPSCs and HEK293 cells by 'cold shocking' cells at 32 °C for 24-48 hours following transfection. This method can also increase the proportion of loci that have undergone complete sequence conversion across the donor sequence, or 'perfect HDR', as opposed to partial sequence conversion where nucleotides more distal to the CRISPR cut site are less efficiently incorporated ('partial HDR'). We demonstrate that the structure of the single-stranded DNA oligo donor can influence the fidelity of HDR, with oligos symmetric with respect to the CRISPR cleavage site and complementary to the target strand being more efficient at directing 'perfect HDR' compared to asymmetric non-target strand complementary oligos. Our protocol represents an efficient method for making CRISPR-mediated, specific DNA sequence changes within the genome that will facilitate the rapid generation of genetic models of human disease in iPSCs as well as other genome engineered cell lines.

  7. Draft Genome Sequence of Sporolactobacillus inulinus Strain CASD, an Efficient d-Lactic Acid-Producing Bacterium with High-Concentration Lactate Tolerance Capability

    PubMed Central

    Yu, Bo; Su, Fei; Wang, Limin; Xu, Ke; Zhao, Bo; Xu, Ping

    2011-01-01

    Sporolactobacillus inulinus CASD is an efficient d-lactic acid producer with high optical purity. Here we report for the first time the draft genome sequence of S. inulinus (2,930,096 bp). The large number of annotated two-component system genes makes it possible to explore the mechanism of extraordinary lactate tolerance of S. inulinus CASD. PMID:21952540

  8. Draft genome sequence of Sporolactobacillus inulinus strain CASD, an efficient D-lactic acid-producing bacterium with high-concentration lactate tolerance capability.

    PubMed

    Yu, Bo; Su, Fei; Wang, Limin; Xu, Ke; Zhao, Bo; Xu, Ping

    2011-10-01

    Sporolactobacillus inulinus CASD is an efficient D-lactic acid producer with high optical purity. Here we report for the first time the draft genome sequence of S. inulinus (2,930,096 bp). The large number of annotated two-component system genes makes it possible to explore the mechanism of extraordinary lactate tolerance of S. inulinus CASD.

  9. Genome sequence of the thermophilic strain Bacillus coagulans 2-6, an efficient producer of high-optical-purity L-lactic acid.

    PubMed

    Su, Fei; Yu, Bo; Sun, Jibin; Ou, Hong-Yu; Zhao, Bo; Wang, Limin; Qin, Jiayang; Tang, Hongzhi; Tao, Fei; Jarek, Michael; Scharfe, Maren; Ma, Cuiqing; Ma, Yanhe; Xu, Ping

    2011-09-01

    Bacillus coagulans 2-6 is an efficient producer of lactic acid. B. coagulans 2-6 has the smallest genome among the members of the genus Bacillus known to date. The frameshift mutation at the start of the d-lactate dehydrogenase sequence might be responsible for the production of high-optical-purity l-lactic acid.

  10. UVnovo: A De Novo Sequencing Algorithm Using Single Series of Fragment Ions via Chromophore Tagging and 351 nm Ultraviolet Photodissociation Mass Spectrometry

    PubMed Central

    Robotham, Scott A.; Horton, Andrew P.; Cannon, Joe R.; Cotham, Victoria C.; Marcotte, Edward M.; Brodbelt, Jennifer S.

    2016-01-01

    De novo peptide sequencing by mass spectrometry represents an important strategy for characterizing novel peptides and proteins, in which a peptide’s amino acid sequence is inferred directly from the precursor peptide mass and tandem mass spectrum (MS/MS or MS3) fragment ions, without comparison to a reference proteome. This method is ideal for organisms or samples lacking a complete or well-annotated reference sequence set. One of the major barriers to de novo spectral interpretation arises from confusion of N- and C-terminal ion series due to the symmetry between b and y ion pairs created by collisional activation methods (or c, z ions for electron-based activation methods). This is known as the ‘antisymmetric path problem’ and leads to inverted amino acid subsequences within a de novo reconstruction. Here, we combine several key strategies for de novo peptide sequencing into a single high-throughput pipeline: high efficiency carbamylation blocks lysine side chains, and subsequent tryptic digestion and N-terminal peptide derivatization with the ultraviolet chromophore AMCA yields peptides susceptible to 351 nm ultraviolet photodissociation (UVPD). UVPD-MS/MS of the AMCA-modified peptides then predominantly produces y ions in the MS/MS spectra, specifically addressing the antisymmetric path problem. Finally, the program UVnovo applies a random forest algorithm to automatically learn from and then interpret UVPD mass spectra, passing results to a hidden Markov model for de novo sequence prediction and scoring. We show this combined strategy provides high performance de novo peptide sequencing, enabling the de novo sequencing of thousands of peptides from an E. coli lysate at high confidence. PMID:26938041

  11. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency.

    PubMed

    Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J; Burnett, John C; Zhou, Jiehua

    2016-09-22

    The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct "biased sequences" and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the "biased sequences" was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy.

  12. A regulatory sequence from the retinoid X receptor γ gene directs expression to horizontal cells and photoreceptors in the embryonic chicken retina.

    PubMed

    Blixt, Maria K E; Hallböök, Finn

    2016-01-01

    Combining techniques of episomal vector gene-specific Cre expression and genomic integration using the piggyBac transposon system enables studies of gene expression-specific cell lineage tracing in the chicken retina. In this work, we aimed to target the retinal horizontal cell progenitors. A 208 bp gene regulatory sequence from the chicken retinoid X receptor γ gene (RXRγ208) was used to drive Cre expression. RXRγ is expressed in progenitors and photoreceptors during development. The vector was combined with a piggyBac "donor" vector containing a floxed STOP sequence followed by enhanced green fluorescent protein (EGFP), as well as a piggyBac helper vector for efficient integration into the host cell genome. The vectors were introduced into the embryonic chicken retina with in ovo electroporation. Tissue electroporation allows targeting at specific developmental time points and in specific structures. Cells that drove Cre expression from the regulatory RXRγ208 sequence excised the floxed STOP-sequence and expressed GFP. The approach generated a stable lineage with robust expression of GFP in retinal cells that have activated transcription from the RXRγ208 sequence. Furthermore, GFP was expressed in cells that express horizontal or photoreceptor markers when electroporation was performed between developmental stages 22 and 28. Electroporation of a stage 12 optic cup gave multiple cell types in accordance with RXRγ gene expression in the early retina. In this study, we describe an easy, cost-effective, and time-efficient method for testing regulatory sequences in general. More specifically, our results open up the possibility for further studies of the RXRγ-gene regulatory network governing the formation of photoreceptor and horizontal cells. In addition, the method presents approaches to target the expression of effector genes, such as regulators of cell fate or cell cycle progression, to these cells and their progenitors.

  13. Design of Protein Multi-specificity Using an Independent Sequence Search Reduces the Barrier to Low Energy Sequences

    PubMed Central

    Sevy, Alexander M.; Jacobs, Tim M.; Crowe, James E.; Meiler, Jens

    2015-01-01

    Computational protein design has found great success in engineering proteins for thermodynamic stability, binding specificity, or enzymatic activity in a ‘single state’ design (SSD) paradigm. Multi-specificity design (MSD), on the other hand, involves considering the stability of multiple protein states simultaneously. We have developed a novel MSD algorithm, which we refer to as REstrained CONvergence in multi-specificity design (RECON). The algorithm allows each state to adopt its own sequence throughout the design process rather than enforcing a single sequence on all states. Convergence to a single sequence is encouraged through an incrementally increasing convergence restraint for corresponding positions. Compared to MSD algorithms that enforce (constrain) an identical sequence on all states, the energy landscape is simplified, which accelerates the search drastically. As a result, RECON can readily be used in simulations with a flexible protein backbone. We have benchmarked RECON on two design tasks. First, we designed antibodies derived from a common germline gene against their diverse targets to assess recovery of the germline, polyspecific sequence. Second, we designed “promiscuous”, polyspecific proteins against all binding partners and measured recovery of the native sequence. We show that RECON is able to efficiently recover native-like, biologically relevant sequences in this diverse set of protein complexes. PMID:26147100

  14. Stable isotope, site-specific mass tagging for protein identification

    DOEpatents

    Chen, Xian

    2006-10-24

    Proteolytic peptide mass mapping as measured by mass spectrometry provides an important method for the identification of proteins, which are usually identified by matching the measured and calculated m/z values of the proteolytic peptides. A unique identification is, however, heavily dependent upon the mass accuracy and sequence coverage of the fragment ions generated by peptide ionization. The present invention describes a method for increasing the specificity, accuracy and efficiency of the assignments of particular proteolytic peptides and consequent protein identification, by the incorporation of selected amino acid residue(s) enriched with stable isotope(s) into the protein sequence without the need for ultrahigh instrumental accuracy. Selected amino acid(s) are labeled with ¹³C/¹⁵N/²H and incorporated into proteins in a sequence-specific manner during cell culturing. Each of these labeled amino acids carries a defined mass change encoded in its monoisotopic distribution pattern. Through their characteristic patterns, the peptides with mass tag(s) can then be readily distinguished from other peptides in mass spectra. The present method of identifying unique proteins can also be extended to protein complexes and will significantly increase data search specificity, efficiency and accuracy for protein identifications.

  15. Whole exome sequencing is an efficient, sensitive and specific method for determining the genetic cause of short-rib thoracic dystrophies.

    PubMed

    McInerney-Leo, A M; Harris, J E; Leo, P J; Marshall, M S; Gardiner, B; Kinning, E; Leong, H Y; McKenzie, F; Ong, W P; Vodopiutz, J; Wicking, C; Brown, M A; Zankl, A; Duncan, E L

    2015-12-01

    Short-rib thoracic dystrophies (SRTDs) are congenital disorders due to defects in primary cilium function. SRTDs are recessively inherited with mutations identified in 14 genes to date (comprising 398 exons). Conventional mutation detection (usually by iterative Sanger sequencing) is inefficient and expensive, and often not undertaken. Whole exome massive parallel sequencing has been used to identify new genes for SRTD (WDR34, WDR60 and IFT172); however, the clinical utility of whole exome sequencing (WES) has not been established. WES was performed in 11 individuals with SRTDs. Compound heterozygous or homozygous mutations were identified in six confirmed SRTD genes in 10 individuals (IFT172, DYNC2H1, TTC21B, WDR60, WDR34 and NEK1), giving overall sensitivity of 90.9%. WES data from 993 unaffected individuals sequenced using similar technology showed two individuals with rare (minor allele frequency <0.005) compound heterozygous variants of unknown significance in SRTD genes (specificity >99%). Costs for consumables, laboratory processing and bioinformatic analysis were
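
    The sensitivity and specificity quoted in this record follow from simple proportions. The sketch below reproduces that arithmetic under one labeled assumption: the two unaffected individuals carrying rare compound heterozygous variants are counted as false positives. The function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of affected individuals in whom causal mutations were found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of unaffected individuals with no candidate finding."""
    return true_neg / (true_neg + false_pos)


# 10 of 11 affected individuals had mutations identified.
sens = sensitivity(true_pos=10, false_neg=1)

# Assumption: the 2 of 993 unaffected individuals with candidate variants
# are treated as false positives, leaving 991 true negatives.
spec = specificity(true_neg=991, false_pos=2)

print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

    Under that assumption the computed specificity is consistent with the ">99%" stated in the abstract.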

  16. The influence of viral coding sequences on pestivirus IRES activity reveals further parallels with translation initiation in prokaryotes.

    PubMed Central

    Fletcher, Simon P; Ali, Iraj K; Kaminski, Ann; Digard, Paul; Jackson, Richard J

    2002-01-01

    Classical swine fever virus (CSFV) is a member of the pestivirus family, which shares many features in common with hepatitis C virus (HCV). It is shown here that CSFV has an exceptionally efficient cis-acting internal ribosome entry segment (IRES), which, like that of HCV, is strongly influenced by the sequences immediately downstream of the initiation codon, and is optimal with viral coding sequences in this position. Constructs that retained 17 or more codons of viral coding sequence exhibited full IRES activity, but with only 12 codons, activity was approximately 66% of maximum in vitro (though close to maximum in transfected BHK cells), whereas with just 3 codons or fewer, the activity was only approximately 15% of maximum. The minimal coding region elements required for high activity were exchanged between HCV and CSFV. Although maximum activity was observed in each case with the homologous combination of coding region and 5' UTR, the heterologous combinations were sufficiently active to rule out a highly specific functional interplay between the 5' UTR and coding sequences. On the other hand, inversion of the coding sequences resulted in low IRES activity, particularly with the HCV coding sequences. RNA structure probing showed that the efficiency of internal initiation of these chimeric constructs correlated most closely with the degree of single-strandedness of the region around and immediately downstream of the initiation codon. The low activity IRESs could not be rescued by addition of supplementary eIF4A (the initiation factor with ATP-dependent RNA helicase activity). The extreme sensitivity to secondary structure around the initiation codon is likely to be due to the fact that the eIF4F complex (which has eIF4A as one of its subunits) is not required for and does not participate in initiation on these IRESs. PMID:12515388

  17. New technology and resources for cryptococcal research

    PubMed Central

    Zhang, Nannan; Park, Yoon-Dong; Williamson, Peter R.

    2014-01-01

    Rapid advances in molecular biology and genome sequencing have enabled the generation of new technology and resources for cryptococcal research. RNAi-mediated specific gene knock down has become routine and more efficient by utilizing modified shRNA plasmids and convergent promoter RNAi constructs. This system was recently applied in a high-throughput screen to identify genes involved in host-pathogen interactions. Gene deletion efficiencies have also been improved by increasing rates of homologous recombination through a number of approaches, including a combination of double-joint PCR with split-marker transformation, the use of dominant selectable markers and the introduction of Cre-loxP systems into Cryptococcus. Moreover, visualization of cryptococcal proteins has become more facile using fusions with codon-optimized fluorescent tags, such as green or red fluorescent proteins or mCherry. Using recent genome-wide analytical tools, new transcription factors and regulatory proteins have been identified in novel virulence-related signaling pathways by employing microarray analysis, RNA-sequencing and proteomic analysis. PMID:25460849

  18. High-efficiency transformation of Pichia stipitis based on its URA3 gene and a homologous autonomous replication sequence, ARS2.

    PubMed Central

    Yang, V W; Marks, J A; Davis, B P; Jeffries, T W

    1994-01-01

    This paper describes the first high-efficiency transformation system for the xylose-fermenting yeast Pichia stipitis. The system includes integrating and autonomously replicating plasmids based on the gene for orotidine-5'-phosphate decarboxylase (URA3) and an autonomous replicating sequence (ARS) element (ARS2) isolated from P. stipitis CBS 6054. Ura- auxotrophs were obtained by selecting for resistance to 5-fluoroorotic acid and were identified as ura3 mutants by transformation with P. stipitis URA3. P. stipitis URA3 was cloned by its homology to Saccharomyces cerevisiae URA3, with which it is 69% identical in the coding region. P. stipitis ARS elements were cloned functionally through plasmid rescue. These sequences confer autonomous replication when cloned into vectors bearing the P. stipitis URA3 gene. P. stipitis ARS2 has features similar to those of the consensus ARS of S. cerevisiae and other ARS elements. Circular plasmids bearing the P. stipitis URA3 gene with various amounts of flanking sequences produced 600 to 8,600 Ura+ transformants per microgram of DNA by electroporation. Most transformants obtained with circular vectors arose without integration of vector sequences. One vector yielded 5,200 to 12,500 Ura+ transformants per microgram of DNA after it was linearized at various restriction enzyme sites within the P. stipitis URA3 insert. Transformants arising from linearized vectors produced stable integrants, and integration events were site specific for the genomic ura3 in 20% of the transformants examined. Plasmids bearing the P. stipitis URA3 gene and ARS2 element produced more than 30,000 transformants per microgram of plasmid DNA. Autonomously replicating plasmids were stable for at least 50 generations in selection medium and were present at an average of 10 copies per nucleus. PMID:7811063

  19. Leveraging genome-wide datasets to quantify the functional role of the anti-Shine-Dalgarno sequence in regulating translation efficiency.

    PubMed

    Hockenberry, Adam J; Pah, Adam R; Jewett, Michael C; Amaral, Luís A N

    2017-01-01

    Studies dating back to the 1970s established that sequence complementarity between the anti-Shine-Dalgarno (aSD) sequence on prokaryotic ribosomes and the 5' untranslated region of mRNAs helps to facilitate translation initiation. The optimal location of aSD sequence binding relative to the start codon, the full extents of the aSD sequence and the functional form of the relationship between aSD sequence complementarity and translation efficiency have not been fully resolved. Here, we investigate these relationships by leveraging the sequence diversity of endogenous genes and recently available genome-wide estimates of translation efficiency. We show that, after accounting for predicted mRNA structure, aSD sequence complementarity increases the translation of endogenous mRNAs by roughly 50%. Further, we observe that this relationship is nonlinear, with translation efficiency maximized for mRNAs with intermediate levels of aSD sequence complementarity. The mechanistic insights that we observe are highly robust: we find nearly identical results in multiple datasets spanning three distantly related bacteria. Further, we verify our main conclusions by re-analysing a controlled experimental dataset.

  20. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential

    PubMed Central

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela AE; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential. PMID:24256717

  1. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential.

    PubMed

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela A E; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential.

  2. Genetic and Chemical Profiling of Gymnema sylvestre Accessions from Central India: Its Implication for Quality Control and Therapeutic Potential of Plant

    PubMed Central

    Verma, Ashutosh Kumar; Dhawan, Sunita Singh; Singh, Seema; Bharati, Kumar Avinash; Jyotsana

    2016-01-01

    Background: Gymnema sylvestre, a vulnerable plant species, is mentioned in the Indian Pharmacopeia as an antidiabetic drug. Objective: Study of genetic and chemical diversity and its implications in accessions of G. sylvestre. Materials and Methods: Fourteen accessions of G. sylvestre were collected from Central India, and their genetic and chemical diversity was assessed using ISSR (inter simple sequence repeat) and HPLC (high performance liquid chromatography) fingerprinting methods. Results: Among the 40 ISSR primers screened, 15 were found to be polymorphic and collectively produced nine unique accession-specific bands. The maximum and minimum numbers of amplicons were noted for ISSR-15 and ISSR-11, respectively. ISSR-11 and ISSR-13 revealed 100% polymorphism. HPLC chromatograms showed that the accessions possess secondary metabolites of mid-polarity with considerable variability. Unknown peaks with retention times of 2.63, 3.41, 23.83, 24.50, and 44.67 were found to be of universal type. Comparative hierarchical clustering analysis based on the aforesaid fingerprints indicates that both techniques have equal potential to discriminate accessions according to the percentage of gymnemic acid in their leaf tissue. The second approach was found to be more efficient for separating accessions according to their agro-climatic/collection site. Conclusion: Highly polymorphic ISSRs could be utilized as molecular probes for further selection of high gymnemic acid yielding accessions. The observed accession-specific bands may be used as descriptors for the protection of plant accessions and converted into sequence tagged site markers. The five universal-type peaks identified could be helpful in identifying various G. sylvestre-based herbal preparations. SUMMARY: Nine accession-specific unique bands. Five marker peaks for G. sylvestre. Suitability of genetic and chemical fingerprinting. Abbreviations used: HPLC: High Performance Liquid Chromatography, ISSR: Inter Simple Sequence Repeats, CTAB: Cetyl Trimethylammonium Bromide, DNTP: Deoxynucleotide Triphosphates. PMID:27761067

  3. Genetic and Chemical Profiling of Gymnema sylvestre Accessions from Central India: Its Implication for Quality Control and Therapeutic Potential of Plant.

    PubMed

    Verma, Ashutosh Kumar; Dhawan, Sunita Singh; Singh, Seema; Bharati, Kumar Avinash; Jyotsana

    2016-07-01

    Gymnema sylvestre, a vulnerable plant species, is mentioned in the Indian Pharmacopeia as an antidiabetic drug. This study examines genetic and chemical diversity and its implications in accessions of G. sylvestre. Fourteen accessions of G. sylvestre were collected from Central India, and their genetic and chemical diversity was assessed using ISSR (inter simple sequence repeat) and HPLC (high performance liquid chromatography) fingerprinting methods. Among the 40 ISSR primers screened, 15 were found to be polymorphic and collectively produced nine unique accession-specific bands. The maximum and minimum numbers of amplicons were noted for ISSR-15 and ISSR-11, respectively. ISSR-11 and ISSR-13 revealed 100% polymorphism. HPLC chromatograms showed that the accessions possess secondary metabolites of mid-polarity with considerable variability. Unknown peaks with retention times of 2.63, 3.41, 23.83, 24.50, and 44.67 were found to be of universal type. Comparative hierarchical clustering analysis based on the aforesaid fingerprints indicates that both techniques have equal potential to discriminate accessions according to the percentage of gymnemic acid in their leaf tissue. The second approach was found to be more efficient for separating accessions according to their agro-climatic/collection site. Highly polymorphic ISSRs could be utilized as molecular probes for further selection of high gymnemic acid yielding accessions. The observed accession-specific bands may be used as descriptors for the protection of plant accessions and converted into sequence tagged site markers. The five universal-type peaks identified could be helpful in identifying various G. sylvestre-based herbal preparations. Summary: Nine accession-specific unique bands. Five marker peaks for G. sylvestre. Suitability of genetic and chemical fingerprinting. Abbreviations used: HPLC: High Performance Liquid Chromatography, ISSR: Inter Simple Sequence Repeats, CTAB: Cetyl Trimethylammonium Bromide, DNTP: Deoxynucleotide Triphosphates.

  4. An efficient strategy for producing a stable, replaceable, highly efficient transgene expression system in silkworm, Bombyx mori

    PubMed Central

    Long, Dingpei; Lu, Weijian; Zhang, Yuli; Bi, Lihui; Xiang, Zhonghuai; Zhao, Aichun

    2015-01-01

    We developed an efficient strategy that combines a method for the post-integration elimination of all transposon sequences, a site-specific recombination system, and an optimized fibroin H-chain expression system to produce a stable, replaceable, highly efficient transgene expression system in the silkworm (Bombyx mori) that overcomes the disadvantages of random insertion and post-integration instability of transposons. Here, we generated four different transgenic silkworm strains, and one of the transgenic strains, designated TS1-RgG2, with up to 16% (w/w) of the target protein in the cocoons, was selected. The subsequent elimination of all the transposon sequences from TS1-RgG2 was completed by the heat-shock-induced expression of the transposase in vivo. The resulting transgenic silkworm strain was designated TS3-g2 and contained only the attP-flanked optimized fibroin H-chain expression cassette in its genome. A phiC31/att-system-based recombinase-mediated cassette exchange (RMCE) method could be used to integrate other genes of interest into the same genome locus between the attP sites in TS3-g2. Controlling for position effects with phiC31-mediated RMCE will also allow the optimization of exogenous protein expression and fine gene function analyses in the silkworm. The strategy developed here is also applicable to other lepidopteran insects, to improve the ecological safety of transgenic strains in biocontrol programs. PMID:25739894

  5. A murine host cell factor required for nicking of the dimer bridge of MVM recognizes two CG nucleotides displaced by 10 basepairs.

    PubMed

    Liu, Q; Astell, C R

    1996-10-01

    During replication of the minute virus of mice (MVM) genome, a dimer replicative form (RF) intermediate is resolved into two monomer RF molecules in such a way as to retain a unique sequence within the left hand hairpin terminus of the viral genome. Although the proposed mechanism for resolution of the dimer RF remains uncertain, it likely involves site-specific nicking of the dimer bridge. The RF contains two double-stranded copies of the viral genome joined by the extended 3' hairpin. Minor sequence asymmetries within the 3' hairpin allow the two halves of the dimer bridge to be distinguished. The A half contains the sequence [sequence: see text], whereas the B half contains the sequence [sequence: see text]. Using an in vitro assay, we show that only the B half of the MVM dimer bridge is nicked site-specifically when incubated with crude NS-1 protein (expressed in insect cells) and mouse LA9 cellular extract. When highly purified NS-1, the major nonstructural protein of MVM, is used in this nicking reaction, there is an absolute requirement for the LA9 cellular extract, suggesting a cellular factor (or factors) is (are) required. A series of mutations were created in the putative host factor binding region (HFBR) on the B half of the MVM dimer bridge adjacent to the NS-1 binding site. Nicking assays of these B half mutants showed that two CG motifs displaced by 10 nucleotides are important for nicking. Gel mobility shift assays demonstrated that a host factor(s) can bind to the HFBR of the B half of the dimer bridge and efficient binding depends on the presence of both CG motifs. Competitor DNA containing the wild-type HFBR sequence is able to specifically inhibit nicking of the B half, indicating that the host factor(s) bound to the HFBR is(are) essential for site-specific nicking to occur.

  6. An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

    PubMed

    Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

    2018-01-01

    Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionary robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific evolutionary robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.

  7. Comprehensive sequence-flux mapping of a levoglucosan utilization pathway in E. coli

    DOE PAGES

    Klesmith, Justin R.; Bacik, John-Paul; Michalczyk, Ryszard; ...

    2015-09-14

    Synthetic metabolic pathways often suffer from low specific productivity, and new methods that quickly assess pathway functionality for many thousands of variants are urgently needed. Here we present an approach that enables the rapid and parallel determination of sequence effects on flux for complete gene-encoding sequences. We show that this method can be used to determine the effects of over 8000 single point mutants of a pyrolysis oil catabolic pathway implanted in Escherichia coli. Experimental sequence-function data sets predicted whether fitness-enhancing mutations to the enzyme levoglucosan kinase resulted from enhanced catalytic efficiency or enzyme stability. A structure of one design incorporating 38 mutations elucidated the structural basis of high fitness mutations. One design incorporating 15 beneficial mutations supported a 15-fold improvement in growth rate and greater than 24-fold improvement in enzyme activity relative to the starting pathway. Lastly, this technique can be extended to improve a wide variety of designed pathways.

  8. Uncultivated Microbial Eukaryotic Diversity: A Method to Link ssu rRNA Gene Sequences with Morphology

    PubMed Central

    Hirst, Marissa B.; Kita, Kelley N.; Dawson, Scott C.

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA “phylotypes” from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. 
When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages, identified in diverse environments. PMID:22174774

  9. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection

    PubMed Central

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike

    2018-01-01

    ABSTRACT Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection. PMID:29564396

  10. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    PubMed

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection.

  11. Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services

    PubMed Central

    Madduri, Ravi K.; Sulakhe, Dinanath; Lacinski, Lukasz; Liu, Bo; Rodriguez, Alex; Chard, Kyle; Dave, Utpal J.; Foster, Ian T.

    2014-01-01

    We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads. PMID:25342933

  12. Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services.

    PubMed

    Madduri, Ravi K; Sulakhe, Dinanath; Lacinski, Lukasz; Liu, Bo; Rodriguez, Alex; Chard, Kyle; Dave, Utpal J; Foster, Ian T

    2014-09-10

    We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klesmith, Justin R.; Bacik, John-Paul; Michalczyk, Ryszard

    Synthetic metabolic pathways often suffer from low specific productivity, and new methods that quickly assess pathway functionality for many thousands of variants are urgently needed. Here we present an approach that enables the rapid and parallel determination of sequence effects on flux for complete gene-encoding sequences. We show that this method can be used to determine the effects of over 8000 single point mutants of a pyrolysis oil catabolic pathway implanted in Escherichia coli. Experimental sequence-function data sets predicted whether fitness-enhancing mutations to the enzyme levoglucosan kinase resulted from enhanced catalytic efficiency or enzyme stability. A structure of one design incorporating 38 mutations elucidated the structural basis of high fitness mutations. One design incorporating 15 beneficial mutations supported a 15-fold improvement in growth rate and greater than 24-fold improvement in enzyme activity relative to the starting pathway. Lastly, this technique can be extended to improve a wide variety of designed pathways.

  14. Extended phase graphs with anisotropic diffusion.

    PubMed

    Weigel, M; Schwenk, S; Kiselev, V G; Scheffler, K; Hennig, J

    2010-08-01

    The extended phase graph (EPG) calculus gives an elegant pictorial description of magnetization response in multi-pulse MR sequences. The use of the EPG calculus enables a high computational efficiency for the quantitation of echo intensities even for complex sequences with multiple refocusing pulses with arbitrary flip angles. In this work, the EPG concept dealing with RF pulses with arbitrary flip angles and phases is extended to account for anisotropic diffusion in the presence of arbitrary varying gradients. The diffusion effect can be expressed by specific diffusion weightings of individual magnetization pathways. This can be represented as an action of a linear operator on the magnetization state. The algorithm allows easy integration of diffusion anisotropy effects. The formalism is validated on known examples from literature and used to calculate the effective diffusion weighting in multi-echo sequences with arbitrary refocusing flip angles.
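
    The EPG bookkeeping that the abstract builds on can be illustrated with a minimal sketch. It is not the authors' algorithm: relaxation and the diffusion operator are deliberately omitted, and the function names (rf_operator, gradient_shift, spin_echo_amplitude) are hypothetical. It uses the commonly quoted RF mixing matrix acting on the state triple (F_k, conj(F_-k), Z_k) and a unit gradient shift, and computes the first echo of an excitation-refocusing pair with an arbitrary refocusing flip angle.

```python
import cmath
import math


def rf_operator(alpha, phi):
    """3x3 RF mixing matrix for flip angle alpha and phase phi (radians).

    Acts on the column (F_k, conj(F_-k), Z_k) of each dephasing order k.
    """
    c2 = math.cos(alpha / 2) ** 2
    s2 = math.sin(alpha / 2) ** 2
    s = math.sin(alpha)
    e = cmath.exp(1j * phi)
    ec = e.conjugate()
    return [
        [c2, e * e * s2, -1j * e * s],
        [ec * ec * s2, c2, 1j * ec * s],
        [-0.5j * ec * s, 0.5j * e * s, math.cos(alpha)],
    ]


def apply_rf(states, matrix):
    """Mix the three state rows with the RF matrix, order by order."""
    n = len(states[0])
    return [
        [sum(matrix[r][c] * states[c][k] for c in range(3)) for k in range(n)]
        for r in range(3)
    ]


def gradient_shift(states):
    """One unit of gradient dephasing: F_k -> F_(k+1); Z is unchanged."""
    fp, fm, z = states
    new_fm = fm[1:] + [0j]                       # negative orders move toward k = 0
    new_fp = [new_fm[0].conjugate()] + fp[:-1]   # k = 0 is shared by both rows
    return [new_fp, new_fm, list(z)]


def spin_echo_amplitude(refocus_deg, n_states=4):
    """|F_0| at the echo of a 90 degree pulse followed by one refocusing pulse.

    Relaxation and diffusion are ignored (assumption of this sketch).
    """
    states = [[0j] * n_states, [0j] * n_states, [1 + 0j] + [0j] * (n_states - 1)]
    states = apply_rf(states, rf_operator(math.pi / 2, math.pi / 2))
    states = gradient_shift(states)
    states = apply_rf(states, rf_operator(math.radians(refocus_deg), 0.0))
    states = gradient_shift(states)
    return abs(states[0][0])
```

    Under these assumptions the echo amplitude reduces to sin^2(alpha/2) of the refocusing flip angle. A diffusion extension of the kind the abstract describes would add a further linear operator that attenuates each state between pulses.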

  15. Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

    PubMed Central

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  16. Polycaprolactone electrospun mesh conjugated with an MSC affinity peptide for MSC homing in vivo.

    PubMed

    Shao, Zhenxing; Zhang, Xin; Pi, Yanbin; Wang, Xiaokun; Jia, Zhuqing; Zhu, Jingxian; Dai, Linghui; Chen, Wenqing; Yin, Ling; Chen, Haifeng; Zhou, Chunyan; Ao, Yingfang

    2012-04-01

    Mesenchymal stem cell (MSC) is a promising cell source candidate in tissue engineering (TE) and regenerative medicine. However, the inability to target MSCs in tissues of interest with high efficiency and engraftment has become a significant barrier for MSC-based therapies. The mobilization and transfer of MSCs to defective/damaged sites in tissues or organs in vivo with high efficacy and efficiency has been a major concern. In the present study, we identified a peptide sequence (E7) with seven amino acids through phage display technology, which has a high specific affinity to bone marrow-derived MSCs. Subsequent analysis suggested that the peptide could efficiently interact specifically with MSCs without any species specificity. Thereafter, E7 was covalently conjugated onto polycaprolactone (PCL) electrospun meshes to construct an "MSC-homing device" for the recruitment of MSCs both in vitro and in vivo. The E7-conjugated PCL electrospun meshes were implanted into a cartilage defect site of rat knee joints, combined with a microfracture procedure to mobilize the endogenous MSCs. After 7 d of implantation, immunofluorescence staining showed that the cells grown into the E7-conjugated PCL electrospun meshes yielded a high positive rate for specific MSC surface markers (CD44, CD90, and CD105) compared with those in arginine-glycine-aspartic acid (RGD)-conjugated PCL electrospun meshes (63.67% vs. 3.03%; 59.37% vs. 2.98%; and 61.45% vs. 3.82%, respectively). Furthermore, the percentage of CD68 positive cells in the E7-conjugated PCL electrospun meshes was much lower than that in the RGD-conjugated PCL electrospun meshes (5.57% vs. 53.43%). This result indicates that E7-conjugated PCL electrospun meshes absorb much less inflammatory cells in vivo than RGD-conjugated PCL electrospun meshes. The results of the present study suggest that the identified E7 peptide sequence has a high specific affinity to MSCs. 
Covalently conjugating this peptide on the synthetic PCL mesh significantly enhanced the MSC recruitment of PCL in vivo. This method provides a wide range of potential applications in TE. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Guotian; Jain, Rashmi; Chern, Mawsheng

    The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. In conclusion, this work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.

  18. Deep Sequencing of Random Mutant Libraries Reveals the Active Site of the Narrow Specificity CphA Metallo-β-Lactamase is Fragile to Mutations.

    PubMed

    Sun, Zhizeng; Mehta, Shrenik C; Adamski, Carolyn J; Gibbs, Richard A; Palzkill, Timothy

    2016-09-12

    CphA is a Zn(2+)-dependent metallo-β-lactamase that efficiently hydrolyzes only carbapenem antibiotics. To understand the sequence requirements for CphA function, single codon random mutant libraries were constructed for residues in and near the active site and mutants were selected for E. coli growth on increasing concentrations of imipenem, a carbapenem antibiotic. At high concentrations of imipenem that select for phenotypically wild-type mutants, the active-site residues exhibit stringent sequence requirements in that nearly all residues in positions that contact zinc, the substrate, or the catalytic water do not tolerate amino acid substitutions. In addition, at high imipenem concentrations a number of residues that do not directly contact zinc or substrate are also essential and do not tolerate substitutions. Biochemical analysis confirmed that amino acid substitutions at essential positions decreased the stability or catalytic activity of the CphA enzyme. Therefore, the CphA active site is fragile to substitutions, suggesting active-site residues are optimized for imipenem hydrolysis. These results also suggest that resistance to inhibitors targeted to the CphA active site would be slow to develop because of the strong sequence constraints on function.

  19. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    USDA-ARS's Scientific Manuscript database

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  20. Identification of Sequence Specificity of 5-Methylcytosine Oxidation by Tet1 Protein with High-Throughput Sequencing.

    PubMed

    Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi

    2016-03-02

    Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10^4-member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Minimal doses of a sequence-optimized transgene mediate high-level and long-term EPO expression in vivo: challenging CpG-free gene design.

    PubMed

    Kosovac, D; Wild, J; Ludwig, C; Meissner, S; Bauer, A P; Wagner, R

    2011-02-01

    Advanced gene delivery techniques can be combined with rational gene design to further improve the efficiency of plasmid DNA (pDNA)-mediated transgene expression in vivo. Herein, we analyzed the influence of intragenic sequence modifications on transgene expression in vitro and in vivo using murine erythropoietin (mEPO) as a transgene model. A single electro-gene transfer of an RNA- and codon-optimized mEPOopt gene into skeletal muscle resulted in a 3- to 4-fold increase of mEPO production sustained for >1 year and triggered a significant increase in hematocrit and hemoglobin without causing adverse effects. mEPO expression and hematologic levels were significantly lower when using comparable amounts of the wild type (mEPOwt) gene and only marginal effects were induced by mEPOΔCpG lacking intragenic CpG dinucleotides, even at high pDNA amounts. Corresponding with these observations, in vitro analysis of transfected cells revealed a 2- to 3-fold increased (mEPOopt) and 50% decreased (mEPOΔCpG) erythropoietin expression compared with mEPOwt, respectively. RNA analyses demonstrated that the specific design of the transgene sequence influenced expression levels by modulating transcriptional activity and nuclear plus cytoplasmic RNA amounts rather than translation. In sum, whereas CpG depletion negatively interferes with efficient expression in postmitotic tissues, mEPOopt doses <0.5 μg were sufficient to trigger optimal long-term hematologic effects encouraging the use of sequence-optimized transgenes to further reduce effective pDNA amounts.

  2. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets

    PubMed Central

    2010-01-01

    Background An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating the reads, is therefore of significant value. Findings We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
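
    The tag-counting step described above (counting how many sequenced reads fall within the genomic ranges of functional annotations) can be sketched as follows. This is a minimal illustration only, not TASE's implementation, which the abstract says is written in Java against a SQL Server backend. The function name count_tags, the gene names, and the coordinates are hypothetical, and the sketch assumes inclusive, non-overlapping ranges on a single chromosome.

```python
from bisect import bisect_right
from collections import Counter


def count_tags(annotations, read_positions):
    """Count reads whose start position falls inside each annotated range.

    annotations: list of (gene_name, start, end) with inclusive,
                 non-overlapping coordinates (an assumption of this sketch).
    read_positions: iterable of integer read start coordinates.
    """
    ordered = sorted(annotations, key=lambda a: a[1])
    starts = [a[1] for a in ordered]
    counts = Counter({name: 0 for name, _, _ in ordered})
    for pos in read_positions:
        # Rightmost annotation whose start is <= pos, if any.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= ordered[i][2]:
            counts[ordered[i][0]] += 1
    return dict(counts)


# Hypothetical annotations and read positions, for illustration only.
annotations = [("geneA", 100, 199), ("geneB", 300, 449)]
reads = [50, 100, 150, 199, 200, 310, 449, 450]
tag_counts = count_tags(annotations, reads)
print(tag_counts)
```

    Sorting the ranges once and using binary search keeps each lookup logarithmic in the number of annotations; a real pipeline would also need to handle multiple chromosomes, strands, and overlapping features.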

  3. High-resolution melting (HRM) assay for the detection of recurrent BRCA1/BRCA2 germline mutations in Tunisian breast/ovarian cancer families.

    PubMed

    Riahi, Aouatef; Kharrat, Maher; Lariani, Imen; Chaabouni-Bouhamed, Habiba

    2014-12-01

    Germline deleterious mutations in the BRCA1/BRCA2 genes are associated with an increased risk for the development of breast and ovarian cancer. Given the large size of these genes, the detection of such mutations represents a considerable technical challenge. Therefore, the development of cost-effective and rapid methods to identify these mutations became a necessity. High resolution melting analysis (HRM) is a rapid and efficient technique extensively employed as a high-throughput mutation scanning method. The purpose of our study was to assess the specificity and sensitivity of HRM for scanning the BRCA1 and BRCA2 genes. As a first step, we estimated the ability of HRM to detect mutations in a set of 21 heterozygous samples harboring 8 different known BRCA1/BRCA2 variations, all of which had previously been investigated by direct sequencing; we then performed a blinded analysis by HRM in a set of 68 further sporadic samples of unknown genotype. All tested heterozygous BRCA1/BRCA2 variants were easily identified. In addition, the HRM assay revealed a further alteration that we had not initially searched for (one unclassified variant). Furthermore, sequencing confirmed all the HRM-detected mutations in the set of unknown samples, including homozygous changes, indicating that in this cohort, with the optimized assays, the mutation detection sensitivity and specificity were 100%. HRM is a simple, rapid and efficient scanning method for known and unknown BRCA1/BRCA2 germline mutations. Consequently, the method will allow for the economical screening of recurrent mutations in the Tunisian population.

  4. Heterogeneous catalysis on the phage surface: Display of active human enteropeptidase.

    PubMed

    Gasparian, Marine E; Bobik, Tatyana V; Kim, Yana V; Ponomarenko, Natalia A; Dolgikh, Dmitry A; Gabibov, Alexander G; Kirpichnikov, Mikhail P

    2013-11-01

    Enteropeptidase (EC 3.4.21.9) plays a key role in mammalian digestion as the enzyme that physiologically activates trypsinogen by highly specific cleavage of the trypsinogen activation peptide following the recognition sequence D4K. The high specificity of enteropeptidase makes it a powerful tool in modern biotechnology. Here we describe the application of phage display technology to express active human enteropeptidase catalytic subunits (L-HEP) on M13 filamentous bacteriophage. The L-HEP/C122S gene was cloned in the g3p-based phagemid vector pHEN2m upstream of the sequence encoding the phage g3p protein and downstream of the signal peptide-encoding sequence. Heterogeneous catalysis of the synthetic peptide substrate (GDDDDK-β-naphthylamide) cleavage by phage-bound L-HEP was shown to have kinetic parameters similar to those of soluble enzyme, with the respective Km values of 19 μM and 20 μM and kcat of 115 and 92 s^-1. Fusion proteins containing a D4K cleavage site were cleaved with phage-bound L-HEP/C122S as well as by soluble L-HEP/C122S, and proteolysis was inhibited by soybean trypsin inhibitor. Rapid large-scale phage production, one-step purification of phage-bound L-HEP, and easy removal of enzyme activity from reaction samples by PEG precipitation make our approach suitable for the efficient removal of various tag sequences fused to the target proteins. The functional phage display technology developed in this study can be instrumental in constructing libraries of mutants to analyze the effect of structural changes on the activity and specificity of the enzyme or generate its desired variants for biotechnological applications. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
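
    One way to compare the two sets of kinetic parameters quoted above is the catalytic efficiency kcat/Km. The sketch below is illustrative arithmetic only; it assumes that "respective" lists the phage-bound enzyme first (Km 19 μM, kcat 115 per second) and the soluble enzyme second (Km 20 μM, kcat 92 per second), which the abstract implies but does not spell out. The function name is hypothetical.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/Km in units of 1/(micromolar * second)."""
    return kcat_per_s / km_micromolar


# Assumed pairing of the reported values (phage-bound first, soluble second).
phage_bound = catalytic_efficiency(kcat_per_s=115, km_micromolar=19)
soluble = catalytic_efficiency(kcat_per_s=92, km_micromolar=20)

print(f"phage-bound kcat/Km: {phage_bound:.2f} per uM per s")
print(f"soluble     kcat/Km: {soluble:.2f} per uM per s")
print(f"ratio: {phage_bound / soluble:.2f}")
```

    Under that assumed pairing the two efficiencies are of the same order of magnitude, which is in line with the abstract's statement that the kinetic parameters are similar.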

  5. Determination of a Screening Metric for High Diversity DNA Libraries.

    PubMed

    Guido, Nicholas J; Handerson, Steven; Joseph, Elaine M; Leake, Devin; Kung, Li A

    2016-01-01

    The fields of antibody engineering, enzyme optimization and pathway construction rely increasingly on screening complex variant DNA libraries. These highly diverse libraries allow researchers to sample a maximized sequence space; and therefore, more rapidly identify proteins with significantly improved activity. The current state of the art in synthetic biology allows for libraries with billions of variants, pushing the limits of researchers' ability to qualify libraries for screening by measuring the traditional quality metrics of fidelity and diversity of variants. Instead, when screening variant libraries, researchers typically use a generic, and often insufficient, oversampling rate based on a common rule-of-thumb. We have developed methods to calculate a library-specific oversampling metric, based on fidelity, diversity, and representation of variants, which informs researchers, prior to screening the library, of the amount of oversampling required to ensure that the desired fraction of variant molecules will be sampled. To derive this oversampling metric, we developed a novel alignment tool to efficiently measure frequency counts of individual nucleotide variant positions using next-generation sequencing data. Next, we apply a method based on the "coupon collector" probability theory to construct a curve of upper bound estimates of the sampling size required for any desired variant coverage. The calculated oversampling metric will guide researchers to maximize their efficiency in using highly variant libraries.
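
    The "coupon collector" idea mentioned above can be illustrated with the simplest textbook case. The sketch below assumes every variant is equally represented and every clone is correct, which is not the library-specific metric of the study (that metric accounts for fidelity, diversity, and uneven representation, and yields upper-bound estimates). The function name expected_draws and the example library size are hypothetical.

```python
import math


def expected_draws(n_variants, coverage):
    """Expected number of random draws needed to observe a given fraction
    of n_variants distinct, equally likely variants (classic coupon
    collector expectation, truncated at the target coverage)."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    target = min(n_variants, math.ceil(coverage * n_variants))
    # After i distinct variants have been seen, the chance that the next
    # draw is new is (n - i) / n, so the expected wait is n / (n - i).
    return sum(n_variants / (n_variants - i) for i in range(target))


# Hypothetical library of 1,000 variants, for illustration only.
n = 1000
for coverage in (0.5, 0.9, 1.0):
    draws = expected_draws(n, coverage)
    print(f"coverage {coverage:.0%}: ~{draws:.0f} draws, "
          f"oversampling ~{draws / n:.1f}x")
```

    Even in this idealized case the required oversampling rises steeply as the target coverage approaches 100%, which is why a single generic rule-of-thumb rate can be insufficient.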

  6. mCAL: A New Approach for Versatile Multiplex Action of Cas9 Using One sgRNA and Loci Flanked by a Programmed Target Sequence.

    PubMed

    Finnigan, Gregory C; Thorner, Jeremy

    2016-07-07

    Genome editing exploiting CRISPR/Cas9 has been adopted widely in academia and in the biotechnology industry to manipulate DNA sequences in diverse organisms. Molecular engineering of Cas9 itself and its guide RNA, and the strategies for using them, have increased efficiency, optimized specificity, reduced inappropriate off-target effects, and introduced modifications for performing other functions (transcriptional regulation, high-resolution imaging, protein recruitment, and high-throughput screening). Moreover, Cas9 has the ability to multiplex, i.e., to act at different genomic targets within the same nucleus. Currently, however, introducing concurrent changes at multiple loci involves: (i) identification of appropriate genomic sites, especially the availability of suitable PAM sequences; (ii) the design, construction, and expression of multiple sgRNAs directed against those sites; (iii) potential difficulties in altering essential genes; and (iv) lingering concerns about "off-target" effects. We have devised a new approach that circumvents these drawbacks, as we demonstrate here using the yeast Saccharomyces cerevisiae. First, any gene(s) of interest are flanked upstream and downstream with a single unique target sequence that does not normally exist in the genome. Thereafter, expression of one sgRNA and cotransformation with appropriate PCR fragments permits concomitant Cas9-mediated alteration of multiple genes (both essential and nonessential). The system we developed also allows for maintenance of the integrated, inducible Cas9-expression cassette or its simultaneous scarless excision. Our scheme, dubbed mCAL for "Multiplexing of Cas9 at Artificial Loci," can be applied to any organism in which the CRISPR/Cas9 methodology is currently being utilized.
In principle, it can be applied to install synthetic sequences into the genome, to generate genomic libraries, and to program strains or cell lines so that they can be conveniently (and repeatedly) manipulated at multiple loci with extremely high efficiency. Copyright © 2016 Finnigan and Thorner.

  7. [Multiplex real-time PCR method for rapid detection of Marburg virus and Ebola virus].

    PubMed

    Yang, Yu; Bai, Lin; Hu, Kong-Xin; Yang, Zhi-Hong; Hu, Jian-Ping; Wang, Jing

    2012-08-01

    Marburg virus and Ebola virus cause acute infections with high case fatality rates. A rapid, sensitive method was established to detect Marburg virus and Ebola virus by multiplex real-time fluorescence quantitative PCR. Primers and TaqMan probes were designed from highly conserved sequences of Marburg virus and Ebola virus identified through whole-genome sequence alignment, and the TaqMan probes were labeled with FAM and Texas Red. The sensitivity of the multiplex real-time quantitative PCR assay was optimized by evaluating different concentrations of primers and probes. The resulting real-time PCR method has a sensitivity of 30.5 copies/μl for the Marburg virus positive plasmid and 28.6 copies/μl for the Ebola virus positive plasmid. Japanese encephalitis virus, yellow fever virus, and dengue virus were used to examine the specificity. The multiplex real-time PCR assay provides a sensitive, reliable, and efficient method to detect Marburg virus and Ebola virus simultaneously.

  8. Depletion of Unwanted Nucleic Acid Templates by Selective Cleavage: LNAzymes, Catalytically Active Oligonucleotides Containing Locked Nucleic Acids, Open a New Window for Detecting Rare Microbial Community Members

    PubMed Central

    Dolinšek, Jan; Dorninger, Christiane; Lagkouvardos, Ilias; Wagner, Michael

    2013-01-01

    Many studies of molecular microbial ecology rely on the characterization of microbial communities by PCR amplification, cloning, sequencing, and phylogenetic analysis of genes encoding rRNAs or functional marker enzymes. However, if the established clone libraries are dominated by one or a few sequence types, the cloned diversity is difficult to analyze by random clone sequencing. Here we present a novel approach to deplete unwanted sequence types from complex nucleic acid mixtures prior to cloning and downstream analyses. It employs catalytically active oligonucleotides containing locked nucleic acids (LNAzymes) for the specific cleavage of selected RNA targets. When combined with in vitro transcription and reverse transcriptase PCR, this LNAzyme-based technique can be used with DNA or RNA extracts from microbial communities. The simultaneous application of more than one specific LNAzyme allows the concurrent depletion of different sequence types from the same nucleic acid preparation. This new method was evaluated with defined mixtures of cloned 16S rRNA genes and then used to identify accompanying bacteria in an enrichment culture dominated by the nitrite oxidizer “Candidatus Nitrospira defluvii.” In silico analysis revealed that the majority of publicly deposited rRNA-targeted oligonucleotide probes may be used as specific LNAzymes with no or only minor sequence modifications. This efficient and cost-effective approach will greatly facilitate tasks such as the identification of microbial symbionts in nucleic acid preparations dominated by plastid or mitochondrial rRNA genes from eukaryotic hosts, the detection of contaminants in microbial cultures, and the analysis of rare organisms in microbial communities of highly uneven composition. PMID:23263968

  9. Increasing on-target cleavage efficiency for CRISPR/Cas9-induced large fragment deletion in Myxococcus xanthus.

    PubMed

    Yang, Ying-Jie; Wang, Ye; Li, Zhi-Feng; Gong, Ya; Zhang, Peng; Hu, Wen-Chao; Sheng, Duo-Hong; Li, Yue-Zhong

    2017-08-16

    The CRISPR/Cas9 system is a powerful tool for genome editing, in which the sgRNA binds and guides the Cas9 protein for sequence-specific cleavage. The protocol is employable in different organisms, but is often limited by cell damage due to the endonuclease activity of the introduced Cas9 and by potential off-target DNA cleavage from incorrect guidance by the 20 nt spacer. In this study, after resolving some critical limitations, we have established an efficient CRISPR/Cas9 system for the deletion of large genome fragments related to the biosynthesis of secondary metabolites in Myxococcus xanthus cells. We revealed that the high expression of a codon-optimized cas9 gene in M. xanthus was cytotoxic, and developed a temporally high expression strategy to reduce the cell damage from high expression of Cas9. We optimized the deletion protocol by using the tRNA-sgRNA-tRNA chimeric structure to ensure the correct sgRNA sequence. We found that, in addition to the position-dependent nucleotide preference, the free energy of a 20 nt spacer was a key factor for the deletion efficiency. By using the developed protocol, we achieved the CRISPR/Cas9-induced deletion of large biosynthetic gene clusters for secondary metabolites in M. xanthus DK1622 and its epothilone-producing mutant. The findings and proposals described in this paper are suggested to be applicable to other organisms, for example, other Gram-negative bacteria with high GC content.

  10. Chromosomal context and replication properties of ARS plasmids in Schizosaccharomyces pombe.

    PubMed

    Pratihar, Aditya S; Tripathi, Vishnu P; Yadav, Mukesh P; Dubey, Dharani D

    2015-12-01

    Short, specific DNA sequences called Autonomously Replicating Sequence (ARS) elements function as plasmid as well as chromosomal replication origins in yeasts. As compared to ARSs, different chromosomal origins vary greatly in their efficiency and timing of replication, probably due to their wider chromosomal context. The two Schizosaccharomyces pombe ARS elements, ars727 and ars2004, represent two extremes in their chromosomal origin activity: ars727 is inactive and late replicating, while ars2004 is a highly active, early-firing origin. To determine the effect of chromosomal context on the activity of these ARS elements, we have cloned them with their extended chromosomal context as well as in the context of each other in both orientations and analysed their replication efficiency by ARS and plasmid stability assays. We found that these ARS elements retain their origin activity in their extended/altered context. However, deletion of a 133-bp region of the previously reported ars727-associated late replication enforcing element (LRE) caused advancement in replication timing of the resulting plasmid. These results confirm the role of LRE in directing plasmid replication timing and suggest that the plasmid origin efficiency of ars2004 or ars727 remains unaltered by the extended chromosomal context.

  11. Efficient Genotyping of KRAS Mutant Non-Small Cell Lung Cancer Using a Multiplexed Droplet Digital PCR Approach.

    PubMed

    Pender, Alexandra; Garcia-Murillas, Isaac; Rana, Sareena; Cutts, Rosalind J; Kelly, Gavin; Fenwick, Kerry; Kozarewa, Iwanka; Gonzalez de Castro, David; Bhosle, Jaishree; O'Brien, Mary; Turner, Nicholas C; Popat, Sanjay; Downward, Julian

    2015-01-01

    Droplet digital PCR (ddPCR) can be used to detect low frequency mutations in oncogene-driven lung cancer. The range of KRAS point mutations observed in NSCLC necessitates a multiplex approach to efficient mutation detection in circulating DNA. Here we report the design and optimisation of three discriminatory ddPCR multiplex assays investigating nine different KRAS mutations using PrimePCR™ ddPCR™ Mutation Assays and the Bio-Rad QX100 system. Together these mutations account for 95% of the nucleotide changes found in KRAS in human cancer. Multiplex reactions were optimised on genomic DNA extracted from KRAS mutant cell lines and tested on DNA extracted from fixed tumour tissue from a cohort of lung cancer patients without prior knowledge of the specific KRAS genotype. The multiplex ddPCR assays had a limit of detection of better than 1 mutant KRAS molecule in 2,000 wild-type KRAS molecules, which compared favourably with a limit of detection of 1 in 50 for next generation sequencing and 1 in 10 for Sanger sequencing. Multiplex ddPCR assays thus provide a highly efficient methodology to identify KRAS mutations in lung adenocarcinoma.

  12. Efficient Genotyping of KRAS Mutant Non-Small Cell Lung Cancer Using a Multiplexed Droplet Digital PCR Approach

    PubMed Central

    Pender, Alexandra; Garcia-Murillas, Isaac; Rana, Sareena; Cutts, Rosalind J.; Kelly, Gavin; Fenwick, Kerry; Kozarewa, Iwanka; Gonzalez de Castro, David; Bhosle, Jaishree; O’Brien, Mary; Turner, Nicholas C.; Popat, Sanjay; Downward, Julian

    2015-01-01

    Droplet digital PCR (ddPCR) can be used to detect low frequency mutations in oncogene-driven lung cancer. The range of KRAS point mutations observed in NSCLC necessitates a multiplex approach to efficient mutation detection in circulating DNA. Here we report the design and optimisation of three discriminatory ddPCR multiplex assays investigating nine different KRAS mutations using PrimePCR™ ddPCR™ Mutation Assays and the Bio-Rad QX100 system. Together these mutations account for 95% of the nucleotide changes found in KRAS in human cancer. Multiplex reactions were optimised on genomic DNA extracted from KRAS mutant cell lines and tested on DNA extracted from fixed tumour tissue from a cohort of lung cancer patients without prior knowledge of the specific KRAS genotype. The multiplex ddPCR assays had a limit of detection of better than 1 mutant KRAS molecule in 2,000 wild-type KRAS molecules, which compared favourably with a limit of detection of 1 in 50 for next generation sequencing and 1 in 10 for Sanger sequencing. Multiplex ddPCR assays thus provide a highly efficient methodology to identify KRAS mutations in lung adenocarcinoma. PMID:26413866

  13. Homeologous plastid DNA transformation in tobacco is mediated by multiple recombination events.

    PubMed Central

    Kavanagh, T A; Thanh, N D; Lao, N T; McGrath, N; Peter, S O; Horváth, E M; Dix, P J; Medgyesy, P

    1999-01-01

    Efficient plastid transformation has been achieved in Nicotiana tabacum using cloned plastid DNA of Solanum nigrum carrying mutations conferring spectinomycin and streptomycin resistance. The use of the incompletely homologous (homeologous) Solanum plastid DNA as donor resulted in a Nicotiana plastid transformation frequency comparable with that of other experiments where completely homologous plastid DNA was introduced. Physical mapping and nucleotide sequence analysis of the targeted plastid DNA region in the transformants demonstrated efficient site-specific integration of the 7.8-kb Solanum plastid DNA and the exclusion of the vector DNA. The integration of the cloned Solanum plastid DNA into the Nicotiana plastid genome involved multiple recombination events as revealed by the presence of discontinuous tracts of Solanum-specific sequences that were interspersed between Nicotiana-specific markers. Marked position effects resulted in very frequent cointegration of the nonselected peripheral donor markers located adjacent to the vector DNA. Data presented here on the efficiency and features of homeologous plastid DNA recombination are consistent with the existence of an active RecA-mediated, but a diminished mismatch, recombination/repair system in higher-plant plastids. PMID:10388829

  14. Efficient HIV-1 inhibition by a 16 nt-long RNA aptamer designed by combining in vitro selection and in silico optimisation strategies

    PubMed Central

    Sánchez-Luque, Francisco J.; Stich, Michael; Manrubia, Susanna; Briones, Carlos; Berzal-Herranz, Alfredo

    2014-01-01

    The human immunodeficiency virus type-1 (HIV-1) genome contains multiple, highly conserved structural RNA domains that play key roles in essential viral processes. Interference with the function of these RNA domains either by disrupting their structures or by blocking their interaction with viral or cellular factors may seriously compromise HIV-1 viability. RNA aptamers are amongst the most promising synthetic molecules able to interact with structural domains of viral genomes. However, shortening aptamers to their minimal active domain is usually necessary for scaling up production, which requires very time-consuming, trial-and-error approaches. Here we report on the in vitro selection of 64 nt-long specific aptamers against the complete 5′-untranslated region of the HIV-1 genome, which inhibit more than 75% of HIV-1 production in a human cell line. The analysis of the selected sequences and structures allowed for the identification of a highly conserved 16 nt-long stem-loop motif containing a common 8 nt-long apical loop. Based on this result, an in silico designed 16 nt-long RNA aptamer, termed RNApt16, was synthesized, with sequence 5′-CCCCGGCAAGGAGGGG-3′. The HIV-1 inhibition efficiency of such an aptamer was close to 85%, thus constituting the shortest RNA molecule so far described that efficiently interferes with HIV-1 replication. PMID:25175101
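
    The RNApt16 sequence quoted above can be checked against the stated motif (a 16 nt stem-loop with an 8 nt apical loop) with a few lines of code. This is only a consistency check under simple assumptions: the remaining 8 nt are split evenly into a 4 nt arm on each side, and only Watson-Crick pairs are considered. The actual folding reported by the authors is not reproduced here, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def split_hairpin(seq, loop_len):
    """Split a sequence into (5' arm, loop, 3' arm), assuming a central
    loop of loop_len nucleotides flanked by two arms of equal length."""
    stem_len = (len(seq) - loop_len) // 2
    five_prime = seq[:stem_len]
    loop = seq[stem_len:stem_len + loop_len]
    three_prime = seq[stem_len + loop_len:]
    return five_prime, loop, three_prime


rnapt16 = "CCCCGGCAAGGAGGGG"  # sequence as given in the abstract
five_prime, loop, three_prime = split_hairpin(rnapt16, loop_len=8)
stem_pairs = three_prime == reverse_complement(five_prime)

print(len(rnapt16), five_prime, loop, three_prime, stem_pairs)
```

    Under these assumptions the two 4 nt arms are fully complementary, which is consistent with, though not proof of, the stem-loop described in the abstract.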

  15. Genotyping microarray: Mutation screening in Spanish families with autosomal dominant retinitis pigmentosa

    PubMed Central

    García-Hoyos, María; Cortón, Marta; Ávila-Fernández, Almudena; Riveiro-Álvarez, Rosa; Giménez, Ascensión; Hernan, Inma; Carballo, Miguel; Ayuso, Carmen

    2012-01-01

    Purpose Presently, 22 genes have been described in association with autosomal dominant retinitis pigmentosa (adRP); however, they explain only 50% of all cases, making genetic diagnosis of this disease difficult and costly. The aim of this study was to evaluate a specific genotyping microarray for its application to the molecular diagnosis of adRP in Spanish patients. Methods We analyzed 139 unrelated Spanish families with adRP. Samples were studied by using a genotyping microarray (adRP). All mutations found were further confirmed with automatic sequencing. Rhodopsin (RHO) sequencing was performed in all negative samples for the genotyping microarray. Results The adRP genotyping microarray detected the mutation associated with the disease in 20 of the 139 families with adRP. As in other populations, RHO was found to be the most frequently mutated gene in these families (7.9% of the microarray genotyped families). The rate of false positives (microarray results not confirmed with sequencing) and false negatives (mutations in RHO detected with sequencing but not with the genotyping microarray) were established, and high levels of analytical sensitivity (95%) and specificity (100%) were found. Diagnostic accuracy was 15.1%. Conclusions The adRP genotyping microarray is a quick, cost-efficient first step in the molecular diagnosis of Spanish patients with adRP. PMID:22736939
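
    The analytical sensitivity and specificity quoted above follow from counts of true and false positives and negatives. The sketch below shows the standard definitions only. The abstract reports that the microarray detected the mutation in 20 families, but it does not give the false-negative or true-negative counts, so the values used for those are hypothetical placeholders chosen purely to show the arithmetic, not data from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly mutated samples that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)


# true_pos comes from the abstract; the other counts are hypothetical.
true_pos = 20
false_neg = 1     # hypothetical placeholder
false_pos = 0     # hypothetical placeholder
true_neg = 118    # hypothetical placeholder

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

    With no false positives the specificity is 100% regardless of the true-negative count, whereas the sensitivity depends directly on how many mutations sequencing finds that the microarray missed.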

  16. Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange.

    PubMed

    Borgogno, María V; Monti, Mariela R; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E; Pezza, Roberto J

    2016-03-04

    Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3' end of the initiating DNA strand have a small effect, whereas most mismatches near the 5' end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. Tolerance of DNA Mismatches in Dmc1 Recombinase-mediated DNA Strand Exchange

    PubMed Central

    Borgogno, María V.; Monti, Mariela R.; Zhao, Weixing; Sung, Patrick; Argaraña, Carlos E.; Pezza, Roberto J.

    2016-01-01

    Recombination between homologous chromosomes is required for the faithful meiotic segregation of chromosomes and leads to the generation of genetic diversity. The conserved meiosis-specific Dmc1 recombinase catalyzes homologous recombination triggered by DNA double strand breaks through the exchange of parental DNA sequences. Although providing an efficient rate of DNA strand exchange between polymorphic alleles, Dmc1 must also guard against recombination between divergent sequences. How DNA mismatches affect Dmc1-mediated DNA strand exchange is not understood. We have used fluorescence resonance energy transfer to study the mechanism of Dmc1-mediated strand exchange between DNA oligonucleotides with different degrees of heterology. The efficiency of strand exchange is highly sensitive to the location, type, and distribution of mismatches. Mismatches near the 3′ end of the initiating DNA strand have a small effect, whereas most mismatches near the 5′ end impede strand exchange dramatically. The Hop2-Mnd1 protein complex stimulates Dmc1-catalyzed strand exchange on homologous DNA or containing a single mismatch. We observed that Dmc1 can reject divergent DNA sequences while bypassing a few mismatches in the DNA sequence. Our findings have important implications in understanding meiotic recombination. First, Dmc1 acts as an initial barrier for heterologous recombination, with the mismatch repair system providing a second level of proofreading, to ensure that ectopic sequences are not recombined. Second, Dmc1 stepping over infrequent mismatches is likely critical for allowing recombination between the polymorphic sequences of homologous chromosomes, thus contributing to gene conversion and genetic diversity. PMID:26709229

  18. Sequence-Based Discovery Demonstrates That Fixed Light Chain Human Transgenic Rats Produce a Diverse Repertoire of Antigen-Specific Antibodies.

    PubMed

    Harris, Katherine E; Aldred, Shelley Force; Davison, Laura M; Ogana, Heather Anne N; Boudreau, Andrew; Brüggemann, Marianne; Osborn, Michael; Ma, Biao; Buelow, Benjamin; Clarke, Starlynn C; Dang, Kevin H; Iyer, Suhasini; Jorgensen, Brett; Pham, Duy T; Pratap, Payal P; Rangaswamy, Udaya S; Schellenberger, Ute; van Schooten, Wim C; Ugamraj, Harshad S; Vafa, Omid; Buelow, Roland; Trinklein, Nathan D

    2018-01-01

    We created a novel transgenic rat that expresses human antibodies comprising a diverse repertoire of heavy chains with a single common rearranged kappa light chain (IgKV3-15-JK1). This fixed light chain animal, called OmniFlic, presents a unique system for human therapeutic antibody discovery and a model to study heavy chain repertoire diversity in the context of a constant light chain. The purpose of this study was to analyze heavy chain variable gene usage, clonotype diversity, and to describe the sequence characteristics of antigen-specific monoclonal antibodies (mAbs) isolated from immunized OmniFlic animals. Using next-generation sequencing antibody repertoire analysis, we measured heavy chain variable gene usage and the diversity of clonotypes present in the lymph node germinal centers of 75 OmniFlic rats immunized with 9 different protein antigens. Furthermore, we expressed 2,560 unique heavy chain sequences sampled from a diverse set of clonotypes as fixed light chain antibody proteins and measured their binding to antigen by ELISA. Finally, we measured patterns and overall levels of somatic hypermutation in the full B-cell repertoire and in the 2,560 mAbs tested for binding. The results demonstrate that OmniFlic animals produce an abundance of antigen-specific antibodies with heavy chain clonotype diversity that is similar to what has been described with unrestricted light chain use in mammals. In addition, we show that sequence-based discovery is a highly effective and efficient way to identify a large number of diverse monoclonal antibodies to a protein target of interest.

  19. Optimal control design of turbo spin‐echo sequences with applications to parallel‐transmit systems

    PubMed Central

    Hoogduin, Hans; Hajnal, Joseph V.; van den Berg, Cornelis A. T.; Luijten, Peter R.; Malik, Shaihan J.

    2016-01-01

    Purpose The design of turbo spin‐echo sequences is modeled as a dynamic optimization problem which includes the case of inhomogeneous transmit radiofrequency fields. This problem is efficiently solved by optimal control techniques making it possible to design patient‐specific sequences online. Theory and Methods The extended phase graph formalism is employed to model the signal evolution. The design problem is cast as an optimal control problem and an efficient numerical procedure for its solution is given. The numerical and experimental tests address standard multiecho sequences and pTx configurations. Results Standard, analytically derived flip angle trains are recovered by the numerical optimal control approach. New sequences are designed where constraints on radiofrequency total and peak power are included. In the case of parallel transmit application, the method is able to calculate the optimal echo train for two‐dimensional and three‐dimensional turbo spin echo sequences in the order of 10 s with a single central processing unit (CPU) implementation. The image contrast is maintained through the whole field of view despite inhomogeneities of the radiofrequency fields. Conclusion The optimal control design sheds new light on the sequence design process and makes it possible to design sequences in an online, patient‐specific fashion. Magn Reson Med 77:361–373, 2017. © 2016 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine PMID:26800383

  20. The genome sequence of the model ascomycete fungus Podospora anserina.

    PubMed

    Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

    2008-01-01

    The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope.

  1. The Sequences of 1504 Mutants in the Model Rice Variety Kitaake Facilitate Rapid Functional Genomic Studies.

    PubMed

    Li, Guotian; Jain, Rashmi; Chern, Mawsheng; Pham, Nikki T; Martin, Joel A; Wei, Tong; Schackwitz, Wendy S; Lipzen, Anna M; Duong, Phat Q; Jones, Kyle C; Jiang, Liangrong; Ruan, Deling; Bauer, Diane; Peng, Yi; Barry, Kerrie W; Schmutz, Jeremy; Ronald, Pamela C

    2017-06-01

    The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake ( Oryza sativa ssp japonica ), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations. © 2017 American Society of Plant Biologists. All rights reserved.

  2. The Sequences of 1,504 Mutants in the Model Rice Variety Kitaake Facilitate Rapid Functional Genomic Studies

    DOE PAGES

    Li, Guotian; Jain, Rashmi; Chern, Mawsheng; ...

    2017-06-02

    The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. In conclusion, this work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.

  3. An analysis by metabolic labelling of the encephalomyocarditis virus ribosomal frameshifting efficiency and stimulators.

    PubMed

    Ling, Roger; Firth, Andrew E

    2017-08-01

    Programmed -1 ribosomal frameshifting is a mechanism of gene expression whereby specific signals within messenger RNAs direct a proportion of ribosomes to shift -1 nt and continue translating in the new reading frame. Such frameshifting normally depends on an RNA structure stimulator 3'-adjacent to a 'slippery' heptanucleotide shift site sequence. Recently we identified an unusual frameshifting mechanism in encephalomyocarditis virus, where the stimulator involves a trans-acting virus protein. Thus, in contrast to other examples of -1 frameshifting, the efficiency of frameshifting in encephalomyocarditis virus is best studied in the context of virus infection. Here we use metabolic labelling to analyse the frameshifting efficiency of wild-type and mutant viruses. Confirming previous results, frameshifting depends on a G_GUU_UUU shift site sequence and a 3'-adjacent stem-loop structure, but is not appreciably affected by the 'StopGo' sequence present ~30 nt upstream. At late timepoints, frameshifting was estimated to be 46-76 % efficient.
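
    The underscore notation in G_GUU_UUU marks the zero-frame codon boundaries of the shift site. A minimal sketch of what a -1 shift means for codon reading is given below; the `codons` helper is a hypothetical name introduced only for illustration.

```python
def codons(seq: str, start: int) -> list:
    """Return the complete codons of seq when reading begins at index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


# Heptanucleotide shift site from the abstract, written without underscores.
SHIFT_SITE = "GGUUUUU"

zero_frame = codons(SHIFT_SITE, 1)       # frame implied by G_GUU_UUU
minus_one_frame = codons(SHIFT_SITE, 0)  # ribosome slipped back by 1 nt

print("zero frame:", zero_frame)
print("-1 frame:  ", minus_one_frame)
```

    In the zero frame the site reads GUU UUU; after slipping back one nucleotide it reads GGU UUU, so translation continues in the new reading frame from that point.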

  4. Cell type-specific termination of transcription by transposable element sequences.

    PubMed

    Conley, Andrew B; Jordan, I King

    2012-09-30

    Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human gene across the genome remains an open question. Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3' UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. 
The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.

  5. Exploiting sequence similarity to validate the sensitivity of SNP arrays in detecting fine-scaled copy number variations.

    PubMed

    Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam

    2010-04-15

    High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.

  6. Chimeric TALE recombinases with programmable DNA sequence specificity.

    PubMed

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  7. Sequence Polishing Library (SPL) v10.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oberortner, Ernst

    The Sequence Polishing Library (SPL) is a suite of software tools that automates "Design for Synthesis and Assembly" workflows. Specifically, the SPL "Converter" tool converts files among the following sequence data exchange formats: CSV, FASTA, GenBank, and Synthetic Biology Open Language (SBOL). The SPL "Juggler" tool optimizes the codon usage of DNA coding sequences according to an optimization strategy, a user-specific codon usage table, and a genetic code. In addition, the SPL "Juggler" can translate amino acid sequences into DNA sequences. The SPL "Polisher" verifies DNA sequences against DNA synthesis constraints, such as GC content, repeating k-mers, and restriction sites. In case of violations, the "Polisher" reports the violations in a comprehensive manner. The "Polisher" tool can also modify the violating regions according to an optimization strategy, a user-specific codon usage table, and a genetic code. The SPL "Partitioner" decomposes large DNA sequences into smaller building blocks with partial overlaps that enable an efficient assembly. The "Partitioner" enables the user to configure the characteristics of the overlaps, such as length, GC content, or melting temperature, which are mostly determined by the utilized assembly protocol.
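
    The kinds of synthesis constraints named above (GC content, repeating k-mers, restriction sites) can be illustrated with a short Python sketch. This is not the SPL API: every function name, the GC window of 0.4 to 0.6, the k-mer length of 8, and the choice of the EcoRI site GAATTC are assumptions made purely for illustration.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def repeated_kmers(seq: str, k: int) -> dict:
    """Map each k-mer that occurs more than once to its number of occurrences."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def site_positions(seq: str, site: str) -> list:
    """0-based start positions of every (possibly overlapping) occurrence of site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def check_sequence(seq, gc_min=0.4, gc_max=0.6, k=8, sites=("GAATTC",)):
    """Return a list of human-readable constraint violations (hypothetical thresholds)."""
    seq = seq.upper()
    violations = []
    gc = gc_content(seq)
    if not gc_min <= gc <= gc_max:
        violations.append(f"GC content {gc:.2f} outside [{gc_min}, {gc_max}]")
    for kmer, n in repeated_kmers(seq, k).items():
        violations.append(f"{k}-mer {kmer} occurs {n} times")
    for site in sites:
        for pos in site_positions(seq, site):
            violations.append(f"restriction site {site} at position {pos}")
    return violations


print(check_sequence("ATGCGAATTCGC", k=20))
```

    Only the forward strand is scanned here; a real checker would presumably also consider the reverse complement and vendor-specific limits.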

  8. An efficient method for the prediction of deleterious multiple-point mutations in the secondary structure of RNAs using suboptimal folding solutions

    PubMed Central

    Churkin, Alexander; Barash, Danny

    2008-01-01

    Background RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure.
A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289
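
    To see why exhaustive traversal becomes expensive, note that an m-point mutant is defined by choosing m of the n positions and one of the three alternative bases at each, so the count grows roughly as n^m for fixed m. The sketch below is an independent illustration of that counting argument, not code from RNAmute or MultiRNAmute; the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

BASES = "ACGU"


def count_mutants(n: int, m: int) -> int:
    """Number of distinct m-point mutants of a length-n RNA: C(n, m) * 3**m."""
    return comb(n, m) * 3 ** m


def enumerate_mutants(seq: str, m: int):
    """Yield every m-point mutant of seq (feasible only for short sequences)."""
    for positions in combinations(range(len(seq)), m):
        # At each chosen position, allow the three bases that differ from the wild type.
        choices = [[b for b in BASES if b != seq[p]] for p in positions]
        for replacement in product(*choices):
            mutant = list(seq)
            for p, b in zip(positions, replacement):
                mutant[p] = b
            yield "".join(mutant)


# The abstract's example: n = 100, m = 3.
print(count_mutants(100, 3))
```

    For n = 100 and m = 3 this gives 161,700 position triples times 27 base combinations, i.e. 4,365,900 structures to fold if every mutant were evaluated.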

  9. Analysis of endoscopic third ventriculostomy patency by MRI: value of different pulse sequences, the sequence parameters, and the imaging planes for investigation of flow void.

    PubMed

    Dinçer, Alp; Yildiz, Erdem; Kohan, Saeed; Memet Özek, M

    2011-01-01

    The aim of the study is to evaluate the efficiency of turbo spin-echo (TSE), three-dimensional constructive interference in the steady state (3D CISS) and cine phase contrast (Cine PC) sequences in determining flow through the endoscopic third ventriculostomy (ETV) fenestration, and to determine the effect of various TSE sequence parameters. The study was approved by our institutional review board and informed consent from all patients was obtained. Two groups of patients were included: group I (24 patients with good clinical outcome after ETV) and group II (22 patients with hydrocephalus evaluated preoperatively). The imaging protocol for both groups was identical. TSE T2 with various sequence parameters and imaging planes, and 3D CISS, followed by cine PC were obtained. Flow void was graded on a four-point scale. The sensitivity, specificity, accuracy, positive and negative predictive values of sequences were calculated. Bidirectional flow through the fenestration was detected in all group I patients by cine PC. Stroke volumes through the fenestration in group I ranged from 10 to 160.8 ml/min. There was no correlation between the presence of reversed flow and flow void grading. Also, there was no correlation between the stroke volumes and flow void grading. The sensitivity of 3D CISS was low, and 2 mm sagittal TSE T2, nearly equal to cine PC, provided the best result. Cine PC and TSE T2 both have high confidence in the assessment of the flow through the fenestration. However, sequence parameters significantly affect the efficiency of TSE T2.

  10. Bacteroides fragilis mobilizable transposon Tn5520 requires a 71 base pair origin of transfer sequence and a single mobilization protein for relaxosome formation during conjugation.

    PubMed

    Vedantam, Gayatri; Knopf, Sarah; Hecht, David W

    2006-01-01

    Tn5520 is the smallest known bacterial mobilizable transposon and was isolated from an antibiotic resistant Bacteroides fragilis clinical isolate. When a conjugation apparatus is provided in trans, Tn5520 is mobilized (transferred) efficiently within, and from, both Bacteroides spp. and Escherichia coli. Only two genes are present on Tn5520; one encodes an integrase, and the other a multifunctional mobilization (Mob) protein BmpH. BmpH is essential for Tn5520 mobility. The focus of this study was to identify the Tn5520 origin of conjugative transfer (oriT) and to study BmpH-oriT binding. We delimited the functional Tn5520 oriT to a 71 bp sequence upstream of the bmpH gene. A plasmid vector harbouring this minimal 71 bp oriT was mobilized at the same frequency as that of intact Tn5520. The minimal oriT contains one 17 bp inverted repeat (IR) sequence. We constructed and tested multiple IR mutants and showed that the IR was essential in its entirety for mobilization. A nick site sequence (5'-GCTAC-3') was also identified within the minimal oriT; this sequence resembled nick sites found in plasmids of Gram positive origin. We further showed that mutation of a highly conserved GC dinucleotide in the nick site sequence completely abolished mobilization. We also purified BmpH and showed that it specifically bound a Tn5520 oriT fragment in electrophoretic mobility shift assays. We also identified non-nick site sequences within the minimal oriT that were essential for mobilization. We hypothesize that transposon-based single Mob protein systems may contribute to efficient gene dissemination from Bacteroides spp., because fewer DNA processing proteins are required for relaxosome formation.

  11. Latency-associated transcript (LAT) exon 1 controls herpes simplex virus species-specific phenotypes: reactivation in the guinea pig genital model and neuron subtype-specific latent expression of LAT.

    PubMed

    Bertke, Andrea S; Patel, Amita; Imai, Yumi; Apakupakul, Kathleen; Margolis, Todd P; Krause, Philip R

    2009-10-01

    Herpes simplex virus 1 (HSV-1) and HSV-2 cause similar acute infections but differ in their abilities to reactivate from trigeminal and lumbosacral dorsal root ganglia. During latency, HSV-1 and HSV-2 also preferentially express their latency-associated transcripts (LATs) in different sensory neuronal subtypes that are positive for A5 and KH10 markers, respectively. Chimeric virus studies showed that LAT region sequences influence both of these viral species-specific phenotypes. To further map the LAT region sequences responsible for these phenotypes, we constructed the chimeric virus HSV2-LAT-E1, in which exon 1 (from the LAT TATA to the intron splice site) was replaced by the corresponding sequence from HSV-1 LAT. In intravaginally infected guinea pigs, HSV2-LAT-E1 reactivated inefficiently relative to the efficiency of its rescuant and wild-type HSV-2, but it yielded similar levels of viral DNA, LAT, and ICP0 during acute and latent infection. HSV2-LAT-E1 preferentially expressed the LAT in A5+ neurons (as does HSV-1), while the chimeric viruses HSV2-LAT-P1 (LAT promoter swap) and HSV2-LAT-S1 (LAT sequence swap downstream of the promoter) exhibited neuron subtype-specific latent LAT expression phenotypes more similar to that of HSV-2 than that of HSV-1. Rescuant viruses displayed the wild-type HSV-2 phenotypes of efficient reactivation in the guinea pig genital model and a tendency to express LAT in KH10+ neurons. The region that is critical for HSV species-specific differences in latency and reactivation thus lies between the LAT TATA and the intron splice site, and minor differences in the 5' ends of chimeric sequences in HSV2-LAT-E1 and HSV2-LAT-S1 point to sequences immediately downstream of the LAT TATA.

  12. Latency-Associated Transcript (LAT) Exon 1 Controls Herpes Simplex Virus Species-Specific Phenotypes: Reactivation in the Guinea Pig Genital Model and Neuron Subtype-Specific Latent Expression of LAT

    PubMed Central

    Bertke, Andrea S.; Patel, Amita; Imai, Yumi; Apakupakul, Kathleen; Margolis, Todd P.; Krause, Philip R.

    2009-01-01

    Herpes simplex virus 1 (HSV-1) and HSV-2 cause similar acute infections but differ in their abilities to reactivate from trigeminal and lumbosacral dorsal root ganglia. During latency, HSV-1 and HSV-2 also preferentially express their latency-associated transcripts (LATs) in different sensory neuronal subtypes that are positive for A5 and KH10 markers, respectively. Chimeric virus studies showed that LAT region sequences influence both of these viral species-specific phenotypes. To further map the LAT region sequences responsible for these phenotypes, we constructed the chimeric virus HSV2-LAT-E1, in which exon 1 (from the LAT TATA to the intron splice site) was replaced by the corresponding sequence from HSV-1 LAT. In intravaginally infected guinea pigs, HSV2-LAT-E1 reactivated inefficiently relative to the efficiency of its rescuant and wild-type HSV-2, but it yielded similar levels of viral DNA, LAT, and ICP0 during acute and latent infection. HSV2-LAT-E1 preferentially expressed the LAT in A5+ neurons (as does HSV-1), while the chimeric viruses HSV2-LAT-P1 (LAT promoter swap) and HSV2-LAT-S1 (LAT sequence swap downstream of the promoter) exhibited neuron subtype-specific latent LAT expression phenotypes more similar to that of HSV-2 than that of HSV-1. Rescuant viruses displayed the wild-type HSV-2 phenotypes of efficient reactivation in the guinea pig genital model and a tendency to express LAT in KH10+ neurons. The region that is critical for HSV species-specific differences in latency and reactivation thus lies between the LAT TATA and the intron splice site, and minor differences in the 5′ ends of chimeric sequences in HSV2-LAT-E1 and HSV2-LAT-S1 point to sequences immediately downstream of the LAT TATA. PMID:19641003

  13. Googling DNA sequences on the World Wide Web.

    PubMed

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm based on dividing a sequence library (DNA barcodes) and query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
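
    The word-based search idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the word length, whether words overlap, or how hits are scored, so the overlapping 8-letter words, the shared-word count, and the names `to_words` and `rank_matches` are all assumptions made for illustration; the authors delegate the actual search to a conventional text search engine rather than to code like this.

```python
from collections import Counter


def to_words(seq: str, k: int = 8) -> list[str]:
    """Split a sequence into overlapping k-letter words (k is an assumed value)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def rank_matches(query: str, library: dict[str, str], k: int = 8) -> list[tuple[str, int]]:
    """Rank library entries by how many distinct query words they contain.

    No alignment is performed: each sequence is treated as a bag of words,
    much as a text search engine treats a document.
    """
    query_words = set(to_words(query, k))
    scores = Counter()
    for name, seq in library.items():
        scores[name] = len(query_words & set(to_words(seq, k)))
    return scores.most_common()


# Hypothetical two-entry barcode library and a query taken from the first entry.
library = {
    "sp_a": "ACGTACGTTAGCCTAGGATC",
    "sp_b": "TTTTGGGGCCCCAAAATTTT",
}
hits = rank_matches("ACGTACGTTAGCCTAG", library)
```

    In this toy case the query shares all of its words with `sp_a` and none with `sp_b`, so `sp_a` ranks first.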

  14. Sequence specificity of the human mRNA N6-adenosine methylase in vitro.

    PubMed Central

    Harper, J E; Miceli, S M; Roberts, R J; Manley, J L

    1990-01-01

    N6-adenosine methylation is a frequent modification of mRNAs and their precursors, but little is known about the mechanism of the reaction or the function of the modification. To explore these questions, we developed conditions to examine N6-adenosine methylase activity in HeLa cell nuclear extracts. Transfer of the methyl group from S-[3H methyl]-adenosylmethionine to unlabeled random copolymer RNA substrates of varying ribonucleotide composition revealed a substrate specificity consistent with a previously deduced consensus sequence, Pu[G>A]AC[A/C/U]. 32P-labeled RNA substrates of defined sequence were used to examine the minimum sequence requirements for methylation. Each RNA was 20 nucleotides long, and contained either the core consensus sequence GGACU, or some variation of this sequence. RNAs containing GGACU, either in single or multiple copies, were good substrates for methylation, whereas RNAs containing single base substitutions within the GGACU sequence gave dramatically reduced methylation. These results demonstrate that the N6-adenosine methylase has a strict sequence specificity, and that there is no requirement for extended sequences or secondary structures for methylation. Recognition of this sequence does not require an RNA component, as micrococcal nuclease pretreatment of nuclear extracts actually increased methylation efficiency. PMID:2216767
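
    As a rough illustration of what the consensus sequence means in practice, the sketch below scans an RNA string for the five-nucleotide pattern purine, G or A, A, C, then A, C or U. The function name `find_consensus_sites` is hypothetical, and a plain pattern match cannot express the stated preference for G over A at the second position, nor does a match say anything about how efficiently a site is methylated.

```python
import re

# Purine, then G or A, then A, C, and finally A, C or U.
# The lookahead lets overlapping candidate sites be reported.
M6A_CONSENSUS = re.compile(r"(?=([AG][AG]AC[ACU]))")


def find_consensus_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 5-mer) for every consensus match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in M6A_CONSENSUS.finditer(rna)]


core_hit = find_consensus_sites("AAGGACUAA")      # contains the core GGACU
substituted = find_consensus_sites("AAGGUCUAA")   # single substitution in the core
```

    The first call reports the GGACU core at position 2, while the substituted sequence yields no match, mirroring the qualitative contrast described in the abstract.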

  15. Targeting vector construction through recombineering.

    PubMed

    Malureanu, Liviu A

    2011-01-01

    Gene targeting in mouse embryonic stem cells is an essential, yet still very expensive and highly time-consuming, tool and method to study gene function at the organismal level or to create mouse models of human diseases. Conventional cloning-based methods have been largely used for generating targeting vectors, but are hampered by a number of limiting factors, including the variety and location of restriction enzymes in the gene locus of interest, the specific PCR amplification of repetitive DNA sequences, and cloning of large DNA fragments. Recombineering is a technique that exploits the highly efficient homologous recombination function encoded by λ phage in Escherichia coli. Bacteriophage-based recombination can recombine homologous sequences as short as 30-50 bases, allowing manipulations such as insertion, deletion, or mutation of virtually any genomic region. The large availability of mouse genomic bacterial artificial chromosome (BAC) libraries covering most of the genome facilitates the retrieval of genomic DNA sequences from the bacterial chromosomes through recombineering. This chapter describes a successfully applied protocol and aims to be a detailed guide through the steps of generation of targeting vectors through recombineering.

  16. Integrated digital error suppression for improved detection of circulating tumor DNA

    PubMed Central

    Kurtz, David M.; Chabon, Jacob J.; Scherer, Florian; Stehr, Henning; Liu, Chih Long; Bratman, Scott V.; Say, Carmen; Zhou, Li; Carter, Justin N.; West, Robert B.; Sledge, George W.; Shrager, Joseph B.; Loo, Billy W.; Neal, Joel W.; Wakelee, Heather A.; Diehn, Maximilian; Alizadeh, Ash A.

    2016-01-01

    High-throughput sequencing of circulating tumor DNA (ctDNA) promises to facilitate personalized cancer therapy. However, low quantities of cell-free DNA (cfDNA) in the blood and sequencing artifacts currently limit analytical sensitivity. To overcome these limitations, we introduce an approach for integrated digital error suppression (iDES). Our method combines in silico elimination of highly stereotypical background artifacts with a molecular barcoding strategy for the efficient recovery of cfDNA molecules. Individually, these two methods each improve the sensitivity of cancer personalized profiling by deep sequencing (CAPP-Seq) by ~3-fold, and synergize when combined to yield ~15-fold improvements. As a result, iDES-enhanced CAPP-Seq facilitates noninvasive variant detection across hundreds of kilobases. Applied to clinical non-small cell lung cancer (NSCLC) samples, our method enabled biopsy-free profiling of EGFR kinase domain mutations with 92% sensitivity and 96% specificity and detection of ctDNA down to 4 in 10^5 cfDNA molecules. We anticipate that iDES will aid the noninvasive genotyping and detection of ctDNA in research and clinical settings. PMID:27018799
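
    The molecular barcoding half of this approach rests on a general idea that can be sketched briefly: reads that carry the same barcode derive from one original molecule, so disagreements among them are likely sequencing errors. The code below is a generic majority-vote consensus, not the iDES implementation; the function name, the `min_family_size` threshold, and the assumption that all reads in a family are equal-length and already aligned to the same locus are illustrative choices only.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=2):
    """Collapse reads sharing a barcode into one consensus sequence.

    reads: iterable of (barcode, sequence) pairs; sequences within a family
    are assumed to be the same length and already aligned to each other.
    Families smaller than min_family_size are dropped because a single read
    offers no way to vote out an error.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        # Majority base at each position across the family.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Hypothetical reads: one family of three (with one error) and one singleton.
reads = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"), ("CCG", "ACGA")]
collapsed = consensus_by_barcode(reads)
```

    Here the single discordant base in the three-read family is voted out, and the singleton family is discarded.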

  17. Low-Energy Electron-Induced Strand Breaks in Telomere-Derived DNA Sequences-Influence of DNA Sequence and Topology.

    PubMed

    Rackwitz, Jenny; Bald, Ilko

    2018-03-26

    During cancer radiation therapy, high-energy radiation is used to reduce tumour tissue. The irradiation produces a shower of secondary low-energy (<20 eV) electrons, which are able to damage DNA very efficiently by dissociative electron attachment. Recently, it was suggested that low-energy electron-induced DNA strand breaks strongly depend on the specific DNA sequence with a high sensitivity of G-rich sequences. Here, we use DNA origami platforms to expose G-rich telomere sequences to low-energy (8.8 eV) electrons to determine absolute cross sections for strand breakage and to study the influence of sequence modifications and topology of telomeric DNA on the strand breakage. We find that the telomeric DNA 5'-(TTA GGG)2 is more sensitive to low-energy electrons than an intermixed sequence 5'-(TGT GTG A)2, confirming the unique electronic properties resulting from G-stacking. With increasing length of the oligonucleotide (i.e., going from 5'-(GGG ATT)2 to 5'-(GGG ATT)4), both the variety of topology and the electron-induced strand break cross sections increase. Addition of K+ ions decreases the strand break cross section for all sequences that are able to fold G-quadruplexes or G-intermediates, whereas the strand break cross section for the intermixed sequence remains unchanged. These results indicate that telomeric DNA is rather sensitive towards low-energy electron-induced strand breakage suggesting significant telomere shortening that can also occur during cancer radiation therapy. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Peptide de novo sequencing of mixture tandem mass spectra

    PubMed Central

    Hotta, Stéphanie Yuki Kolbeck; Verano‐Braga, Thiago; Kjeldsen, Frank

    2016-01-01

    The impact of mixture spectra deconvolution on the performance of four popular de novo sequencing programs was tested using artificially constructed mixture spectra as well as experimental proteomics data. Mixture fragmentation spectra are recognized as a limitation in proteomics because they decrease the identification performance using database search engines. De novo sequencing approaches are expected to be even more sensitive to the reduction in mass spectrum quality resulting from peptide precursor co‐isolation and thus prone to false identifications. The deconvolution approach matched complementary b‐, y‐ions to each precursor peptide mass, which allowed the creation of virtual spectra containing sequence specific fragment ions of each co‐isolated peptide. Deconvolution processing resulted in equally efficient identification rates but increased the absolute number of correctly sequenced peptides. The improvement was in the range of 20–35% additional peptide identifications for a HeLa lysate sample. Some correct sequences were identified only using unprocessed spectra; however, the number of these was lower than those where improvement was obtained by mass spectral deconvolution. Tight candidate peptide score distribution and high sensitivity to small changes in the mass spectrum introduced by the employed deconvolution method could explain some of the missing peptide identifications. PMID:27329701
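
    The complementary-ion matching mentioned in this abstract relies on a standard mass relation: for singly charged fragments, a b-ion and its complementary y-ion have m/z values that sum to the neutral precursor mass plus two proton masses. The sketch below applies that relation to a peak list; it is not the deconvolution method evaluated in the paper, and the function name, the 0.02 Da tolerance, and the restriction to singly charged fragments are assumptions.

```python
PROTON = 1.007276  # proton mass in Da


def complementary_pairs(peaks, precursor_neutral_mass, tol=0.02):
    """Find pairs of singly charged fragment m/z values that are complementary
    with respect to one precursor, i.e. whose sum equals
    precursor_neutral_mass + 2 * PROTON within tol (Da).
    """
    target = precursor_neutral_mass + 2 * PROTON
    peaks = sorted(peaks)
    pairs = []
    lo, hi = 0, len(peaks) - 1
    # Two-pointer sweep over the sorted peak list.
    while lo < hi:
        total = peaks[lo] + peaks[hi]
        if abs(total - target) <= tol:
            pairs.append((peaks[lo], peaks[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


# Hypothetical mixture spectrum: four peaks belong to a precursor of neutral
# mass 1000.0 Da, and the peak at 600.0 belongs to a co-isolated peptide.
spectrum = [300.0, 702.014552, 450.5, 551.514552, 600.0]
pairs = complementary_pairs(spectrum, precursor_neutral_mass=1000.0)
```

    Running the same search once per co-isolated precursor mass assigns fragments to precursors, which is the basis for building one virtual spectrum per peptide.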

  19. High-throughput assays for DNA gyrase and other topoisomerases

    PubMed Central

    Maxwell, Anthony; Burton, Nicolas P.; O'Hagan, Natasha

    2006-01-01

    We have developed high-throughput microtitre plate-based assays for DNA gyrase and other DNA topoisomerases. These assays exploit the fact that negatively supercoiled plasmids form intermolecular triplexes more efficiently than when they are relaxed. Two assays are presented, one using capture of a plasmid containing a single triplex-forming sequence by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by staining with a DNA-specific fluorescent dye. The other uses capture of a plasmid containing two triplex-forming sequences by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by a second oligonucleotide that is radiolabelled. The assays are shown to be appropriate for assaying DNA supercoiling by Escherichia coli DNA gyrase and DNA relaxation by eukaryotic topoisomerases I and II, and E.coli topoisomerase IV. The assays are readily adaptable to other enzymes that change DNA supercoiling (e.g. restriction enzymes) and are suitable for use in a high-throughput format. PMID:16936317

  20. High-throughput assays for DNA gyrase and other topoisomerases.

    PubMed

    Maxwell, Anthony; Burton, Nicolas P; O'Hagan, Natasha

    2006-01-01

    We have developed high-throughput microtitre plate-based assays for DNA gyrase and other DNA topoisomerases. These assays exploit the fact that negatively supercoiled plasmids form intermolecular triplexes more efficiently than when they are relaxed. Two assays are presented, one using capture of a plasmid containing a single triplex-forming sequence by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by staining with a DNA-specific fluorescent dye. The other uses capture of a plasmid containing two triplex-forming sequences by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by a second oligonucleotide that is radiolabelled. The assays are shown to be appropriate for assaying DNA supercoiling by Escherichia coli DNA gyrase and DNA relaxation by eukaryotic topoisomerases I and II, and E.coli topoisomerase IV. The assays are readily adaptable to other enzymes that change DNA supercoiling (e.g. restriction enzymes) and are suitable for use in a high-throughput format.

  1. A 'new lease of life': FnCpf1 possesses DNA cleavage activity for genome editing in human cells.

    PubMed

    Tu, Mengjun; Lin, Li; Cheng, Yilu; He, Xiubin; Sun, Huihui; Xie, Haihua; Fu, Junhao; Liu, Changbao; Li, Jin; Chen, Ding; Xi, Haitao; Xue, Dongyu; Liu, Qi; Zhao, Junzhao; Gao, Caixia; Song, Zongming; Qu, Jia; Gu, Feng

    2017-11-02

    Cpf1 nucleases were recently reported to be highly specific and programmable nucleases with efficiencies comparable to those of SpCas9. AsCpf1 and LbCpf1 require a single crRNA and recognize a 5'-TTTN-3' protospacer adjacent motif (PAM) at the 5' end of the protospacer for genome editing. For widespread application in precision site-specific human genome editing, the range of sequences that AsCpf1 and LbCpf1 can recognize is limited due to the size of this PAM. To address this limitation, we sought to identify a novel Cpf1 nuclease with simpler PAM requirements. Specifically, we sought to test and engineer FnCpf1, a previously reported Cpf1 nuclease that requires only 5'-TTN-3' as a PAM but had not shown detectable levels of nuclease-induced indels at certain loci in human cells. Surprisingly, we found that FnCpf1 possesses DNA cleavage activity in human cells at multiple loci. We also comprehensively and quantitatively examined various FnCpf1 parameters in human cells, including spacer sequence, direct repeat sequence and the PAM sequence. Our study identifies FnCpf1 as a new member of the Cpf1 family for human genome editing with distinctive characteristics, which shows promise as a genome editing tool with the potential for both research and therapeutic applications. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
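
    To make the effect of PAM length concrete, the sketch below lists candidate target sites for a 5' PAM, where the protospacer lies immediately 3' of the motif. The function name `pam_sites`, the forward-strand-only scan, and the default protospacer length of 20 are illustrative assumptions and are not taken from the paper.

```python
def pam_sites(dna: str, pam: str, spacer_len: int = 20) -> list[tuple[int, str]]:
    """List (PAM start, protospacer) for each forward-strand PAM match.

    'N' in the PAM matches any base. The protospacer is the spacer_len bases
    directly 3' of the PAM, matching the 5'-PAM arrangement of Cpf1 targets.
    Only the given strand is scanned; the reverse complement would need a
    second pass.
    """
    dna = dna.upper()
    n = len(pam)
    sites = []
    for i in range(len(dna) - n - spacer_len + 1):
        window = dna[i:i + n]
        if all(p == "N" or p == b for p, b in zip(pam, window)):
            sites.append((i, dna[i + n:i + n + spacer_len]))
    return sites


# Hypothetical 21-nt sequence and a short protospacer length for readability.
seq = "CATTGACGTACTTTCAGGCTA"
ttn = pam_sites(seq, "TTN", spacer_len=5)
tttn = pam_sites(seq, "TTTN", spacer_len=5)
```

    On this toy sequence the shorter TTN motif yields three candidate sites and TTTN only one, which illustrates why a simpler PAM widens the range of targetable sequences.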

  2. A ‘new lease of life’: FnCpf1 possesses DNA cleavage activity for genome editing in human cells

    PubMed Central

    Tu, Mengjun; Lin, Li; Cheng, Yilu; He, Xiubin; Sun, Huihui; Xie, Haihua; Fu, Junhao; Liu, Changbao; Li, Jin; Chen, Ding; Xi, Haitao; Xue, Dongyu; Liu, Qi; Zhao, Junzhao; Gao, Caixia; Song, Zongming; Qu, Jia

    2017-01-01

    Cpf1 nucleases were recently reported to be highly specific and programmable nucleases with efficiencies comparable to those of SpCas9. AsCpf1 and LbCpf1 require a single crRNA and recognize a 5′-TTTN-3′ protospacer adjacent motif (PAM) at the 5′ end of the protospacer for genome editing. For widespread application in precision site-specific human genome editing, the range of sequences that AsCpf1 and LbCpf1 can recognize is limited due to the size of this PAM. To address this limitation, we sought to identify a novel Cpf1 nuclease with simpler PAM requirements. Specifically, we sought to test and engineer FnCpf1, a previously reported Cpf1 nuclease that requires only 5′-TTN-3′ as a PAM but had not shown detectable levels of nuclease-induced indels at certain loci in human cells. Surprisingly, we found that FnCpf1 possesses DNA cleavage activity in human cells at multiple loci. We also comprehensively and quantitatively examined various FnCpf1 parameters in human cells, including spacer sequence, direct repeat sequence and the PAM sequence. Our study identifies FnCpf1 as a new member of the Cpf1 family for human genome editing with distinctive characteristics, which shows promise as a genome editing tool with the potential for both research and therapeutic applications. PMID:28977650

  3. PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data.

    PubMed

    Chiu, Kuo Ping; Wong, Chee-Hong; Chen, Qiongyu; Ariyaratne, Pramila; Ooi, Hong Sain; Wei, Chia-Lin; Sung, Wing-Kin Ken; Ruan, Yijun

    2006-08-25

    We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads, and correctly yet efficiently map PETs to reference genome sequences. To accommodate and streamline data analysis of the large volume PET sequences generated from each PET experiment, an automated PET data process pipeline is desirable. We designed an integrated computation program package, PET-Tool, to automatically process PET sequences and map them to the genome sequences. The Tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the Project Manager module for data organization. The performance of PET-Tool was evaluated through the analyses of 2.7 million PET sequences. It was demonstrated that PET-Tool is accurate and efficient in extracting PET sequences and removing artifacts from large volume dataset. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. With a 2.4 GHz LINUX machine, it takes approximately six hours to process one million PETs from extraction to mapping. The speed, accuracy, and comprehensiveness have proved that PET-Tool is an important and useful component in PET experiments, and can be extended to accommodate other related analyses of paired-end sequences. The Tool also provides user-friendly functions for data quality check and system for multi-layer data management.

  4. MW-assisted synthesis of LiFePO4 for high power applications

    NASA Astrophysics Data System (ADS)

    Beninati, Sabina; Damen, Libero; Mastragostino, Marina

    LiFePO4/C was prepared by solid-state reaction from Li3PO4, Fe3(PO4)2·8H2O, carbon and glucose in a few minutes in a scientific MW (microwave) oven with temperature and power control. The material was characterized by X-ray diffraction, scanning electron microscopy and by TGA analysis to evaluate carbon content. The electrochemical characterization as positive electrode in EC (ethylene carbonate)-DMC (dimethylcarbonate) 1 M LiPF6 was performed by galvanostatic charge-discharge cycles at C/10 to evaluate specific capacity and by sequences of 10 s discharge-charge pulses, at different high C-rates (5-45C) to evaluate pulse-specific power in simulated operating conditions for full-HEV application. The maximum pulse-specific power and, particularly, pulse efficiency values are quite high and make MW synthesis a very promising route for mass production of LiFePO4/C for full-HEV batteries at low energy costs.

  5. Adaptive efficient compression of genomes

    PubMed Central

    2012-01-01

    Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. However, memory requirements of the current algorithms are high and run times often are slow. In this paper, we propose an adaptive, parallel and highly efficient referential sequence compression method which allows fine-tuning of the trade-off between required memory and compression speed. When using 12 MB of memory, our method is for human genomes on-par with the best previous algorithms in terms of compression ratio (400:1) and compression speed. In contrast, it compresses a complete human genome in just 11 seconds when provided with 9 GB of main memory, which is almost three times faster than the best competitor while using less main memory. PMID:23146997
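
    Referential compression, as described here, stores only how the input differs from a known reference. The sketch below shows the principle with a greedy encoder that emits either a reference match (position and length) or literal bases. It is not the adaptive, parallel method proposed in the paper; the function names, the k-mer index, and the word length `k` are assumptions. The index is the main memory cost in this toy version, which hints at why memory and speed can be traded against each other.

```python
def build_index(reference: str, k: int) -> dict[str, int]:
    """Map each k-mer of the reference to its first position."""
    index: dict[str, int] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    return index


def compress(target: str, reference: str, k: int = 4) -> list[tuple]:
    """Encode target as ("ref", pos, length) matches and ("lit", bases) literals."""
    index = build_index(reference, k)
    ops: list[tuple] = []
    i = 0
    while i < len(target):
        pos = index.get(target[i:i + k])
        if pos is None:
            # No seed match: emit one literal base, merging adjacent literals.
            if ops and ops[-1][0] == "lit":
                ops[-1] = ("lit", ops[-1][1] + target[i])
            else:
                ops.append(("lit", target[i]))
            i += 1
            continue
        # Extend the seed match as far as target and reference agree.
        length = k
        while (i + length < len(target) and pos + length < len(reference)
               and target[i + length] == reference[pos + length]):
            length += 1
        ops.append(("ref", pos, length))
        i += length
    return ops


def decompress(ops: list[tuple], reference: str) -> str:
    """Rebuild the target from the operations and the same reference."""
    parts = []
    for op in ops:
        if op[0] == "ref":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)


# Hypothetical reference and a target differing by one substitution.
reference = "ACGTACGGTTCAGCTAGGCT"
target = "ACGTACGGTTGAGCTAGGCT"
encoded = compress(target, reference)
```

    A target identical to the reference collapses to a single match operation, and a target with a few differences needs only a handful of operations, which is where high compression ratios for closely related genomes come from.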

  6. The Draft Genome Sequence of a Novel High-Efficient Butanol-Producing Bacterium Clostridium Diolis Strain WST.

    PubMed

    Chen, Chaoyang; Sun, Chongran; Wu, Yi-Rui

    2018-03-21

    A wild-type solventogenic strain, Clostridium diolis WST, isolated from mangrove sediments, was found to produce high amounts of butanol and acetone with negligible levels of ethanol and acids from glucose via a unique acetone-butanol (AB) fermentation pathway. Through genomic sequencing, the assembled draft genome of strain WST is calculated to be 5.85 Mb with a GC content of 29.69% and contains 5263 genes that contribute to the annotation of 5049 protein-coding sequences. Within these annotated genes, the butanol dehydrogenase gene (bdh) was determined to be in a higher amount from strain WST compared to other Clostridial strains, which is positively related to its highly efficient production of butanol. Therefore, we present a draft genome sequence analysis of strain WST in this article that should facilitate further understanding of the solventogenic mechanism of this special microorganism.

  7. Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in cancer cells by RNA sequencing.

    PubMed

    Lee, Mei-Chong Wendy; Lopez-Diaz, Fernando J; Khan, Shahid Yar; Tariq, Muhammad Akram; Dayn, Yelena; Vaske, Charles Joseph; Radenbaugh, Amie J; Kim, Hyunsung John; Emerson, Beverly M; Pourmand, Nader

    2014-11-04

    The acute cellular response to stress generates a subpopulation of reversibly stress-tolerant cells under conditions that are lethal to the majority of the population. Stress tolerance is attributed to heterogeneity of gene expression within the population to ensure survival of a minority. We performed whole transcriptome sequencing analyses of metastatic human breast cancer cells subjected to the chemotherapeutic agent paclitaxel at the single-cell and population levels. Here we show that specific transcriptional programs are enacted within untreated, stressed, and drug-tolerant cell groups while generating high heterogeneity between single cells within and between groups. We further demonstrate that drug-tolerant cells contain specific RNA variants residing in genes involved in microtubule organization and stabilization, as well as cell adhesion and cell surface signaling. In addition, the gene expression profile of drug-tolerant cells is similar to that of untreated cells within a few doublings. Thus, single-cell analyses reveal the dynamics of the stress response in terms of cell-specific RNA variants driving heterogeneity, the survival of a minority population through generation of specific RNA variants, and the efficient reconversion of stress-tolerant cells back to normalcy.

  8. Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in cancer cells by RNA sequencing

    PubMed Central

    Lee, Mei-Chong Wendy; Lopez-Diaz, Fernando J.; Khan, Shahid Yar; Tariq, Muhammad Akram; Dayn, Yelena; Vaske, Charles Joseph; Radenbaugh, Amie J.; Kim, Hyunsung John; Emerson, Beverly M.; Pourmand, Nader

    2014-01-01

    The acute cellular response to stress generates a subpopulation of reversibly stress-tolerant cells under conditions that are lethal to the majority of the population. Stress tolerance is attributed to heterogeneity of gene expression within the population to ensure survival of a minority. We performed whole transcriptome sequencing analyses of metastatic human breast cancer cells subjected to the chemotherapeutic agent paclitaxel at the single-cell and population levels. Here we show that specific transcriptional programs are enacted within untreated, stressed, and drug-tolerant cell groups while generating high heterogeneity between single cells within and between groups. We further demonstrate that drug-tolerant cells contain specific RNA variants residing in genes involved in microtubule organization and stabilization, as well as cell adhesion and cell surface signaling. In addition, the gene expression profile of drug-tolerant cells is similar to that of untreated cells within a few doublings. Thus, single-cell analyses reveal the dynamics of the stress response in terms of cell-specific RNA variants driving heterogeneity, the survival of a minority population through generation of specific RNA variants, and the efficient reconversion of stress-tolerant cells back to normalcy. PMID:25339441

  9. Encapsidation of Host RNAs by Cucumber Necrosis Virus Coat Protein during both Agroinfiltration and Infection.

    PubMed

    Ghoshal, Kankana; Theilmann, Jane; Reade, Ron; Maghodia, Ajay; Rochon, D'Ann

    2015-11-01

    Next-generation sequence analysis of virus-like particles (VLPs) produced during agroinfiltration of cucumber necrosis virus (CNV) coat protein (CP) and of authentic CNV virions was conducted to assess if host RNAs can be encapsidated by CNV CP. VLPs containing host RNAs were found to be produced during agroinfiltration, accumulating to approximately 1/60 the level that CNV virions accumulated during infection. VLPs contained a variety of host RNA species, including the major rRNAs as well as cytoplasmic, chloroplast, and mitochondrial mRNAs. The most predominant host RNA species encapsidated in VLPs were chloroplast encoded, consistent with the efficient targeting of CNV CP to chloroplasts during agroinfiltration. Interestingly, droplet digital PCR analysis showed that the CNV CP mRNA expressed during agroinfiltration was the most efficiently encapsidated mRNA, suggesting that the CNV CP open reading frame may contain a high-affinity site or sites for CP binding and thus contribute to the specificity of CNV RNA encapsidation. Approximately 0.09% to 0.7% of the RNA derived from authentic CNV virions contained host RNA, with chloroplast RNA again being the most prominent species. This is consistent with our previous finding that a small proportion of CNV CP enters chloroplasts during the infection process and highlights the possibility that chloroplast targeting is a significant aspect of CNV infection. Remarkably, 6 to 8 of the top 10 most efficiently encapsidated nucleus-encoded RNAs in CNV virions correspond to retrotransposon or retrotransposon-like RNA sequences. Thus, CNV could potentially serve as a vehicle for horizontal transmission of retrotransposons to new hosts and thereby significantly influence genome evolution. Viruses predominantly encapsidate their own virus-related RNA species due to the possession of specific sequences and/or structures on viral RNA which serve as high-affinity binding sites for the coat protein. 
In this study, we show, using next-generation sequence analysis, that CNV also encapsidates host RNA species, which account for ∼0.1% of the RNA packaged in CNV particles. The encapsidated host RNAs predominantly include chloroplast RNAs, reinforcing previous observations that CNV CP enters chloroplasts during infection. Remarkably, the most abundantly encapsidated cytoplasmic mRNAs consisted of retrotransposon-like RNA sequences, similar to findings recently reported for flock house virus (A. Routh, T. Domitrovic, and J. E. Johnson, Proc Natl Acad Sci U S A 109:1907-1912, 2012). Encapsidation of retrotransposon sequences may contribute to their horizontal transmission should CNV virions carrying retrotransposons infect a new host. Such an event could lead to large-scale genomic changes in a naive plant host, thus facilitating host evolutionary novelty. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  10. Long-term functional adeno-associated virus-microdystrophin expression in the dystrophic CXMDj dog.

    PubMed

    Koo, Taeyoung; Okada, Takashi; Athanasopoulos, Takis; Foster, Helen; Takeda, Shin'ichi; Dickson, George

    2011-09-01

    Duchenne muscular dystrophy (DMD) is a severe, inherited, muscle-wasting disorder caused by mutations in the dystrophin gene. Preclinical studies of adeno-associated virus gene therapy for DMD have been described in mouse and dog models of this disease. However, low and transient expression of microdystrophin in dystrophic dogs and a lack of long-term microdystrophin expression associated with a CD8+ T-cell response in DMD patients suggest that the development of improved microdystrophin genes and delivery strategies is essential for successful clinical trials in DMD patients. We have previously shown the efficiency of mRNA sequence optimization of mouse microdystrophin in ameliorating the pathology of dystrophic mdx mice. In the present study, we generated adeno-associated virus (AAV)2/8 vectors expressing an mRNA sequence-optimized canine microdystrophin under the control of a muscle-specific promoter and injected intramuscularly into a single canine X-linked muscular dystrophy (CXMDj) dog. Expression of stable and high levels of microdystrophin was observed along with an association of the dystrophin-associated protein complex in intramuscularly injected muscles of a CXMDj dog for at least 8 weeks without immune responses. Treated muscles were highly protected from dystrophic damage, with reduced levels of myofiber permeability and central nucleation. The data obtained in the present study suggest that the use of canine-specific and mRNA sequence-optimized microdystrophin genes in conjunction with a muscle-specific promoter results in high and stable levels of microdystrophin expression in a canine model of DMD. This approach will potentially allow the reduction of dosage and contribute towards the development of a safe and effective AAV gene therapy clinical trial protocol for DMD. Copyright © 2011 John Wiley & Sons, Ltd.

  11. A peripheral component interconnect express-based scalable and highly integrated pulsed spectrometer for solution state dynamic nuclear polarization.

    PubMed

    He, Yugui; Feng, Jiwen; Zhang, Zhi; Wang, Chao; Wang, Dong; Chen, Fang; Liu, Maili; Liu, Chaoyang

    2015-08-01

    High sensitivity, high data rates, fast pulses, and accurate synchronization all represent challenges for modern nuclear magnetic resonance spectrometers, which make any expansion or adaptation of these devices to new techniques and experiments difficult. Here, we present a Peripheral Component Interconnect Express (PCIe)-based highly integrated distributed digital architecture pulsed spectrometer that is implemented with electron and nucleus double resonances and is scalable specifically for broad dynamic nuclear polarization (DNP) enhancement applications, including DNP-magnetic resonance spectroscopy/imaging (DNP-MRS/MRI). The distributed modularized architecture can implement more transceiver channels flexibly to meet a variety of MRS/MRI instrumentation needs. The proposed PCIe bus with high data rates can significantly improve data transmission efficiency and communication reliability and allow precise control of pulse sequences. An external high speed double data rate memory chip is used to store acquired data and pulse sequence elements, which greatly accelerates the execution of the pulse sequence, reduces the TR (time of repetition) interval, and improves the accuracy of TR in imaging sequences. Using clock phase-shift technology, we can produce digital pulses accurately with high timing resolution of 1 ns and narrow widths of 4 ns to control the microwave pulses required by pulsed DNP and ensure overall system synchronization. The proposed spectrometer is proved to be both feasible and reliable by observation of a maximum signal enhancement factor of approximately -170 for 1H, and a high quality water image was successfully obtained by DNP-enhanced spin-echo 1H MRI at 0.35 T.

  12. PASTA: splice junction identification from RNA-Sequencing data

    PubMed Central

    2013-01-01

    Background: Next generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice junction detection algorithm specifically designed for RNA-Seq data, relying on a highly accurate alignment strategy and on a combination of heuristic and statistical methods to identify exon-intron junctions with high accuracy. Results: Comparisons against TopHat and other splice junction prediction software on real and simulated datasets show that PASTA exhibits high specificity and sensitivity, especially at lower coverage levels. Moreover, PASTA is highly configurable and flexible, and can therefore be applied in a wide range of analysis scenarios: it is able to handle both single-end and paired-end reads, it does not rely on the presence of canonical splicing signals, and it uses organism-specific regression models to accurately identify junctions. Conclusions: PASTA is a highly efficient and sensitive tool to identify splicing junctions from RNA-Seq data. Compared to similar programs, it has the ability to identify a higher number of real splicing junctions, and provides highly annotated output files containing detailed information about their location and characteristics. Accurate junction data in turn facilitates the reconstruction of the splicing isoforms and the analysis of their expression levels, which will be performed by the remaining modules of the PASTA pipeline, still under development. Use of PASTA can therefore enable the large-scale investigation of transcription and alternative splicing. PMID:23557086

  13. Flanking sequence determination and event-specific detection of genetically modified wheat B73-6-1.

    PubMed

    Xu, Junyi; Cao, Jijuan; Cao, Dongmei; Zhao, Tongtong; Huang, Xin; Zhang, Piqiao; Luan, Fengxia

    2013-05-01

    In order to establish a specific identification method for genetically modified (GM) wheat, exogenous insert DNA and flanking sequence between exogenous fragment and recombinant chromosome of GM wheat B73-6-1 were successfully acquired by means of conventional polymerase chain reaction (PCR) and thermal asymmetric interlaced (TAIL)-PCR strategies. Newly acquired exogenous fragment covered the full-length sequence of transformed genes such as transformed plasmid and corresponding functional genes including marker uidA, herbicide-resistant bar, ubiquitin promoter, and high-molecular-weight gluten subunit. The flanking sequence between insert DNA revealed high similarity with Triticum turgidum A gene (GenBank: AY494981.1). A specific PCR detection method for GM wheat B73-6-1 was established on the basis of primers designed according to the flanking sequence. This specific PCR method was validated by GM wheat, GM corn, GM soybean, GM rice, and non-GM wheat. The specifically amplified target band was observed only in GM wheat B73-6-1. This method is of high specificity, high reproducibility, rapid identification, and excellent accuracy for the identification of GM wheat B73-6-1.

  14. Development of an Efficient Entire-Capsid-Coding-Region Amplification Method for Direct Detection of Poliovirus from Stool Extracts

    PubMed Central

    Kilpatrick, David R.; Nakamura, Tomofumi; Burns, Cara C.; Bukbuk, David; Oderinde, Soji B.; Oberste, M. Steven; Kew, Olen M.; Pallansch, Mark A.; Shimizu, Hiroyuki

    2014-01-01

    Laboratory diagnosis has played a critical role in the Global Polio Eradication Initiative since 1988, by isolating and identifying poliovirus (PV) from stool specimens by using cell culture as a highly sensitive system to detect PV. In the present study, we aimed to develop a molecular method to detect PV directly from stool extracts, with a high efficiency comparable to that of cell culture. We developed a method to efficiently amplify the entire capsid coding region of human enteroviruses (EVs) including PV. cDNAs of the entire capsid coding region (3.9 kb) were obtained from as few as 50 copies of PV genomes. PV was detected from the cDNAs with an improved PV-specific real-time reverse transcription-PCR system and nucleotide sequence analysis of the VP1 coding region. For assay validation, we analyzed 84 stool extracts that were positive for PV in cell culture and detected PV genomes from 100% of the extracts (84/84 samples) with this method in combination with a PV-specific extraction method. PV could be detected in 2/4 stool extract samples that were negative for PV in cell culture. In PV-positive samples, EV species C viruses were also detected with high frequency (27% [23/86 samples]). This method would be useful for direct detection of PV from stool extracts without using cell culture. PMID:25339406

  15. BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes

    USDA-ARS's Scientific Manuscript database

    New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...

  16. Animal selection for whole genome sequencing by quantifying the unique contribution of homozygous haplotypes sequenced

    USDA-ARS's Scientific Manuscript database

    Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...

  17. Genotype Specification Language.

    PubMed

    Wilson, Erin H; Sagawa, Shiori; Weis, James W; Schubert, Max G; Bissell, Michael; Hawthorne, Brian; Reeves, Christopher D; Dean, Jed; Platt, Darren

    2016-06-17

    We describe here the Genotype Specification Language (GSL), a language that facilitates the rapid design of large and complex DNA constructs used to engineer genomes. The GSL compiler implements a high-level language based on traditional genetic notation, as well as a set of low-level DNA manipulation primitives. The language allows facile incorporation of parts from a library of cloned DNA constructs and from the "natural" library of parts in fully sequenced and annotated genomes. GSL was designed to engage genetic engineers in their native language while providing a framework for higher level abstract tooling. To this end we define four language levels, Level 0 (literal DNA sequence) through Level 3, with increasing abstraction of part selection and construction paths. GSL targets an intermediate language based on DNA slices that translates efficiently into a wide range of final output formats, such as FASTA and GenBank, and includes formats that specify instructions and materials such as oligonucleotide primers to allow the physical construction of the GSL designs by individual strain engineers or an automated DNA assembly core facility.

  18. An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data.

    PubMed

    Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min

    2015-06-01

    The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies. © 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.

  19. Consequences of Normalizing Transcriptomic and Genomic Libraries of Plant Genomes Using a Duplex-Specific Nuclease and Tetramethylammonium Chloride

    PubMed Central

    Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088

  20. Consequences of normalizing transcriptomic and genomic libraries of plant genomes using a duplex-specific nuclease and tetramethylammonium chloride.

    PubMed

    Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard

    2013-01-01

    Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.

  1. Multicolor fluorescent biosensor for multiplexed detection of DNA.

    PubMed

    Hu, Rong; Liu, Tao; Zhang, Xiao-Bing; Huan, Shuang-Yan; Wu, Cuichen; Fu, Ting; Tan, Weihong

    2014-05-20

    Development of efficient methods for highly sensitive and rapid screening of specific oligonucleotide sequences is essential to the early diagnosis of serious diseases. In this work, an aggregated cationic perylene diimide (PDI) derivative was found to efficiently quench the fluorescence emission of a variety of anionic oligonucleotide-labeled fluorophores that emit at wavelengths from the visible to NIR region. This broad-spectrum quencher was then adopted to develop a multicolor biosensor via a label-free approach for multiplexed fluorescent detection of DNA. The aggregated perylene derivative exhibits a very high quenching efficiency on all ssDNA-labeled dyes associated with biosensor detection, having efficiency values of 98.3 ± 0.9%, 97 ± 1.1%, and 98.2 ± 0.6% for FAM, TAMRA, and Cy5, respectively. An exonuclease-assisted autocatalytic target recycling amplification was also integrated into the sensing system. High quenching efficiency combined with autocatalytic target recycling amplification afforded the biosensor with high sensitivity toward target DNA, resulting in a detection limit of 20 pM, which is about 50-fold lower than that of traditional unamplified homogeneous fluorescent assay methods. The quencher did not interfere with the catalytic activity of nuclease, and the biosensor could be manipulated in either preaddition or postaddition manner with similar sensitivity. Moreover, the proposed sensing system allows for simultaneous and multicolor analysis of several oligonucleotides in homogeneous solution, demonstrating its potential application in the rapid screening of multiple biotargets.

  2. Elimination sequence optimization for SPAR

    NASA Technical Reports Server (NTRS)

    Hogan, Harry A.

    1986-01-01

    SPAR is a large-scale computer program for finite element structural analysis. The program allows user specification of the order in which the joints of a structure are to be eliminated, since this order can have significant influence over solution performance, in terms of both storage requirements and computer time. An efficient elimination sequence can improve performance by over 50% for some problems. Obtaining such sequences, however, requires the expertise of an experienced user and can take hours of tedious effort to effect. Thus, an automatic elimination sequence optimizer would enhance productivity by reducing the analysts' problem definition time and by lowering computer costs. Two possible methods for automating the elimination sequence specifications were examined. Several algorithms based on the graph theory representations of sparse matrices were studied with mixed results. Significant improvement in the program performance was achieved, but sequencing by an experienced user still yields substantially better results. The initial results provide encouraging evidence that the potential benefits of such an automatic sequencer would be well worth the effort.
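
    The abstract does not name the graph-based algorithms that were studied. As a hedged illustration of what an automatic elimination sequencer can look like, the sketch below uses a greedy minimum-degree heuristic, a common textbook choice that is assumed here and is not taken from the report. The function name `min_degree_order` and the adjacency-dictionary input are hypothetical, and joint labels are assumed to be mutually comparable (e.g. integers) so that ties break deterministically.

```python
def min_degree_order(adjacency):
    """Greedy minimum-degree elimination ordering (illustrative heuristic only).

    adjacency: dict mapping each joint to the set of joints it is connected to.
    Returns (order, fill), where fill is the number of new couplings created.
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    fill = 0
    while graph:
        # Pick the joint with the fewest remaining connections (ties: smallest label).
        v = min(graph, key=lambda u: (len(graph[u]), u))
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
        # Eliminating v couples all of its remaining neighbours to each other.
        for u in nbrs:
            new = nbrs - graph[u] - {u}
            fill += len(new)  # each new coupling is counted once from each end
            graph[u] |= new
        order.append(v)
    return order, fill // 2


# A four-joint ring: eliminating any joint first creates one new coupling.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(min_degree_order(ring))
```

    The fill count is a rough proxy for the extra storage and arithmetic an ordering causes. That is the quantity an elimination sequence optimizer tries to keep small.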

  3. Pulse sequences for efficient multi-cycle terahertz generation in periodically poled lithium niobate.

    PubMed

    Ravi, Koustuban; Schimpf, Damian N; Kärtner, Franz X

    2016-10-31

    The use of laser pulse sequences to drive the cascaded difference frequency generation of high energy, high peak-power and multi-cycle terahertz pulses in cryogenically cooled (100 K) periodically poled Lithium Niobate is proposed and studied. Detailed simulations considering the coupled nonlinear interaction of terahertz and optical waves (or pump depletion), show that unprecedented optical-to-terahertz energy conversion efficiencies > 5%, peak electric fields of hundred(s) of mega volts/meter at terahertz pulse durations of hundred(s) of picoseconds can be achieved. The proposed methods are shown to circumvent laser induced damage limitations at Joule-level pumping by 1µm lasers to enable multi-cycle terahertz sources with pulse energies > 10 milli-joules. Various pulse sequence formats are proposed and analyzed. Numerical calculations for periodically poled structures accounting for cascaded difference frequency generation, self-phase-modulation, cascaded second harmonic generation and laser induced damage are introduced. The physics governing terahertz generation using pulse sequences in this high conversion efficiency regime, limitations and practical considerations are discussed. It is shown that varying the poling period along the crystal length and further reduction of absorption can lead to even higher energy conversion efficiencies >10%. In addition to numerical calculations, an analytic formulation valid for arbitrary pulse formats and closed-form expressions for important cases are presented. Parameters optimizing conversion efficiency in the 0.1-1 THz range, the corresponding peak electric fields, crystal lengths and terahertz pulse properties are furnished.

  4. High-sensitivity HLA typing by Saturated Tiling Capture Sequencing (STC-Seq).

    PubMed

    Jiao, Yang; Li, Ran; Wu, Chao; Ding, Yibin; Liu, Yanning; Jia, Danmei; Wang, Lifeng; Xu, Xiang; Zhu, Jing; Zheng, Min; Jia, Junling

    2018-01-15

    Highly polymorphic human leukocyte antigen (HLA) genes are responsible for fine-tuning the adaptive immune system. High-resolution HLA typing is important for the treatment of autoimmune and infectious diseases. Additionally, it is routinely performed for identifying matched donors in transplantation medicine. Although many HLA typing approaches have been developed, the complexity, low efficiency and high cost of current HLA-typing assays limit their application in population-based high-throughput HLA typing for donors, which is required for creating large-scale databases for transplantation and precision medicine. Here, we present a cost-efficient Saturated Tiling Capture Sequencing (STC-Seq) approach to capturing 14 HLA class I and II genes. The highly efficient capture (an approximately 23,000-fold enrichment) of these genes allows for simplified allele calling. Tests on five genes (HLA-A/B/C/DRB1/DQB1) from 31 human samples and 351 datasets using STC-Seq showed results that were 98% consistent with the known genotypes at two-field (field1 and field2) resolution. Additionally, STC can capture genomic DNA fragments longer than 3 kb from HLA loci, making the library compatible with third-generation sequencing. STC-Seq is a highly accurate and cost-efficient method for HLA typing which can be used to facilitate the establishment of population-based HLA databases for precision and transplantation medicine.

  5. Protocols for efficient simulations of long-time protein dynamics using coarse-grained CABS model.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian

    2014-01-01

    Coarse-grained (CG) modeling is a well-acknowledged simulation approach for getting insight into long-time scale protein folding events at reasonable computational cost. Depending on the design of a CG model, the simulation protocols vary from highly case-specific (requiring user-defined assumptions about the folding scenario) to more sophisticated blind prediction methods for which only a protein sequence is required. Here we describe the framework protocol for the simulations of long-term dynamics of globular proteins, with the use of the CABS CG protein model and sequence data. The simulations can start from a random or a selected (e.g., native) structure. The described protocol has been validated using experimental data for protein folding model systems; the prediction results agreed well with the experimental results.

  6. Energy efficiency trade-offs drive nucleotide usage in transcribed regions

    PubMed Central

    Chen, Wei-Hua; Lu, Guanting; Bork, Peer; Hu, Songnian; Lercher, Martin J.

    2016-01-01

    Efficient nutrient usage is a trait under universal selection. A substantial part of cellular resources is spent on making nucleotides. We thus expect preferential use of cheaper nucleotides especially in transcribed sequences, which are often amplified thousand-fold compared with genomic sequences. To test this hypothesis, we derive a mutation-selection-drift equilibrium model for nucleotide skews (strand-specific usage of 'A' versus 'T' and 'G' versus 'C'), which explains nucleotide skews across 1,550 prokaryotic genomes as a consequence of selection on efficient resource usage. Transcription-related selection generally favours the cheaper nucleotides 'U' and 'C' at synonymous sites. However, the information encoded in mRNA is further amplified through translation. Due to unexpected trade-offs in the codon table, cheaper nucleotides encode on average energetically more expensive amino acids. These trade-offs apply to both strand-specific nucleotide usage and GC content, causing a universal bias towards the more expensive nucleotides 'A' and 'G' at non-synonymous coding sites. PMID:27098217
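
    To make the notion of a nucleotide skew concrete, here is a minimal sketch. It assumes the commonly used normalized definitions, (A - T)/(A + T) and (G - C)/(G + C), computed on a single strand. The abstract does not state the exact formula used in the paper, so both the definition and the function name `nucleotide_skews` are assumptions.

```python
def nucleotide_skews(seq):
    """Return (AT skew, GC skew) for one strand of a DNA sequence.

    Assumed definitions: (A - T) / (A + T) and (G - C) / (G + C).
    A skew is reported as 0.0 when the relevant bases are absent.
    """
    s = seq.upper()
    a, t, g, c = (s.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Three A versus one T, and three G versus one C, on the given strand.
print(nucleotide_skews("AAATGGGC"))
```

    A positive value means the first nucleotide of the pair is over-represented on the strand examined. The complementary strand has skews of equal size and opposite sign.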

  7. PHYSICO2: an UNIX based standalone procedure for computation of physicochemical, window-dependent and substitution based evolutionary properties of protein sequences along with automated block preparation tool, version 2.

    PubMed

    Banerjee, Shyamashree; Gupta, Parth Sarthi Sen; Nayek, Arnab; Das, Sunit; Sur, Vishma Pratap; Seth, Pratyay; Islam, Rifat Nawaz Ul; Bandyopadhyay, Amal K

    2015-01-01

    Automated genome sequencing procedures are enriching the sequence database very fast. To achieve a balance between the entry of sequences in the database and their analyses, efficient software is required. To this end, PHYSICO2, compared to the earlier PHYSICO and other public domain tools, is most efficient in that it (i) extracts physicochemical, window-dependent and homologous-position-based-substitution (PWS) properties including positional and BLOCK-specific diversity and conservation, (ii) provides users with optional flexibility in setting relevant input parameters, (iii) helps users to prepare a BLOCK-FASTA file by the use of the Automated Block Preparation Tool of the program, (iv) performs fast, accurate and user-friendly analyses and (v) redirects itemized outputs in Excel format along with detailed methodology. The program package contains documentation describing application of methods. Overall the program acts as an efficient PWS-analyzer and finds application in sequence-bioinformatics. PHYSICO2 is freely available at http://sourceforge.net/projects/physico2/ along with its documentation at https://sourceforge.net/projects/physico2/files/Documentation.pdf/download for all users.

  8. PHYSICO2: an UNIX based standalone procedure for computation of physicochemical, window-dependent and substitution based evolutionary properties of protein sequences along with automated block preparation tool, version 2

    PubMed Central

    Banerjee, Shyamashree; Gupta, Parth Sarthi Sen; Nayek, Arnab; Das, Sunit; Sur, Vishma Pratap; Seth, Pratyay; Islam, Rifat Nawaz Ul; Bandyopadhyay, Amal K

    2015-01-01

    Automated genome sequencing procedures are enriching the sequence database very fast. To achieve a balance between the entry of sequences in the database and their analyses, efficient software is required. To this end, PHYSICO2, compared to the earlier PHYSICO and other public domain tools, is most efficient in that it (i) extracts physicochemical, window-dependent and homologous-position-based-substitution (PWS) properties including positional and BLOCK-specific diversity and conservation, (ii) provides users with optional flexibility in setting relevant input parameters, (iii) helps users to prepare a BLOCK-FASTA file by the use of the Automated Block Preparation Tool of the program, (iv) performs fast, accurate and user-friendly analyses and (v) redirects itemized outputs in Excel format along with detailed methodology. The program package contains documentation describing application of methods. Overall the program acts as an efficient PWS-analyzer and finds application in sequence-bioinformatics. Availability: PHYSICO2 is freely available at http://sourceforge.net/projects/physico2/ along with its documentation at https://sourceforge.net/projects/physico2/files/Documentation.pdf/download for all users. PMID:26339154

  9. Design and Validation of CRISPR/Cas9 Systems for Targeted Gene Modification in Induced Pluripotent Stem Cells.

    PubMed

    Lee, Ciaran M; Zhu, Haibao; Davis, Timothy H; Deshmukh, Harshahardhan; Bao, Gang

    2017-01-01

    The CRISPR/Cas9 system is a powerful tool for precision genome editing. The ability to accurately modify genomic DNA in situ with single nucleotide precision opens up new possibilities for not only basic research but also biotechnology applications and clinical translation. In this chapter, we outline the procedures for design, screening, and validation of CRISPR/Cas9 systems for targeted modification of coding sequences in the human genome and how to perform genome editing in induced pluripotent stem cells with high efficiency and specificity.

  10. Preparation of High-Efficiency Cytochrome c-Imprinted Polymer on the Surface of Magnetic Carbon Nanotubes by Epitope Approach via Metal Chelation and Six-Membered Ring.

    PubMed

    Qin, Ya-Ping; Li, Dong-Yan; He, Xi-Wen; Li, Wen-You; Zhang, Yu-Kui

    2016-04-27

    A novel epitope molecularly imprinted polymer on the surface of magnetic carbon nanotubes (MCNTs@EMIP) was successfully fabricated to specifically recognize target protein cytochrome c (Cyt C) with high performance. A peptide sequence corresponding to the surface-exposed C-terminus domains of Cyt C was selected as the epitope template molecule, and commercially available zinc acrylate and ethylene glycol dimethacrylate (EGDMA) were employed as functional monomer and cross-linker, respectively, to synthesize the MIP via free radical polymerization. The epitope was immobilized via metal chelation and a six-membered ring formed between the functional monomer and the hydroxyl and amino groups of the epitope. The resulting MCNTs@EMIP exhibited specific recognition ability toward target Cyt C, including a more satisfactory imprinting factor (about 11.7) than those of other reported imprinting methods. In addition, the MCNTs@EMIP demonstrated a high adsorption amount (about 780.0 mg g(-1)) and excellent selectivity. Besides, the magnetic property of the support material made the processes easy and highly efficient with the assistance of an external magnetic field. High-performance liquid chromatography analysis of Cyt C in a real bovine blood sample and a protein mixture indicated that the specificity was not affected by other competitive proteins, which strongly indicated that the MCNTs@EMIP had potential to be applied in the bioseparation area. In brief, this study provided a new protocol to detect target protein in complex samples via the epitope imprinting approach and surface imprinting strategy.

  11. A peripheral component interconnect express-based scalable and highly integrated pulsed spectrometer for solution state dynamic nuclear polarization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Yugui; Liu, Chaoyang, E-mail: chyliu@wipm.ac.cn; State Key Laboratory of Magnet Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Sciences, Wuhan 430071

    2015-08-15

    High sensitivity, high data rates, fast pulses, and accurate synchronization all represent challenges for modern nuclear magnetic resonance spectrometers, which make any expansion or adaptation of these devices to new techniques and experiments difficult. Here, we present a Peripheral Component Interconnect Express (PCIe)-based highly integrated distributed digital architecture pulsed spectrometer that is implemented with electron and nucleus double resonances and is scalable specifically for broad dynamic nuclear polarization (DNP) enhancement applications, including DNP-magnetic resonance spectroscopy/imaging (DNP-MRS/MRI). The distributed modularized architecture can implement more transceiver channels flexibly to meet a variety of MRS/MRI instrumentation needs. The proposed PCIe bus with high data rates can significantly improve data transmission efficiency and communication reliability and allow precise control of pulse sequences. An external high speed double data rate memory chip is used to store acquired data and pulse sequence elements, which greatly accelerates the execution of the pulse sequence, reduces the TR (time of repetition) interval, and improves the accuracy of TR in imaging sequences. Using clock phase-shift technology, we can produce digital pulses accurately with high timing resolution of 1 ns and narrow widths of 4 ns to control the microwave pulses required by pulsed DNP and ensure overall system synchronization. The proposed spectrometer is proved to be both feasible and reliable by observation of a maximum signal enhancement factor of approximately −170 for (1)H, and a high quality water image was successfully obtained by DNP-enhanced spin-echo (1)H MRI at 0.35 T.

  12. The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

    PubMed

    Murray, Vincent; Chen, Jon K; Tanaka, Mark M

    2016-07-01

    The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
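
    The hexanucleotide consensus 5'-RTGT*AY reported above can be located in a sequence with a short pattern search. The sketch below is only an illustration of reading the consensus (R = A or G, Y = C or T, per the abstract) and is not the authors' analysis pipeline. The function name `find_motif_sites` is hypothetical, and only the strand given is scanned.

```python
import re

# 5'-RTGT*AY with R = A/G and Y = C/T; the lookahead allows overlapping hits.
MOTIF = re.compile(r"(?=[AG]TGTA[CT])")


def find_motif_sites(seq):
    """Return 0-based indices of the T marked by the asterisk in 5'-RTGT*AY."""
    # The marked T is the fourth base of the hexanucleotide, hence the +3 offset.
    return [m.start() + 3 for m in MOTIF.finditer(seq.upper())]


# Two occurrences on this strand: GTGTAT (from index 0) and ATGTAT (from index 4).
print(find_motif_sites("GTGTATGTAT"))
```

    Scanning the reverse complement as well would be needed to cover both strands.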

  13. GSP: A web-based platform for designing genome-specific primers in polyploids

    USDA-ARS's Scientific Manuscript database

    The sequences among subgenomes in a polyploid species have high similarity. This makes it difficult to design genome-specific primers for sequence analysis. We present a web-based platform named GSP for designing genome-specific primers to distinguish subgenome sequences in the polyploid genome backgr...

  14. The rate and efficiency of high-mass star formation along the Hubble sequence

    NASA Technical Reports Server (NTRS)

    Devereux, Nicholas A.; Young, Judith S.

    1991-01-01

    Data obtained with IRAS are used to compare and contrast the global star formation rates for a galactic sample which represents essentially all known noninteracting spiral and lenticular galaxies within 40 Mpc. The distribution of 60 micron luminosity is similar for spirals of types Sa-Scd inclusively, although the luminosities of the very early and very late types are, on average, one order of magnitude lower. High-mass star formation rates are similar for early, intermediate, and late type spirals, and the average high-mass star formation rate per unit molecular gas mass is independent of type for spiral galaxies. A remarkable homogeneity exists in the high-mass star-forming capabilities of spiral galaxies, particularly among the Sa-Scd types. The Hubble sequence is therefore not a sequence in the present-day rate or production efficiency of high-mass stars.

  15. Molecular analysis of microbial community in a groundwater sample polluted by landfill leachate and seawater

    PubMed Central

    Tian, Yang-jie; Yang, Hong; Wu, Xiu-juan; Li, Dao-tang

    2005-01-01

    Seashore landfill aquifers are environments of special physicochemical conditions (high organic load and high salinity), and microbes in leachate-polluted aquifers play a significant role for intrinsic bioremediation. In order to characterize microbial diversity and look for clues on the relationship between microbial community structure and hydrochemistry, a culture-independent examination of a typical groundwater sample obtained from a seashore landfill was conducted by sequence analysis of 16S rDNA clone library. Two sets of universal 16S rDNA primers were used to amplify DNA extracted from the groundwater so that problems arising from primer efficiency and specificity could be reduced. Of 74 clones randomly selected from the libraries, 30 contained unique sequences whose analysis showed that the majority of them belonged to bacteria (95.9%), with Proteobacteria (63.5%) being the dominant division. One archaeal sequence and one eukaryotic sequence were found as well. Bacterial sequences belonging to the following phylogenic groups were identified: Bacteroidetes (20.3%), β, γ, δ and ε-subdivisions of Proteobacteria (47.3%, 9.5%, 5.4% and 1.3%, respectively), Firmicutes (1.4%), Actinobacteria (2.7%), Cyanobacteria (2.7%). The percentages of Proteobacteria and Bacteroides in seawater were greater than those in the groundwater from a non-seashore landfill, indicating a possible influence of seawater. Quite a few sequences had close relatives in marine or hypersaline environments. Many sequences showed affiliations with microbes involved in anaerobic fermentation. The remarkable abundance of sequences related to (per)chlorate-reducing bacteria (ClRB) in the groundwater was significant and worthy of further study. PMID:15682499

  16. Mining new crystal protein genes from Bacillus thuringiensis on the basis of mixed plasmid-enriched genome sequencing and a computational pipeline.

    PubMed

    Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming

    2012-07-01

    We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.

  17. SNP discovery by high-throughput sequencing in soybean

    PubMed Central

    2010-01-01

    Background With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD) have been widely applied to identify genome-wide sequence variations. However, it remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost-effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty-four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 kbp.
Conclusions We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population and can be applied to other crops. PMID:20701770

  18. Efficient DNA binding and nuclear uptake by distamycin derivatives conjugated to octa-arginine sequences.

    PubMed

    Vázquez, Olalla; Blanco-Canosa, Juan B; Vázquez, M Eugenio; Martínez-Costas, Jose; Castedo, Luis; Mascareñas, José L

    2008-11-24

    Efficient targeting of DNA by designed molecules requires not only careful fine-tuning of their DNA-recognition properties, but also appropriate cell internalization of the compounds so that they can reach the cell nucleus in a short period of time. Previous observations in our group on the relatively high affinity displayed by conjugates between distamycin derivatives and bZIP basic regions for A-rich DNA sites, led us to investigate whether the covalent attachment of a positively charged cell-penetrating peptide to a distamycin-like tripyrrole might yield high affinity DNA binders with improved cell internalization properties. Our work has led to the discovery of synthetic tripyrrole-octa-arginine conjugates that are capable of targeting specific DNA sites that contain A-rich tracts with low nanomolar affinity; they simultaneously exhibit excellent membrane and nuclear translocation properties in living HeLa cells.

  19. Rapid bursts and slow declines: on the possible evolutionary trajectories of enzymes

    PubMed Central

    Newton, Matilda S.; Arcus, Vickery L.; Patrick, Wayne M.

    2015-01-01

    The evolution of enzymes is often viewed as following a smooth and steady trajectory, from barely functional primordial catalysts to the highly active and specific enzymes that we observe today. In this review, we summarize experimental data that suggest a different reality. Modern examples, such as the emergence of enzymes that hydrolyse human-made pesticides, demonstrate that evolution can be extraordinarily rapid. Experiments to infer and resurrect ancient sequences suggest that some of the first organisms present on the Earth are likely to have possessed highly active enzymes. Reconciling these observations, we argue that rapid bursts of strong selection for increased catalytic efficiency are interspersed with much longer periods in which the catalytic power of an enzyme erodes, through neutral drift and selection for other properties such as cellular energy efficiency or regulation. Thus, many enzymes may have already passed their catalytic peaks. PMID:25926697

  20. SAR and scan-time optimized 3D whole-brain double inversion recovery imaging at 7T.

    PubMed

    Pracht, Eberhard D; Feiweier, Thorsten; Ehses, Philipp; Brenner, Daniel; Roebroeck, Alard; Weber, Bernd; Stöcker, Tony

    2018-05-01

    The aim of this project was to implement an ultra-high field (UHF) optimized double inversion recovery (DIR) sequence for gray matter (GM) imaging, enabling whole brain coverage in short acquisition times (≈5 min, image resolution 1 mm³). A 3D variable flip angle DIR turbo spin echo (TSE) sequence was optimized for UHF application. We implemented an improved, fast, and specific absorption rate (SAR) efficient TSE imaging module, utilizing improved reordering. The DIR preparation was tailored to UHF application. Additionally, fat artifacts were minimized by employing water excitation instead of fat saturation. GM images, covering the whole brain, were acquired in 7 min scan time at 1 mm isotropic resolution. SAR issues were overcome by using a dedicated flip angle calculation considering SAR and SNR efficiency. Furthermore, UHF related artifacts were minimized. The suggested sequence is suitable to generate GM images with whole-brain coverage at UHF. Due to the short total acquisition times and overall robustness, this approach can potentially enable DIR application in a routine setting and enhance lesion detection in neurological diseases. Magn Reson Med 79:2620-2628, 2018. © 2017 International Society for Magnetic Resonance in Medicine.

  1. Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

    PubMed Central

    Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-01-01

    Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039

  2. Cross-Specificities between cII-like Proteins and pRE-like Promoters of Lambdoid Bacteriophages

    PubMed Central

    Wulff, Daniel L.; Mahoney, Michael E.

    1987-01-01

    We have investigated the activation of transcription from the pRE promoters of phages λ, 21 and P22 by the λ and 21 cII proteins and the P22 c1 (cII-like) protein, using an in vivo system in which cII protein from a derepressed prophage activates transcription from a pRE DNA fragment on a multicopy plasmid. We find that each protein is highly specific for its own cognate pRE promoter, although measurable cross-reactions are observed. The primary recognition sequence for cII protein on λ pRE is a pair of TTGC repeat sequences in the sequence 5'-TTGCN6TTGC-3' at the -35 region of the promoter. This same sequence is found in 21 pRE, while P22 pRE has the sequence 5'-TTGCN6TTGT-3', which is the same as that of λctr1, a pRE+ variant of λ. λctr1 pRE is half as active as λ+ pRE when assayed with either the λ cII or the P22 c1 proteins. Therefore, the single base change in the P22 repeat sequence cannot explain why the P22 c1 protein is much more active with P22 pRE than λ pRE. The dya5 mutation, a G→A change at position -43 of pRE, makes pRE a stronger promoter when assayed with either the λ or 21 cII proteins or the P22 c1 protein. We conclude that efficient activation of a cII-dependent promoter by a cII protein requires sequence information in addition to the TTGC repeat sequences. We do not know the characteristics of the proteins which are responsible for the specificity of each protein for its own cognate promoter. However, λdya8, which has a Glu27→Lys alteration in the λ cII protein and a cII+ phenotype, results in a mutant cII protein that is much more highly specific than wild-type cII protein for its own cognate λ pRE promoter. This is especially remarkable because the dya8 amino acid alteration makes the helix-2 region (the region of the protein predicted to make contact with the phosphodiester backbone of the DNA) of λ cII protein conform exactly with the helix-2 region of the P22 c1 protein in both charge and charge distribution. PMID:2953649

  3. Design of the hairpin ribozyme for targeting specific RNA sequences.

    PubMed

    Hampel, A; DeYoung, M B; Galasinski, S; Siwkowski, A

    1997-01-01

    The following steps should be taken when designing the hairpin ribozyme to cleave a specific target sequence: 1. Select a target sequence containing BN*GUC where B is C, G, or U. 2. Select the target sequence in areas least likely to have extensive interfering structure. 3. Design the conventional hairpin ribozyme as shown in Fig. 1, such that it can form a 4 bp helix 2 and helix 1 lengths up to 10 bp. 4. Synthesize this ribozyme from single-stranded DNA templates with a double-stranded T7 promoter. 5. Prepare a series of short substrates capable of forming a range of helix 1 lengths of 5-10 bp. 6. Identify these by direct RNA sequencing. 7. Assay the extent of cleavage of each substrate to identify the optimal length of helix 1. 8. Prepare the hairpin tetraloop ribozyme to determine if catalytic efficiency can be improved.
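    Step 1 above (selecting a target containing BN*GUC, where B is C, G, or U) can be sketched as a simple motif scan. The following is a minimal illustration, not the authors' procedure: the function name `find_target_sites` is hypothetical, N is assumed to mean any base, and the asterisk is assumed to mark the cleavage position between N and G. It does not assess secondary structure (step 2).

```python
import re

# B = C, G or U; N = any base (assumption); the cut is assumed to fall
# between N and G, following the BN*GUC notation in the abstract.
# A lookahead is used so that overlapping candidate sites are all reported.
TARGET_SITE = re.compile(r"(?=([CGU][ACGU]GUC))")


def find_target_sites(rna):
    """Return (cut_index, site) pairs for every BNGUC match.

    cut_index is the 0-based index of the G that follows the assumed
    cleavage position. DNA input is accepted and converted to RNA.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start() + 2, m.group(1)) for m in TARGET_SITE.finditer(rna)]


if __name__ == "__main__":
    for cut, site in find_target_sites("AAUGAGUCCGUGUCAA"):
        print(cut, site)
```

    Candidate sites found this way would still need to be screened for interfering structure and tested with substrates of varying helix 1 length, as the remaining steps describe.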

  4. Metagenomic analysis of a desulphurisation system used to treat biogas from vinasse methanisation.

    PubMed

    Dias, Marcela França; Colturato, Luis Felipe; de Oliveira, João Paulo; Leite, Laura Rabelo; Oliveira, Guilherme; Chernicharo, Carlos Augusto; de Araújo, Juliana Calabria

    2016-04-01

    We investigated the response of microbial community to changes in H2S loading rate in a microaerated desulphurisation system treating biogas from vinasse methanisation. H2S removal efficiency was high, and both COD and DO seemed to be important parameters to biomass activity. DGGE analysis retrieved sequences of sulphide-oxidising bacteria (SOB), such as Thioalkalimicrobium sp. Deep sequencing analysis revealed that the microbial community was complex and remained constant throughout the experiment. Most sequences belonged to Firmicutes and Proteobacteria, and, to a lesser extent, Bacteroidetes, Chloroflexi, and Synergistetes. Despite the high sulphide removal efficiency, the abundance of the taxa of SOB was low, and was negatively affected by the high sulphide loading rate. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, S.K.

    1998-03-24

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.

  6. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, Stefan K.

    1998-01-01

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.

  7. QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles.

    PubMed

    Van der Borght, Koen; Thys, Kim; Wetzels, Yves; Clement, Lieven; Verbist, Bie; Reumers, Joke; van Vlijmen, Herman; Aerssens, Jeroen

    2015-11-10

    Next generation sequencing enables studying heterogeneous populations of viral infections. When the sequencing is done at high coverage depth ("deep sequencing"), low frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNV(D)). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNV(HS)). To also increase specificity, SNVs called were overruled when their frequency was below the 80th percentile calculated on the distribution of error frequencies (QQ-SNV(HS-P80)). When comparing QQ-SNV versus the other methods on the plasmid mixture test sets, QQ-SNV(D) performed similarly to the existing approaches. QQ-SNV(HS) was more sensitive on all test sets but with more false positives. QQ-SNV(HS-P80) was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with lowest spiked-in true frequency of 0.5%, QQ-SNV(HS-P80) revealed a sensitivity of 100% (vs. 40-60% for the existing methods) and a specificity of 100% (vs. 98.0-99.7% for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets.
Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5% were consistently detected by QQ-SNV(HS-P80) from different generations of Illumina sequencers. We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data.
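    The three calling modes described above differ only in the probability cutoff and an optional frequency filter, which can be sketched as follows. This is an illustration of the thresholding logic only, not the QQ-SNV code: the classifier that produces the SNV probability is taken as given, the record fields (`prob`, `freq`), the function names, the nearest-rank percentile, and the use of "greater than or equal" at each cutoff are all assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def call_snvs(candidates, mode="D", error_freqs=None):
    """Filter candidate SNVs according to one of the three modes.

    candidates  -- list of dicts with 'prob' (estimated SNV probability)
                   and 'freq' (observed variant frequency); hypothetical fields
    mode        -- 'D' (cutoff 0.5), 'HS' (cutoff 0.0001) or 'HS-P80'
    error_freqs -- error frequencies used for the 80th-percentile filter
    """
    if mode not in ("D", "HS", "HS-P80"):
        raise ValueError("unknown mode: " + mode)
    cutoff = 0.5 if mode == "D" else 0.0001
    called = [c for c in candidates if c["prob"] >= cutoff]
    if mode == "HS-P80":
        # Overrule calls whose frequency falls below the 80th percentile
        # of the error-frequency distribution.
        floor = percentile(error_freqs, 80)
        called = [c for c in called if c["freq"] >= floor]
    return called
```

    Lowering the cutoff admits more candidates (higher sensitivity, more false positives), and the percentile filter then removes the calls that are indistinguishable from background error, which matches the trade-off the abstract reports.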

  8. Efficient T-cell receptor signaling requires a high-affinity interaction between the Gads C-SH3 domain and the SLP-76 RxxK motif.

    PubMed

    Seet, Bruce T; Berry, Donna M; Maltzman, Jonathan S; Shabason, Jacob; Raina, Monica; Koretzky, Gary A; McGlade, C Jane; Pawson, Tony

    2007-02-07

    The relationship between the binding affinity and specificity of modular interaction domains is potentially important in determining biological signaling responses. In signaling from the T-cell receptor (TCR), the Gads C-terminal SH3 domain binds a core RxxK sequence motif in the SLP-76 scaffold. We show that residues surrounding this motif are largely optimized for binding the Gads C-SH3 domain resulting in a high-affinity interaction (K_D = 8-20 nM) that is essential for efficient TCR signaling in Jurkat T cells, since Gads-mediated signaling declines with decreasing affinity. Furthermore, the SLP-76 RxxK motif has evolved a very high specificity for the Gads C-SH3 domain. However, TCR signaling in Jurkat cells is tolerant of potential SLP-76 crossreactivity, provided that very high-affinity binding to the Gads C-SH3 domain is maintained. These data provide a quantitative argument that the affinity of the Gads C-SH3 domain for SLP-76 is physiologically important and suggest that the integrity of TCR signaling in vivo is sustained both by strong selection of SLP-76 for the Gads C-SH3 domain and by a capacity to buffer intrinsic crossreactivity.

  9. Rapid authentication of the precious herb saffron by loop-mediated isothermal amplification (LAMP) based on internal transcribed spacer 2 (ITS2) sequence

    PubMed Central

    Zhao, Mingming; Shi, Yuhua; Wu, Lan; Guo, Licheng; Liu, Wei; Xiong, Chao; Yan, Song; Sun, Wei; Chen, Shilin

    2016-01-01

    Saffron is one of the most expensive species of Chinese herbs and has been subjected to various types of adulteration because of its high price and limited production. The present study introduces a loop-mediated isothermal amplification (LAMP) technique for the differentiation of saffron from its adulterants. This novel technique is sensitive, efficient and simple. Six specific LAMP primers were designed on the basis of the nucleotide sequence of the internal transcribed spacer 2 (ITS2) nuclear ribosomal DNA of Crocus sativus. All LAMP amplifications were performed successfully, and visual detection occurred within 60 min at isothermal conditions of 65 °C. The results indicated that the LAMP primers are accurate and highly specific for the discrimination of saffron from its adulterants. In particular, 10 fg of genomic DNA was determined to be the limit for template accuracy of LAMP in saffron. Thus, the proposed novel, simple, and sensitive LAMP assay is well suited for immediate on-site discrimination of herbal materials. Based on the study, a practical standard operating procedure (SOP) for utilizing the LAMP protocol for herbal authentication is provided. PMID:27146605

  10. Rapid authentication of the precious herb saffron by loop-mediated isothermal amplification (LAMP) based on internal transcribed spacer 2 (ITS2) sequence.

    PubMed

    Zhao, Mingming; Shi, Yuhua; Wu, Lan; Guo, Licheng; Liu, Wei; Xiong, Chao; Yan, Song; Sun, Wei; Chen, Shilin

    2016-05-05

    Saffron is one of the most expensive species of Chinese herbs and has been subjected to various types of adulteration because of its high price and limited production. The present study introduces a loop-mediated isothermal amplification (LAMP) technique for the differentiation of saffron from its adulterants. This novel technique is sensitive, efficient and simple. Six specific LAMP primers were designed on the basis of the nucleotide sequence of the internal transcribed spacer 2 (ITS2) nuclear ribosomal DNA of Crocus sativus. All LAMP amplifications were performed successfully, and visual detection occurred within 60 min at isothermal conditions of 65 °C. The results indicated that the LAMP primers are accurate and highly specific for the discrimination of saffron from its adulterants. In particular, 10 fg of genomic DNA was determined to be the limit for template accuracy of LAMP in saffron. Thus, the proposed novel, simple, and sensitive LAMP assay is well suited for immediate on-site discrimination of herbal materials. Based on the study, a practical standard operating procedure (SOP) for utilizing the LAMP protocol for herbal authentication is provided.

  11. Characterization of Satellite DNA Sequences from the Commercially Important Marine Rotifers Brachionus rotundiformis and Brachionus plicatilis.

    PubMed

    Boehm; Gibson; Lubzens

    2000-01-01

    This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.

  12. Influence of sequence and size of DNA on packaging efficiency of parvovirus MVM-based vectors.

    PubMed

    Brandenburger, A; Coessens, E; El Bakkouri, K; Velu, T

    1999-05-01

    We have derived a vector from the autonomous parvovirus MVM(p), which expresses human IL-2 specifically in transformed cells (Russell et al., J. Virol 1992;66:2821-2828). Testing the therapeutic potential of these vectors in vivo requires high-titer stocks. Stocks with a titer of 10^9 can be obtained after concentration and purification (Avalosse et al., J. Virol. Methods 1996;62:179-183), but this method requires large culture volumes and cannot easily be scaled up. We wanted to increase the production of recombinant virus at the initial transfection step. Poor vector titers could be due to inadequate genome amplification or to inefficient packaging. Here we show that intracellular amplification of MVM vector genomes is not the limiting factor for vector production. Several vector genomes of different size and/or structure were amplified to an equal extent. Their amplification was also equivalent to that of a cotransfected wild-type genome. We did not observe any interference between vector and wild-type genomes at the level of DNA amplification. Despite equivalent genome amplification, vector titers varied greatly between the different genomes, presumably owing to differences in packaging efficiency. Genomes with a size close to 100% that of wild type were packaged most efficiently with loss of efficiency at lower and higher sizes. However, certain genomes of identical size showed different packaging efficiencies, illustrating the importance of the DNA sequence, and probably its structure.

  13. Virus-Clip: a fast and memory-efficient viral integration site detection tool at single-base resolution with annotation capability.

    PubMed

    Ho, Daniel W H; Sze, Karen M F; Ng, Irene O L

    2015-08-28

    Viral integration into the human genome upon infection is an important risk factor for various human malignancies. We developed a viral integration site detection tool called Virus-Clip, which makes use of information extracted from soft-clipped sequencing reads to identify exact positions of human and virus breakpoints of integration events. With initial read alignment to virus reference genome and streamlined procedures, Virus-Clip delivers a simple, fast and memory-efficient solution to viral integration site detection. Moreover, it can also automatically annotate the integration events with the corresponding affected human genes. Virus-Clip has been verified using whole-transcriptome sequencing data and its detection was validated to have satisfactory sensitivity and specificity. Marked advancement in performance was detected, compared to existing tools. It is applicable to versatile types of data including whole-genome sequencing, whole-transcriptome sequencing, and targeted sequencing. Virus-Clip is available at http://web.hku.hk/~dwhho/Virus-Clip.zip.
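    The idea of reading a breakpoint off a soft-clipped alignment can be illustrated with a short sketch. The abstract does not describe Virus-Clip's implementation, so the code below is a generic example based on SAM-style alignment fields (1-based position, CIGAR string, read sequence); the function name and return format are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar, seq):
    """Return (side, reference_position, clipped_sequence) for each soft clip.

    pos   -- 1-based leftmost aligned position on the reference
    cigar -- CIGAR string, e.g. '60M40S'
    seq   -- read sequence as stored in the alignment record
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    ops = [(n, op) for n, op in ops if op != "H"]  # hard clips carry no sequence
    results = []
    if ops and ops[0][1] == "S":
        # Clip at the read start: the breakpoint is the first aligned base.
        results.append(("left", pos, seq[:ops[0][0]]))
    if len(ops) > 1 and ops[-1][1] == "S":
        # Clip at the read end: the breakpoint is the last aligned base.
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        results.append(("right", pos + ref_len - 1, seq[-ops[-1][0]:]))
    return results
```

    With reads aligned to the virus reference first, as the abstract describes, the reported position would be the virus-side breakpoint, and the clipped sequence is the portion one would then map to the human genome to find the matching human-side breakpoint.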

  14. The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

    PubMed Central

    Mello, C.V.; Clayton, D.F.

    2014-01-01

    High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curations, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907

  15. NEBNext Direct: A Novel, Rapid, Hybridization-Based Approach for the Capture and Library Conversion of Genomic Regions of Interest.

    PubMed

    Emerman, Amy B; Bowman, Sarah K; Barry, Andrew; Henig, Noa; Patel, Kruti M; Gardner, Andrew F; Hendrickson, Cynthia L

    2017-07-05

    Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
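    The role of unique molecular identifiers (UMIs) in filtering PCR duplicates can be sketched in a few lines. This is a generic illustration and not part of the NEBNext Direct workflow or software: the read fields (`chrom`, `start`, `strand`, `umi`) and the function name are hypothetical, and only exact UMI matches are treated as duplicates.

```python
def filter_pcr_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand, umi) key.

    Reads sharing a mapping position AND a UMI are assumed to be PCR copies
    of one original molecule; reads sharing only the position are kept,
    because they may come from distinct molecules.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique
```

    Without the UMI in the key, every pair of reads starting at the same position would be collapsed, which undercounts true coverage in deeply sequenced targeted panels. Production tools typically also tolerate sequencing errors within the UMI, which this sketch does not attempt.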

  16. Structure based re-design of the binding specificity of anti-apoptotic Bcl-xL

    PubMed Central

    Chen, T. Scott; Palacios, Hector; Keating, Amy E.

    2012-01-01

    Many native proteins are multi-specific and interact with numerous partners, which can confound analysis of their functions. Protein design provides a potential route to generating synthetic variants of native proteins with more selective binding profiles. Re-designed proteins could be used as research tools, diagnostics or therapeutics. In this work, we used a library screening approach to re-engineer the multi-specific anti-apoptotic protein Bcl-xL to remove its interactions with many of its binding partners, making it a high affinity and selective binder of the BH3 region of pro-apoptotic protein Bad. To overcome the enormity of the potential Bcl-xL sequence space, we developed and applied a computational/experimental framework that used protein structure information to generate focused combinatorial libraries. Sequence features were identified using structure-based modeling, and an optimization algorithm based on integer programming was used to select degenerate codons that maximally covered these features. A constraint on library size was used to ensure thorough sampling. Using yeast surface display to screen a designed library of Bcl-xL variants, we successfully identified a protein with ~1,000-fold improvement in binding specificity for the BH3 region of Bad over the BH3 region of Bim. Although negative design was targeted only against the BH3 region of Bim, the best re-designed protein was globally specific against binding to 10 other peptides corresponding to native BH3 motifs. Our design framework demonstrates an efficient route to highly specific protein binders and may readily be adapted for application to other design problems. PMID:23154169
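    To make the notion of a degenerate codon concrete, the sketch below expands an IUPAC degenerate codon into the codons and residues it encodes, using the standard genetic code. It illustrates only what a degenerate codon covers; it is not the authors' integer-programming selection procedure, and the function names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate):
    """List every concrete codon represented by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate.upper()))]


def encoded_residues(degenerate):
    """Set of amino acids ('*' = stop) encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand_codon(degenerate)}


if __name__ == "__main__":
    print(len(expand_codon("NNK")), sorted(encoded_residues("NNK")))
```

    The number of concrete codons per position multiplies across positions to give the library size, which is why a constraint on library size forces a trade-off between covering the desired residues and keeping the library small enough to sample thoroughly.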

  17. Development and validation of real-time PCR screening methods for detection of cry1A.105 and cry2Ab2 genes in genetically modified organisms.

    PubMed

    Dinon, Andréia Z; Prins, Theo W; van Dijk, Jeroen P; Arisi, Ana Carolina M; Scholtens, Ingrid M J; Kok, Esther J

    2011-05-01

    Primers and probes were developed for the element-specific detection of cry1A.105 and cry2Ab2 genes, based on their DNA sequence as present in GM maize MON89034. Cry genes are present in many genetically modified (GM) plants and they are important targets for developing GMO element-specific detection methods. Element-specific methods can be of use to screen for the presence of GMOs in food and feed supply chains. Moreover, a combination of GMO elements may indicate the potential presence of unapproved GMOs (UGMs). Primer-probe combinations were evaluated in terms of specificity, efficiency and limit of detection. Except for specificity, the complete experiment was performed in 9 PCR runs, on 9 different days and by testing 8 DNA concentrations. The results showed a high specificity and efficiency for cry1A.105 and cry2Ab2 detection. The limit of detection was between 0.05 and 0.01 ng DNA per PCR reaction for both assays. These data confirm the applicability of these new primer-probe combinations for element detection that can contribute to the screening for GM and UGM crops in food and feed samples.

  18. The genome sequence of the model ascomycete fungus Podospora anserina

    PubMed Central

    Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne GJ; Henrissat, Bernard; Khoury, Riyad EL; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

    2008-01-01

    Background: The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. Results: We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. Conclusion: The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope. PMID:18460219

  19. Global Analysis of Transcription Factor-Binding Sites in Yeast Using ChIP-Seq

    PubMed Central

    Lefrançois, Philippe; Gallagher, Jennifer E. G.; Snyder, Michael

    2016-01-01

    Transcription factors influence gene expression through their ability to bind DNA at specific regulatory elements. Specific DNA-protein interactions can be isolated through the chromatin immunoprecipitation (ChIP) procedure, in which DNA fragments bound by the protein of interest are recovered. ChIP is followed by high-throughput DNA sequencing (Seq) to determine the genomic provenance of ChIP DNA fragments and their relative abundance in the sample. This chapter describes a ChIP-Seq strategy adapted for budding yeast to enable the genome-wide characterization of binding sites of transcription factors (TFs) and other DNA-binding proteins in an efficient and cost-effective way. Yeast strains with epitope-tagged TFs are most commonly used for ChIP-Seq, along with their matching untagged control strains. The initial step of ChIP involves the cross-linking of DNA and proteins. Next, yeast cells are lysed and sonicated to shear chromatin into smaller fragments. An antibody against an epitope-tagged TF is used to pull down chromatin complexes containing DNA and the TF of interest. DNA is then purified and proteins degraded. Specific barcoded adapters for multiplex DNA sequencing are ligated to ChIP DNA. Short DNA sequence reads (28–36 base pairs) are parsed according to the barcode and aligned against the yeast reference genome, thus generating a nucleotide-resolution map of transcription factor-binding sites and their occupancy. PMID:25213249
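
    The barcode-parsing step mentioned at the end of the abstract can be sketched as follows. This is a minimal illustration, assuming the barcode sits at the 5' start of each read and must match exactly; the barcode sequences and sample names are hypothetical, and the chapter's actual protocol may differ.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by exact 5' barcode match and strip the barcode.

    reads:    iterable of read sequences (strings)
    barcodes: dict mapping barcode sequence -> sample name
    Reads without a known barcode are collected under 'unassigned'.
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


# Hypothetical barcodes for a tagged strain and its untagged control.
barcodes = {"ACGT": "tagged_TF", "TGCA": "untagged_control"}
reads = ["ACGTGGGTTT", "TGCAAAACCC", "GGGGAAAA"]
print(demultiplex(reads, barcodes))
```

    After this step, the trimmed reads for each sample would be passed to an aligner against the yeast reference genome.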

  20. Vander Lugt correlation of DNA sequence data

    NASA Astrophysics Data System (ADS)

    Christens-Barry, William A.; Hawk, James F.; Martin, James C.

    1990-12-01

    DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt [1] architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid [2] are performed.
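
    A digital analogue of the matched-filter idea can be sketched as a cross-correlation over one-hot encoded bases: each of the four bases forms its own binary channel, the channel correlations are summed, and a correlation peak equal to the target length marks an exact match. This is only an illustration of the correlation principle, not a simulation of the optical architecture; the encoding and function names are assumptions.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as four binary channels, one per base."""
    return [[1 if ch == base else 0 for ch in seq] for base in BASES]


def correlate(sequence, target):
    """Summed per-channel cross-correlation of target against sequence.

    Returns one score per alignment offset; the score counts the
    positions at which target and sequence agree.
    """
    s, t = one_hot(sequence), one_hot(target)
    n, m = len(sequence), len(target)
    scores = []
    for shift in range(n - m + 1):
        score = sum(
            s[c][shift + i] * t[c][i] for c in range(4) for i in range(m)
        )
        scores.append(score)
    return scores


def find_matches(sequence, target):
    """Offsets where the correlation peak equals the target length."""
    return [
        i for i, score in enumerate(correlate(sequence, target))
        if score == len(target)
    ]


print(find_matches("GATTACAGATTACA", "TTAC"))
```

    In an optical correlator all offsets are evaluated in parallel; the loop over shift above is the serial counterpart.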

  1. Pooled-DNA Sequencing for Elucidating New Genomic Risk Factors, Rare Variants Underlying Alzheimer's Disease.

    PubMed

    Jin, Sheng Chih; Benitez, Bruno A; Deming, Yuetiva; Cruchaga, Carlos

    2016-01-01

    Analyses of genome-wide association studies (GWAS) for complex disorders usually identify common variants with a relatively small effect size that only explain a small proportion of phenotypic heritability. Several studies have suggested that a significant fraction of heritability may be explained by low-frequency (minor allele frequency (MAF) of 1-5%) and rare variants that are not contained in the commercial GWAS genotyping arrays (Schork et al., Curr Opin Genet Dev 19:212, 2009). Rare variants can also have relatively large effects on risk for developing human diseases or disease phenotype (Cruchaga et al., PLoS One 7:e31039, 2012). However, it is necessary to perform next-generation sequencing (NGS) studies in a large population (>4,000 samples) to detect a significant rare-variant association. Several NGS methods, such as custom capture sequencing and amplicon-based sequencing, are designed to screen a small proportion of the genome, but most of these methods are limited in the number of samples that can be multiplexed (i.e., most sequencing kits only provide 96 distinct indices). Additionally, the sequencing library preparation for 4,000 samples remains expensive, and thus conducting NGS studies with the aforementioned methods is not feasible for most research laboratories. The need for low-cost, large-scale rare-variant detection makes pooled-DNA sequencing an ideally efficient and cost-effective technique to identify rare variants in target regions by sequencing hundreds to thousands of samples. Our recent work has demonstrated that pooled-DNA sequencing can accurately detect rare variants in targeted regions in multiple DNA samples with high sensitivity and specificity (Jin et al., Alzheimers Res Ther 4:34, 2012). In these studies we used a well-established pooled-DNA sequencing approach and a computational package, SPLINTER (short indel prediction by large deviation inference and nonlinear true frequency estimation by recursion) (Vallania et al., Genome Res 20:1711, 2010), for accurate identification of rare variants in large DNA pools. Given an average sequencing coverage of 30× per haploid genome, SPLINTER can detect rare variants and short indels up to 4 base pairs (bp) with high sensitivity and specificity (up to 1 haploid allele in a pool as large as 500 individuals). Step-by-step instructions on how to conduct pooled-DNA sequencing experiments and data analyses are described in this chapter.
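
    The coverage figures quoted above imply a simple piece of arithmetic, sketched here under the assumption of diploid individuals contributing equal amounts of DNA to the pool. The function name pool_design is hypothetical, and the calculation says nothing about SPLINTER's statistical model; it only shows how many reads a single variant allele is expected to contribute.

```python
def pool_design(individuals, coverage_per_haploid, ploidy=2):
    """Back-of-the-envelope numbers for a pooled-DNA sequencing design.

    Assumes every individual contributes equally to the pool.
    """
    haploid_genomes = individuals * ploidy
    total_depth = haploid_genomes * coverage_per_haploid
    singleton_frequency = 1 / haploid_genomes
    # Reads expected from an allele present on one haploid genome.
    expected_variant_reads = total_depth / haploid_genomes
    return {
        "haploid_genomes": haploid_genomes,
        "total_depth": total_depth,
        "singleton_frequency": singleton_frequency,
        "expected_variant_reads": expected_variant_reads,
    }


# 500 individuals at 30x per haploid genome, as quoted in the abstract.
design = pool_design(individuals=500, coverage_per_haploid=30)
for key, value in design.items():
    print(key, value)
```

    For the 500-individual pool, a singleton allele has a frequency of 1 in 1,000 and is expected to appear in about 30 of the 30,000 reads covering its position, which is why the variant caller must separate such signals from sequencing error.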

  2. RNAi screen for rapid therapeutic target identification in leukemia patients

    PubMed Central

    Tyner, Jeffrey W.; Deininger, Michael W.; Loriaux, Marc M.; Chang, Bill H.; Gotlib, Jason R.; Willis, Stephanie G.; Erickson, Heidi; Kovacsovics, Tibor; O'Hare, Thomas; Heinrich, Michael C.; Druker, Brian J.

    2009-01-01

    Targeted therapy has vastly improved outcomes in certain types of cancer. Extension of this paradigm across a broad spectrum of malignancies will require an efficient method to determine the molecular vulnerabilities of cancerous cells. Improvements in sequencing technology will soon enable high-throughput sequencing of entire genomes of cancer patients; however, determining the relevance of identified sequence variants will require complementary functional analyses. Here, we report an RNAi-assisted protein target identification (RAPID) technology that individually assesses targeting of each member of the tyrosine kinase gene family. We demonstrate that RAPID screening of primary leukemia cells from 30 patients identifies targets that are critical to survival of the malignant cells from 10 of these individuals. We identify known, activating mutations in JAK2 and K-RAS, as well as patient-specific sensitivity to down-regulation of FLT1, CSF1R, PDGFR, ROR1, EPHA4/5, JAK1/3, LMTK3, LYN, FYN, PTK2B, and N-RAS. We also describe a previously undescribed, somatic, activating mutation in the thrombopoietin receptor that is sensitive to down-stream pharmacologic inhibition. Hence, the RAPID technique can quickly identify molecular vulnerabilities in malignant cells. Combination of this technique with whole-genome sequencing will represent an ideal tool for oncogenic target identification such that specific therapies can be matched with individual patients. PMID:19433805

  3. Identifying risk factors for exposure to culturable allergenic moulds in energy efficient homes by using highly specific monoclonal antibodies.

    PubMed

    Sharpe, Richard A; Cocq, Kate Le; Nikolaou, Vasilis; Osborne, Nicholas J; Thornton, Christopher R

    2016-01-01

    The aim of this study was to determine the accuracy of monoclonal antibodies (mAbs) in identifying culturable allergenic fungi present in visible mould growth in energy efficient homes, and to identify risk factors for exposure to these known allergenic fungi. Swabs were taken from fungal contaminated surfaces and culturable yeasts and moulds isolated by using mycological culture. Soluble antigens from cultures were tested by ELISA using mAbs specific to the culturable allergenic fungi Aspergillus and Penicillium spp., Ulocladium, Alternaria, and Epicoccum spp., Cladosporium spp., Fusarium spp., and Trichoderma spp. Diagnostic accuracies of the ELISA tests were determined by sequencing of the internally transcribed spacer 1 (ITS1)-5.8S-ITS2-encoding regions of recovered fungi following ELISA. There was 100% concordance between the two methods, with ELISAs providing genus-level identity and ITS sequencing providing species-level identities (210 out of 210 tested). Species of Aspergillus/Penicillium, Cladosporium, Ulocladium/Alternaria/Epicoccum, Fusarium and Trichoderma were detected in 82% of the samples. The presence of condensation was associated with an increased risk of surfaces being contaminated by Aspergillus/Penicillium spp. and Cladosporium spp., whereas moisture within the building fabric (water ingress/rising damp) was only associated with increased risk of Aspergillus/Penicillium spp. Property type and energy efficiency levels were found to moderate the risk of indoor surfaces becoming contaminated with Aspergillus/Penicillium and Cladosporium which in turn was modified by the presence of condensation, water ingress and rising damp, consistent with previous literature. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. High efficiency family shuffling based on multi-step PCR and in vivo DNA recombination in yeast: statistical and functional analysis of a combinatorial library between human cytochrome P450 1A1 and 1A2.

    PubMed

    Abécassis, V; Pompon, D; Truan, G

    2000-10-15

    The design of a family shuffling strategy (CLERY: Combinatorial Libraries Enhanced by Recombination in Yeast) associating PCR-based and in vivo recombination and expression in yeast is described. This strategy was tested using human cytochrome P450 CYP1A1 and CYP1A2 as templates, which share 74% nucleotide sequence identity. Construction of highly shuffled libraries of mosaic structures and reduction of parental gene contamination were two major goals. Library characterization involved multiprobe hybridization on DNA macro-arrays. The statistical analysis of randomly selected clones revealed a high proportion of chimeric genes (86%) and a homogeneous representation of the parental contribution among the sequences (55.8 +/- 2.5% for parental sequence 1A2). A microtiter plate screening system was designed to achieve colorimetric detection of polycyclic hydrocarbon hydroxylation by transformed yeast cells. Full sequences of five randomly picked and five functionally selected clones were analyzed. Results confirmed the shuffling efficiency and allowed calculation of the average length of sequence exchange and mutation rates. The efficient and statistically representative generation of mosaic structures by this type of family shuffling in a yeast expression system constitutes a novel and promising tool for structure-function studies and tuning enzymatic activities of multicomponent eucaryote complexes involving non-soluble enzymes.

  5. Alignment-free genome tree inference by learning group-specific distance metrics.

    PubMed

    Patil, Kaustubh R; McHardy, Alice C

    2013-01-01

    Understanding the evolutionary relationships between organisms is vital for their in-depth study. Gene-based methods are often used to infer such relationships, which are not without drawbacks. One can now attempt to use genome-scale information, because of the ever increasing number of genomes available. This opportunity also presents a challenge in terms of computational efficiency. Two fundamentally different methods are often employed for sequence comparisons, namely alignment-based and alignment-free methods. Alignment-free methods rely on the genome signature concept and provide a computationally efficient way that is also applicable to nonhomologous sequences. The genome signature contains evolutionary signal as it is more similar for closely related organisms than for distantly related ones. We used genome-scale sequence information to infer taxonomic distances between organisms without additional information such as gene annotations. We propose a method to improve genome tree inference by learning specific distance metrics over the genome signature for groups of organisms with similar phylogenetic, genomic, or ecological properties. Specifically, our method learns a Mahalanobis metric for a set of genomes and a reference taxonomy to guide the learning process. By applying this method to more than a thousand prokaryotic genomes, we showed that, indeed, better distance metrics could be learned for most of the 18 groups of organisms tested here. Once a group-specific metric is available, it can be used to estimate the taxonomic distances for other sequenced organisms from the group. This study also presents a large scale comparison between 10 methods--9 alignment-free and 1 alignment-based.
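
    The two ingredients named in the abstract, a genome signature and a Mahalanobis metric, can be sketched as below. The signature is taken to be a normalized k-mer frequency vector (dinucleotides here, to keep the example small), and the distance is d(x, y) = sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M. How M is learned from a reference taxonomy is not shown; the matrix passed in is a placeholder assumption, and all names are hypothetical.

```python
import math
from itertools import product


def signature(seq, k=2):
    """Normalized k-mer frequency vector over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[kmer] / total for kmer in kmers]


def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)); M must be positive semidefinite."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    Md = [sum(M[i][j] * d[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d[i] * Md[i] for i in range(n)))


def identity(n):
    """Identity matrix; with it the metric reduces to Euclidean distance."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Toy sequences; a learned group-specific M would replace identity(16).
a = signature("ACGTACGTACGGTTAACC")
b = signature("GGGCCCGGGCCCGCGCGC")
print(mahalanobis(a, b, identity(len(a))))
```

    A group-specific metric reweights and couples the k-mer dimensions so that genomes from the same taxon end up closer than under the unweighted Euclidean distance.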

  6. XRN1 Is a Species-Specific Virus Restriction Factor in Yeasts

    PubMed Central

    Rowley, Paul A.; Ho, Brandon; Bushong, Sarah; Johnson, Arlen; Sawyer, Sara L.

    2016-01-01

    In eukaryotes, the degradation of cellular mRNAs is accomplished by Xrn1 and the cytoplasmic exosome. Because viral RNAs often lack canonical caps or poly-A tails, they can also be vulnerable to degradation by these host exonucleases. Yeast lack sophisticated mechanisms of innate and adaptive immunity, but do use RNA degradation as an antiviral defense mechanism. We find a highly refined, species-specific relationship between Xrn1p and the “L-A” totiviruses of different Saccharomyces yeast species. We show that the gene XRN1 has evolved rapidly under positive natural selection in Saccharomyces yeast, resulting in high levels of Xrn1p protein sequence divergence from one yeast species to the next. We also show that these sequence differences translate to differential interactions with the L-A virus, where Xrn1p from S. cerevisiae is most efficient at controlling the L-A virus that chronically infects S. cerevisiae, and Xrn1p from S. kudriavzevii is most efficient at controlling the L-A-like virus that we have discovered within S. kudriavzevii. All Xrn1p orthologs are equivalent in their interaction with another virus-like parasite, the Ty1 retrotransposon. Thus, Xrn1p appears to co-evolve with totiviruses to maintain its potent antiviral activity and limit viral propagation in Saccharomyces yeasts. We demonstrate that Xrn1p physically interacts with the Gag protein encoded by the L-A virus, suggesting a host-virus interaction that is more complicated than just Xrn1p-mediated nucleolytic digestion of viral RNAs. PMID:27711183

  7. Identification and correction of systematic error in high-throughput sequence data

    PubMed Central

    2011-01-01

    Background: A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results: We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions: Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972

  8. zUMIs - A fast and flexible pipeline to process RNA sequencing data with UMIs.

    PubMed

    Parekh, Swati; Ziegenhain, Christoph; Vieth, Beate; Enard, Wolfgang; Hellmann, Ines

    2018-06-01

    Single-cell RNA-sequencing (scRNA-seq) experiments typically analyze hundreds or thousands of cells after amplification of the cDNA. The high throughput is made possible by the early introduction of sample-specific bar codes (BCs), and the amplification bias is alleviated by unique molecular identifiers (UMIs). Thus, the ideal analysis pipeline for scRNA-seq data needs to efficiently tabulate reads according to both BC and UMI. zUMIs is a pipeline that can handle both known and random BCs and also efficiently collapse UMIs, either just for exon mapping reads or for both exon and intron mapping reads. If BC annotation is missing, zUMIs can accurately detect intact cells from the distribution of sequencing reads. Another unique feature of zUMIs is the adaptive downsampling function that facilitates dealing with hugely varying library sizes but also allows the user to evaluate whether the library has been sequenced to saturation. To illustrate the utility of zUMIs, we analyzed a single-nucleus RNA-seq dataset and show that more than 35% of all reads map to introns. Also, we show that these intronic reads are informative about expression levels, significantly increasing the number of detected genes and improving the cluster resolution. zUMIs' flexibility makes it possible to accommodate data generated with any of the major scRNA-seq protocols that use BCs and UMIs, and it is the most feature-rich, fast, and user-friendly pipeline to process such scRNA-seq data.
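
    The tabulation by BC and UMI described above can be illustrated with a small sketch: reads sharing the same barcode, gene and UMI are treated as amplification copies of one molecule and counted once. This assumes exact UMI matching and reads already assigned to genes; it is not zUMIs' own implementation, and the function name count_molecules is hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Collapse reads to unique molecules per cell barcode and gene.

    reads: iterable of (barcode, umi, gene) tuples.
    Returns {barcode: {gene: number_of_distinct_umis}}.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        umis_seen[(barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (barcode, gene), umis in umis_seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


# Toy input: the first two reads are PCR duplicates of one molecule.
reads = [
    ("CELL1", "AAAT", "geneA"),
    ("CELL1", "AAAT", "geneA"),
    ("CELL1", "CCGT", "geneA"),
    ("CELL1", "AAAT", "geneB"),
    ("CELL2", "AAAT", "geneA"),
]
print(count_molecules(reads))
```

    Counting distinct UMIs rather than reads is what removes the amplification bias mentioned in the abstract.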

  9. BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

    PubMed Central

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to get the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proven for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Moreover, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  10. Targeted Next-generation Sequencing and Bioinformatics Pipeline to Evaluate Genetic Determinants of Constitutional Disease.

    PubMed

    Dilliott, Allison A; Farhan, Sali M K; Ghani, Mahdi; Sato, Christine; Liang, Eric; Zhang, Ming; McIntyre, Adam D; Cao, Henian; Racacho, Lemuel; Robinson, John F; Strong, Michael J; Masellis, Mario; Bulman, Dennis E; Rogaeva, Ekaterina; Lang, Anthony; Tartaglia, Carmela; Finger, Elizabeth; Zinman, Lorne; Turnbull, John; Freedman, Morris; Swartz, Rick; Black, Sandra E; Hegele, Robert A

    2018-04-04

    Next-generation sequencing (NGS) is quickly revolutionizing how research into the genetic determinants of constitutional disease is performed. The technique is highly efficient with millions of sequencing reads being produced in a short time span and at relatively low cost. Specifically, targeted NGS is able to focus investigations to genomic regions of particular interest based on the disease of study. Not only does this further reduce costs and increase the speed of the process, but it lessens the computational burden that often accompanies NGS. Although targeted NGS is restricted to certain regions of the genome, preventing identification of potential novel loci of interest, it can be an excellent technique when faced with a phenotypically and genetically heterogeneous disease, for which there are previously known genetic associations. Because of the complex nature of the sequencing technique, it is important to closely adhere to protocols and methodologies in order to achieve sequencing reads of high coverage and quality. Further, once sequencing reads are obtained, a sophisticated bioinformatics workflow is utilized to accurately map reads to a reference genome, to call variants, and to ensure the variants pass quality metrics. Variants must also be annotated and curated based on their clinical significance, which can be standardized by applying the American College of Medical Genetics and Genomics Pathogenicity Guidelines. The methods presented herein will display the steps involved in generating and analyzing NGS data from a targeted sequencing panel, using the ONDRISeq neurodegenerative disease panel as a model, to identify variants that may be of clinical significance.

  11. Detecting novel genes with sparse arrays

    PubMed Central

    Haiminen, Niina; Smit, Bart; Rautio, Jari; Vitikainen, Marika; Wiebe, Marilyn; Martinez, Diego; Chee, Christine; Kunkel, Joe; Sanchez, Charles; Nelson, Mary Anne; Pakula, Tiina; Saloheimo, Markku; Penttilä, Merja; Kivioja, Teemu

    2014-01-01

    Species-specific genes play an important role in defining the phenotype of an organism. However, current gene prediction methods can only efficiently find genes that share features such as sequence similarity or general sequence characteristics with previously known genes. Novel sequencing methods and tiling arrays can be used to find genes without prior information and they have demonstrated that novel genes can still be found from extensively studied model organisms. Unfortunately, these methods are expensive and thus are not easily applicable, e.g., to finding genes that are expressed only in very specific conditions. We demonstrate a method for finding novel genes with sparse arrays, applying it on the 33.9 Mb genome of the filamentous fungus Trichoderma reesei. Our computational method does not require normalisations between arrays and it takes into account the multiple-testing problem typical for analysis of microarray data. In contrast to tiling arrays, that use overlapping probes, only one 25mer microarray oligonucleotide probe was used for every 100 b. Thus, only relatively little space on a microarray slide was required to cover the intergenic regions of a genome. The analysis was done as a by-product of a conventional microarray experiment with no additional costs. We found at least 23 good candidates for novel transcripts that could code for proteins and all of which were expressed at high levels. Candidate genes were found to neighbour ire1 and cre1 and many other regulatory genes. Our simple, low-cost method can easily be applied to finding novel species-specific genes without prior knowledge of their sequence properties. PMID:20691772

  12. Improvement of energy efficiency via spectrum optimization of excitation sequence for multichannel simultaneously triggered airborne sonar system

    NASA Astrophysics Data System (ADS)

    Meng, Qing-Hao; Yao, Zhen-Jing; Peng, Han-Yang

    2009-12-01

    Both the energy efficiency and correlation characteristics are important in airborne sonar systems to realize multichannel ultrasonic transducers working together. High energy efficiency can increase echo energy and measurement range, and sharp autocorrelation and flat cross correlation can help eliminate cross-talk among multichannel transducers. This paper addresses energy efficiency optimization under the premise that cross-talk between different sonar transducers can be avoided. The nondominated sorting genetic algorithm-II is applied to optimize both the spectrum and correlation characteristics of the excitation sequence. The central idea of the spectrum optimization is to distribute most of the energy of the excitation sequence within the frequency band of the sonar transducer; thus, less energy is filtered out by the transducers. Real experiments show that a sonar system consisting of eight-channel Polaroid 600 series electrostatic transducers excited with 2 ms optimized pulse-position-modulation sequences can work together without cross-talk and can measure distances up to 650 cm with maximal 1% relative error.

  13. MetaGenSense: A web-application for analysis and exploration of high throughput sequencing metagenomic data

    PubMed Central

    Denis, Jean-Baptiste; Vandenbogaert, Mathias; Caro, Valérie

    2016-01-01

    The detection and characterization of emerging infectious agents has been a continuing public health concern. High Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most commonly, those resources are implemented through a dedicated Laboratory Information Management System (LIMS), solely to provide perspective regarding the available information. We developed an easily deployable web-interface, facilitating management and bioinformatics analysis of metagenomics data samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and eventually classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows and facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing users to determine their relative abundance and associate them with the most closely related organism or pathogen. The user-friendly Django-based interface associates the users’ input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power). Galaxy is used to handle and analyze the user’s input data, from loading and indexing to mapping, assembly and DB searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy’s main features. Metadata about samples, runs, as well as the workflow results are stored in the LIMS. For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration of intuitive exploratory tools, like Krona for representation of taxonomic classification, can be achieved very easily. In the spirit of Galaxy, the interface enables the sharing of scientific results with fellow team members. PMID:28451381

  14. MetaGenSense: A web-application for analysis and exploration of high throughput sequencing metagenomic data.

    PubMed

    Correia, Damien; Doppelt-Azeroual, Olivia; Denis, Jean-Baptiste; Vandenbogaert, Mathias; Caro, Valérie

    2015-01-01

    The detection and characterization of emerging infectious agents has been a continuing public health concern. High Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most often, those resources are implemented through a dedicated Laboratory Information Management System (LIMS), solely to provide perspective regarding the available information. We developed an easily deployable web interface that facilitates management and bioinformatics analysis of metagenomics data samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and, eventually, classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows and facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing users to determine their relative abundance and associate them with the most closely related organism or pathogen. The user-friendly Django-based interface associates the users' input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power). Galaxy is used to handle and analyze the user's input data, from loading through indexing, mapping, assembly and DB searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy's main features. Metadata about samples, runs, as well as the workflow results are stored in the LIMS. 
For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration of intuitive exploratory tools, like Krona for representation of taxonomic classification, can be achieved very easily. In the spirit of Galaxy, the interface enables the sharing of scientific results with fellow team members.
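
The application-to-Galaxy interaction described in this record could look roughly like the sketch below. It assumes an object `gi` that behaves like BioBlend's `GalaxyInstance`; the helper name `run_metagenomic_workflow`, the default history name, and the workflow input index "0" are hypothetical and do not come from the record.

```python
def run_metagenomic_workflow(gi, fastq_path, workflow_id, history_name="sample-run"):
    """Upload one read file and invoke a Galaxy workflow on it.

    `gi` is expected to behave like bioblend.galaxy.GalaxyInstance.
    """
    # Create a fresh history to hold this sample's data.
    history = gi.histories.create_history(name=history_name)
    # Upload the reads; the response lists the new dataset(s) under "outputs".
    upload = gi.tools.upload_file(fastq_path, history["id"])
    dataset_id = upload["outputs"][0]["id"]
    # Map workflow input "0" (a hypothetical index) to the uploaded dataset.
    inputs = {"0": {"id": dataset_id, "src": "hda"}}
    invocation = gi.workflows.invoke_workflow(
        workflow_id, inputs=inputs, history_id=history["id"]
    )
    return {
        "history_id": history["id"],
        "dataset_id": dataset_id,
        "invocation_id": invocation["id"],
    }
```

With BioBlend installed, `gi` would typically be built as `GalaxyInstance(url=..., key=...)` from `bioblend.galaxy`; the URL, API key and workflow ID are site-specific and left open here.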

  15. Comparative sensitivities of functional MRI sequences in detection of local recurrence of prostate carcinoma after radical prostatectomy or external-beam radiotherapy.

    PubMed

    Roy, Catherine; Foudi, Fatah; Charton, Jeanne; Jung, Michel; Lang, Hervé; Saussine, Christian; Jacqmin, Didier

    2013-04-01

    The aim of this retrospective study was to determine the respective accuracies of three types of functional MRI sequences, namely diffusion-weighted imaging (DWI), dynamic contrast-enhanced (DCE) MRI, and 3D (1)H-MR spectroscopy (MRS), in the depiction of local prostate cancer recurrence after two different initial therapy options. From a cohort of 83 patients with suspicion of local recurrence based on prostate-specific antigen (PSA) kinetics who were imaged on a 3-T MRI unit using an identical protocol including the three functional sequences with an endorectal coil, we selected 60 patients (group A, 28 patients who underwent radical prostatectomy; group B, 32 patients who underwent external-beam radiation) who had local recurrence ascertained on the basis of transrectal ultrasound-guided biopsy results and a reduction in PSA level after salvage therapy. All patients presented with a local relapse. Sensitivity with T2-weighted MRI and 3D (1)H-MRS sequences was 57% and 53%, respectively, for group A and 71% and 78%, respectively, for group B. DCE-MRI alone showed a sensitivity of 100% and 96%, respectively, for groups A and B. DWI alone had a higher sensitivity for group B (96%) than for group A (71%). The combination of T2-weighted imaging plus DWI plus DCE-MRI provided a sensitivity as high as 100% in group B. The performance of functional imaging sequences for detecting recurrence differs after radical prostatectomy and external-beam radiotherapy. DCE-MRI is a valid and efficient tool to detect prostate cancer recurrence after radical prostatectomy as well as after external-beam radiotherapy. The combination of DCE-MRI and DWI is highly efficient after radiation therapy. Three-dimensional (1)H-MRS needs to be improved. Even though it is not accurate enough, T2-weighted imaging remains essential for the morphologic analysis of the area.

  16. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

    PubMed Central

    2010-01-01

    Background: In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results: The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also avoids any convolution product of the pattern distributions in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists of concatenating the sequences in the data set into a single sequence. 
Conclusions: Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We conclude with a discussion of our method and its potential improvements. PMID:20205909
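
The problem stated in this record can be illustrated with a deliberately naive baseline. The sketch embeds a single-word pattern in a first-order homogeneous Markov model through a KMP-style deterministic finite automaton, computes the exact count distribution per sequence by dynamic programming, and then combines independent sequences by explicit convolution. It is not the record's Algorithm 1, 2 or 3 (Algorithm 1 in particular is described as avoiding the convolution step), and all function names, the single-word pattern restriction and the dictionary-based model format are assumptions made for illustration.

```python
def build_dfa(pattern, alphabet):
    """KMP-style automaton: state q = length of the longest pattern prefix matched."""
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for q in range(m + 1):
        for a in alphabet:
            text = pattern[:q] + a
            k = min(m, len(text))
            # Shrink k until the length-k prefix of the pattern is a suffix of text.
            while k > 0 and not text.endswith(pattern[:k]):
                k -= 1
            delta[q][a] = k
    return delta


def count_distribution(pattern, n, init, trans):
    """Return [P(N=0), P(N=1), ...] for overlapping occurrences in one sequence
    of length n >= 1 drawn from a first-order Markov chain (init, trans)."""
    alphabet = sorted(init)
    m = len(pattern)
    delta = build_dfa(pattern, alphabet)
    max_count = max(n - m + 1, 0)
    # dist[(last_letter, dfa_state)] = probabilities indexed by count so far
    dist = {}
    for a in alphabet:
        q = delta[0][a]
        vec = [0.0] * (max_count + 1)
        vec[1 if q == m else 0] += init[a]
        dist[(a, q)] = vec
    for _ in range(n - 1):
        new = {}
        for (a, q), vec in dist.items():
            for b in alphabet:
                p = trans[a][b]
                if p == 0.0:
                    continue
                q2 = delta[q][b]
                shift = 1 if q2 == m else 0  # reaching state m means one more hit
                target = new.setdefault((b, q2), [0.0] * (max_count + 1))
                for k, v in enumerate(vec):
                    if v:
                        target[k + shift] += p * v
        dist = new
    return [sum(vec[k] for vec in dist.values()) for k in range(max_count + 1)]


def convolve(p, q):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def count_distribution_in_set(pattern, lengths, init, trans):
    """Exact distribution of the total count over independent sequences."""
    total = [1.0]
    for n in lengths:
        total = convolve(total, count_distribution(pattern, n, init, trans))
    return total
```

Treating each sequence separately, as here, keeps pattern occurrences from spanning sequence boundaries, which is the edge effect the record attributes to the concatenation approximation.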

  17. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data.

    PubMed

    Nuel, Gregory; Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude

    2010-01-26

    In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also avoids any convolution product of the pattern distributions in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists of concatenating the sequences in the data set into a single sequence. 
Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We conclude with a discussion of our method and its potential improvements.

  18. Specific ligands for classical swine fever virus screened from landscape phage display library.

    PubMed

    Yin, Long; Luo, Yuzi; Liang, Bo; Wang, Fei; Du, Min; Petrenko, Valery A; Qiu, Hua-Ji; Liu, Aihua

    2014-09-01

    Classical swine fever (CSF) is a devastating infectious disease caused by classical swine fever virus (CSFV). The screening of CSFV-specific ligands is of great significance for diagnosis and treatment of CSF. Affinity selection from random peptide libraries is an efficient approach to discover ligands with high stability and specificity. Here, we screened phage ligands for the CSFV E2 protein from the f8/8 landscape phage display library by biopanning and obtained four phage clones specific for the E2 protein of CSFV. Viral blocking assays indicated that the phage clone displaying the octapeptide sequence DRATSSNA markedly inhibited CSFV replication in PK-15 cells at a titer of 10^10 transduction units, as evidenced by significantly decreased viral RNA copies and viral titers. The phage-displayed E2-binding peptides have the potential to be developed as antivirals for CSF.

  19. In situ genetic correction of F8 intron 22 inversion in hemophilia A patient-specific iPSCs.

    PubMed

    Wu, Yong; Hu, Zhiqing; Li, Zhuo; Pang, Jialun; Feng, Mai; Hu, Xuyun; Wang, Xiaolin; Lin-Peng, Siyuan; Liu, Bo; Chen, Fangping; Wu, Lingqian; Liang, Desheng

    2016-01-08

    Nearly half of severe Hemophilia A (HA) cases are caused by F8 intron 22 inversion (Inv22). This 0.6-Mb inversion splits the 186-kb F8 into two parts with opposite transcription directions. The inverted 5' part (141 kb) preserves the first 22 exons that are driven by the intrinsic F8 promoter, leading to a truncated F8 transcript due to the lack of the last 627 bp coding sequence of exons 23-26. Here we describe an in situ genetic correction of Inv22 in patient-specific induced pluripotent stem cells (iPSCs). By using TALENs, the 627 bp sequence plus a polyA signal was precisely targeted at the junction of exon 22 and intron 22 via homologous recombination (HR) with high targeting efficiencies of 62.5% and 52.9%. The gene-corrected iPSCs retained a normal karyotype following removal of drug selection cassette using a Cre-LoxP system. Importantly, both F8 transcription and FVIII secretion were rescued in the candidate cell types for HA gene therapy including endothelial cells (ECs) and mesenchymal stem cells (MSCs) derived from the gene-corrected iPSCs. This is the first report of an efficient in situ genetic correction of the large inversion mutation using a strategy of targeted gene addition.

  20. In situ genetic correction of F8 intron 22 inversion in hemophilia A patient-specific iPSCs

    PubMed Central

    Wu, Yong; Hu, Zhiqing; Li, Zhuo; Pang, Jialun; Feng, Mai; Hu, Xuyun; Wang, Xiaolin; Lin-Peng, Siyuan; Liu, Bo; Chen, Fangping; Wu, Lingqian; Liang, Desheng

    2016-01-01

    Nearly half of severe Hemophilia A (HA) cases are caused by F8 intron 22 inversion (Inv22). This 0.6-Mb inversion splits the 186-kb F8 into two parts with opposite transcription directions. The inverted 5′ part (141 kb) preserves the first 22 exons that are driven by the intrinsic F8 promoter, leading to a truncated F8 transcript due to the lack of the last 627 bp coding sequence of exons 23–26. Here we describe an in situ genetic correction of Inv22 in patient-specific induced pluripotent stem cells (iPSCs). By using TALENs, the 627 bp sequence plus a polyA signal was precisely targeted at the junction of exon 22 and intron 22 via homologous recombination (HR) with high targeting efficiencies of 62.5% and 52.9%. The gene-corrected iPSCs retained a normal karyotype following removal of drug selection cassette using a Cre-LoxP system. Importantly, both F8 transcription and FVIII secretion were rescued in the candidate cell types for HA gene therapy including endothelial cells (ECs) and mesenchymal stem cells (MSCs) derived from the gene-corrected iPSCs. This is the first report of an efficient in situ genetic correction of the large inversion mutation using a strategy of targeted gene addition. PMID:26743572

  1. Efficient CRISPR-rAAV engineering of endogenous genes to study protein function by allele-specific RNAi.

    PubMed

    Kaulich, Manuel; Lee, Yeon J; Lönn, Peter; Springer, Aaron D; Meade, Bryan R; Dowdy, Steven F

    2015-04-20

    Gene knockout strategies, RNAi and rescue experiments are all employed to study mammalian gene function. However, the disadvantages of these approaches include loss-of-function adaptation, reduced viability and gene overexpression that rarely matches endogenous levels. Here, we developed an endogenous gene knockdown/rescue strategy that combines RNAi selectivity with a highly efficient CRISPR-directed recombinant Adeno-Associated Virus (rAAV) mediated gene targeting approach to introduce allele-specific mutations plus an allele-selective siRNA Sensitive (siSN) site that allows for studying gene mutations while maintaining endogenous expression and regulation of the gene of interest. CRISPR/Cas9 plus rAAV targeted gene-replacement and introduction of allele-specific RNAi sensitivity mutations in the CDK2 and CDK1 genes resulted in a >85% site-specific recombination of Neo-resistant clones versus ∼8% for rAAV alone. RNAi knockdown of wild type (WT) Cdk2 with siWT in heterozygotic knockin cells resulted in the mutant Cdk2 phenotype cell cycle arrest, whereas allele-specific knockdown of mutant CDK2 with siSN resulted in a wild type phenotype. Together, these observations demonstrate the ability of CRISPR plus rAAV to efficiently recombine a genomic locus and tag it with a selective siRNA sequence that allows for allele-selective phenotypic assays of the gene of interest while it remains expressed and regulated under endogenous control mechanisms.

  2. Peptide de novo sequencing of mixture tandem mass spectra.

    PubMed

    Gorshkov, Vladimir; Hotta, Stéphanie Yuki Kolbeck; Verano-Braga, Thiago; Kjeldsen, Frank

    2016-09-01

    The impact of mixture spectra deconvolution on the performance of four popular de novo sequencing programs was tested using artificially constructed mixture spectra as well as experimental proteomics data. Mixture fragmentation spectra are recognized as a limitation in proteomics because they decrease the identification performance using database search engines. De novo sequencing approaches are expected to be even more sensitive to the reduction in mass spectrum quality resulting from peptide precursor co-isolation and thus prone to false identifications. The deconvolution approach matched complementary b-, y-ions to each precursor peptide mass, which allowed the creation of virtual spectra containing sequence specific fragment ions of each co-isolated peptide. Deconvolution processing resulted in equally efficient identification rates but increased the absolute number of correctly sequenced peptides. The improvement was in the range of 20-35% additional peptide identifications for a HeLa lysate sample. Some correct sequences were identified only using unprocessed spectra; however, the number of these was lower than those where improvement was obtained by mass spectral deconvolution. Tight candidate peptide score distribution and high sensitivity to small changes in the mass spectrum introduced by the employed deconvolution method could explain some of the missing peptide identifications.
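
The matching of complementary b- and y-ions to a precursor mass can be sketched with one standard relation: for singly charged fragments, the m/z values of a complementary pair sum to the neutral peptide mass plus two proton masses. The code below is a minimal illustration under that assumption, not the record's published method; the function names, the greedy two-pointer pairing and the 0.02 Da default tolerance are hypothetical choices.

```python
PROTON = 1.007276  # proton mass in Da


def complementary_pairs(peaks, neutral_mass, tol=0.02):
    """Find pairs of singly charged fragment m/z values whose sum equals
    neutral_mass + 2 * PROTON within `tol` (two-pointer scan, greedy)."""
    target = neutral_mass + 2 * PROTON
    peaks = sorted(peaks)
    pairs = []
    lo, hi = 0, len(peaks) - 1
    while lo < hi:
        total = peaks[lo] + peaks[hi]
        if abs(total - target) <= tol:
            pairs.append((peaks[lo], peaks[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


def split_mixture(peaks, precursor_masses, tol=0.02):
    """Build one 'virtual spectrum' (sorted m/z list) per co-isolated precursor,
    keeping only peaks that belong to a complementary pair for that precursor."""
    virtual = {}
    for mass in precursor_masses:
        pairs = complementary_pairs(peaks, mass, tol)
        virtual[mass] = sorted(mz for pair in pairs for mz in pair)
    return virtual
```

Peaks without a complementary partner are dropped in this sketch, which is one plausible reason a deconvolved spectrum can lose identifications that the unprocessed spectrum still supports.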

  3. High-Yield Site-Specific Conjugation of Fibroblast Growth Factor 1 with Monomethylauristatin E via Cysteine Flanked by Basic Residues.

    PubMed

    Lobocki, Michal; Zakrzewska, Malgorzata; Szlachcic, Anna; Krzyscik, Mateusz A; Sokolowska-Wedzina, Aleksandra; Otlewski, Jacek

    2017-07-19

    Site-specific conjugation is a leading trend in the development of protein conjugates, including antibody-drug conjugates (ADCs), suitable for targeted cancer therapy. Here, we present a very efficient strategy for specific attachment of a cytotoxic drug to fibroblast growth factor 1 (FGF1), a natural ligand of FGF receptors (FGFRs), which are over-expressed in several types of lung, breast, and gastric cancers and are therefore an attractive molecular target. Recently, we showed that FGF1 fused to monomethylauristatin E (vcMMAE) was highly cytotoxic to cells presenting FGFRs on their surface and could be used as a targeting agent alternative to an antibody. Unfortunately, conjugation via maleimide chemistry to endogenous FGF1 cysteines or a cysteine introduced at the N-terminus proceeded with low yield and led to nonhomogeneous products. To improve the conjugation, we introduced a novel Lys-Cys-Lys motif at either FGF1 terminus, which increased cysteine reactivity and allowed us to obtain an FGF1 conjugate with a defined site of conjugation and a yield exceeding 95%. Using FGFR-expressing cancer lines, we confirmed specific cytotoxicity of the obtained C-terminal FGF1-vcMMAE conjugate and its selective endocytosis as compared with FGFR1-negative cells. This simple and powerful approach relying on the introduction of a short sequence containing cysteine and positively charged amino acids could be used universally to improve the efficiency of the site-specific chemical modification of other proteins.

  4. Optimized knock-in of point mutations in zebrafish using CRISPR/Cas9.

    PubMed

    Prykhozhij, Sergey V; Fuller, Charlotte; Steele, Shelby L; Veinotte, Chansey J; Razaghi, Babak; Robitaille, Johane M; McMaster, Christopher R; Shlien, Adam; Malkin, David; Berman, Jason N

    2018-06-14

    We have optimized point mutation knock-ins into zebrafish genomic sites using clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 reagents and single-stranded oligodeoxynucleotides. The efficiency of knock-ins was assessed by a novel application of allele-specific polymerase chain reaction and confirmed by high-throughput sequencing. Anti-sense asymmetric oligo design was found to be the most successful optimization strategy. However, cut site proximity to the mutation and phosphorothioate oligo modifications also greatly improved knock-in efficiency. A previously unrecognized risk of off-target trans knock-ins was identified that we obviated through the development of a workflow for correct knock-in detection. Together these strategies greatly facilitate the study of human genetic diseases in zebrafish, with additional applicability to enhance CRISPR-based approaches in other animal model systems.

  5. Genome Sequence of Lactobacillus rhamnosus Strain CASL, an Efficient l-Lactic Acid Producer from Cheap Substrate Cassava

    PubMed Central

    Yu, Bo; Su, Fei; Wang, Limin; Zhao, Bo; Qin, Jiayang; Ma, Cuiqing; Xu, Ping; Ma, Yanhe

    2011-01-01

    Lactobacillus rhamnosus is a type of probiotic bacteria with industrial potential for l-lactic acid production. We announce the draft genome sequence of L. rhamnosus CASL (2,855,156 bp with a G+C content of 46.6%), which is an efficient producer of l-lactic acid from cheap, nonfood substrate cassava with a high production titer. PMID:22123765

  6. Sequencing small genomic targets with high efficiency and extreme accuracy

    PubMed Central

    Schmitt, Michael W.; Fox, Edward J.; Prindle, Marc J.; Reid-Bayliss, Kate S.; True, Lawrence D.; Radich, Jerald P.; Loeb, Lawrence A.

    2015-01-01

    The detection of minority variants in mixed samples demands methods for enrichment and accurate sequencing of small genomic intervals. We describe an efficient approach based on sequential rounds of hybridization with biotinylated oligonucleotides, enabling more than one-million fold enrichment of genomic regions of interest. In conjunction with error correcting double-stranded molecular tags, our approach enables the quantification of mutations in individual DNA molecules. PMID:25849638

  7. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency

    PubMed Central

    Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J.; Burnett, John C.; Zhou, Jiehua

    2016-01-01

    The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct “biased sequences” and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the “biased sequences” was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy. PMID:27652575

  8. Genotypic, Phenotypic and Clinical Validation of GeneXpert in Extra-Pulmonary and Pulmonary Tuberculosis in India

    PubMed Central

    Singh, Urvashi B.; Pandey, Pooja; Mehta, Girija; Bhatnagar, Anuj K.; Mohan, Anant; Goyal, Vinay; Ahuja, Vineet; Ramachandran, Ranjani; Sachdeva, Kuldeep S.; Samantaray, Jyotish C.

    2016-01-01

    Background: Newer molecular diagnostics have brought a paradigm shift in early diagnosis of tuberculosis [TB]. WHO recommended use of GeneXpert MTB/RIF [Xpert] for Extra-pulmonary [EP] TB; critics have since questioned its efficiency. Methods: The present study was designed to assess the performance of GeneXpert in 761 extra-pulmonary and 384 pulmonary specimens from patients clinically suspected of TB and compare with Phenotypic, Genotypic and Composite reference standards [CRS]. Results: Comparison of GeneXpert results to CRS demonstrated sensitivity of 100% and 90.68%, and specificity of 100% and 99.62%, for pulmonary and extra-pulmonary samples, respectively. On comparison with culture, sensitivity for Rifampicin [Rif] resistance detection was 87.5% and 81.82% respectively, while specificity was 100% for both pulmonary and extra-pulmonary TB. On comparison to sequencing of rpoB gene [Rif resistance determining region, RRDR], sensitivity was respectively 93.33% and 90% while specificity was 100% in both pulmonary and extra-pulmonary TB. GeneXpert assay missed 533CCG mutation in one sputum and dual mutation [517 & 519] in one pus sample, detected by sequencing. Sequencing picked dual mutation [529, 530] in a sputum sample sensitive to Rif, demonstrating that not all RRDR mutations lead to resistance. Conclusions: The current study reports observations in a patient care setting in a high burden region, from a large collection of pulmonary and extra-pulmonary samples, and puts to rest questions regarding sensitivity, specificity, detection of infrequent mutations and mutations responsible for low-level Rif resistance by GeneXpert. Improvements in the assay could offer further improvement in sensitivity of detection in different patient samples; nevertheless it may be difficult to improve sensitivity of Rif resistance detection if only one gene is targeted. Assay specificity was high both for TB detection and Rif resistance detection. 
Despite a few misses, the assay offers a major boost to early diagnosis of TB and MDR-TB, in difficult-to-diagnose pauci-bacillary TB. PMID:26894283
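
The sensitivity and specificity figures quoted in this record follow the usual definitions against a reference standard. The sketch below shows the arithmetic only; the function name is hypothetical and the counts in the usage example are illustrative, since the record does not report the underlying tables.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions.

    tp/fn: reference-positive samples the assay detected / missed.
    tn/fp: reference-negative samples the assay called negative / positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative sample")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Illustrative counts only (not taken from the study): 9 of 11 resistant
# isolates detected, no false positives among 50 susceptible isolates.
sens, spec = sensitivity_specificity(tp=9, fn=2, tn=50, fp=0)
print(f"sensitivity {sens:.2%}, specificity {spec:.2%}")
```

With small denominators, a single missed sample shifts sensitivity by several percentage points, which is worth keeping in mind when comparing the resistance-detection figures across sample types.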

  9. Chromosomal location and genetic mapping of the mismatch repair gene homologs MSH2, MSH3, and MSH6 in rye and wheat

    PubMed

    Korzun; Borner; Siebert; Malyshev; Hilpert; Kunze; Puchta

    1999-12-01

    The efficiency of homeologous recombination is influenced by mismatch repair genes in bacteria, yeast, and mammals. To elucidate a possible role of these genes in homeologous pairing and cross-compatibility in plants, gene probes of wheat (Triticum aestivum) specific for the mismatch repair gene homologues MSH2, MSH3, and MSH6 were used to map them to their genomic positions in rye (Secale cereale). Whereas MSH2 was mapped to the short arm of chromosome 1R, MSH3 was mapped to the long arm of chromosome 2R and MSH6 to the long arm of chromosome 5R. Southern blots with nullisomic-tetrasomic (NT) lines of wheat indicated the presence of the sequences on the respective homeologous group of wheat chromosomes. Additionally, an MSH6-specific homologue could also be detected on homeologous group 3 of wheat. However, in the well-known, highly homeologous pairing wheat mutant ph1b the MSH6-specific sequence is not within the deleted part of chromosome 5BL, indicating that the pairing phenotype is not due to a loss of one of the mismatch repair genes tested.

  10. In vitro folding of inclusion body proteins.

    PubMed

    Rudolph, R; Lilie, H

    1996-01-01

    Insoluble, inactive inclusion bodies are frequently formed upon recombinant protein production in transformed microorganisms. These inclusion bodies, which contain the recombinant protein in a highly enriched form, can be isolated by solid/liquid separation. After solubilization, native proteins can be generated from the inactive material by using in vitro folding techniques. New folding procedures have been developed for efficient in vitro reconstitution of complex hydrophobic, multidomain, oligomeric, or highly disulfide-bonded proteins. These protocols take into account process parameters such as protein concentration, catalysis of disulfide bond formation, temperature, pH, and ionic strength, as well as specific solvent ingredients that reduce unproductive side reactions. Modification of the protein sequence has been exploited to improve in vitro folding.

  11. Design strategy of pH-sensitive triblock copolymer micelles for efficient cellular uptake by computer simulations

    NASA Astrophysics Data System (ADS)

    Xia, Qiang-sheng; Ding, Hong-ming; Ma, Yu-qiang

    2018-03-01

    Efficient delivery of nanoparticles into specific cell interiors is of great importance in biomedicine. Recently, the pH-responsive micelle has emerged as a potential nanocarrier for this purpose, since there are clear pH differences between normal tissues and tumors. Herein, by using dissipative particle dynamics simulation, we investigate the interaction of the pH-sensitive triblock copolymer micelles composed of ligand (L), hydrophobic block (C) and polyelectrolyte block (P) with the cell membrane. It is found that the structural rearrangement of the micelle can facilitate its penetration into the lower leaflet of the bilayer. However, when the ligand-receptor specific interaction is weak, the micelles may just fuse with the upper leaflet of the bilayer. Moreover, the ionization degree of the polyelectrolyte block and the length of the hydrophobic block also play a vital role in the penetration efficiency. Further, when the sequence of the L, P, C beads in the copolymers is changed, the translocation pathways of the micelles may change from direct penetration to Janus engulfment. The present study reveals the relationship between the molecular structure of the copolymer and the uptake of the pH-sensitive micelles, which may give some significant insights into the experimental design of responsive micellar nanocarriers for highly efficient cellular delivery.

  12. Mapping Frequency-Specific Tone Predictions in the Human Auditory Cortex at High Spatial Resolution.

    PubMed

    Berlot, Eva; Formisano, Elia; De Martino, Federico

    2018-05-23

    Auditory inputs reaching our ears are often incomplete, but our brains nevertheless transform them into rich and complete perceptual phenomena such as meaningful conversations or pleasurable music. It has been hypothesized that our brains extract regularities in inputs, which enables us to predict the upcoming stimuli, leading to efficient sensory processing. However, it is unclear whether tone predictions are encoded with similar specificity as perceived signals. Here, we used high-field fMRI to investigate whether human auditory regions encode one of the most defining characteristics of auditory perception: the frequency of predicted tones. Two pairs of tone sequences were presented in ascending or descending directions, with the last tone omitted in half of the trials. Every pair of incomplete sequences contained identical sounds, but was associated with different expectations about the last tone (a high- or low-frequency target). This allowed us to disambiguate predictive signaling from sensory-driven processing. We recorded fMRI responses from eight female participants during passive listening to complete and incomplete sequences. Inspection of specificity and spatial patterns of responses revealed that target frequencies were encoded similarly during their presentations, as well as during omissions, suggesting frequency-specific encoding of predicted tones in the auditory cortex (AC). Importantly, frequency specificity of predictive signaling was observed already at the earliest levels of auditory cortical hierarchy: in the primary AC. Our findings provide evidence for content-specific predictive processing starting at the earliest cortical levels. SIGNIFICANCE STATEMENT Given the abundance of sensory information around us in any given moment, it has been proposed that our brain uses contextual information to prioritize and form predictions about incoming signals. 
However, there remains a surprising lack of understanding of the specificity and content of such prediction signaling; for example, whether a predicted tone is encoded with similar specificity as a perceived tone. Here, we show that early auditory regions encode the frequency of a tone that is predicted yet omitted. Our findings contribute to the understanding of how expectations shape sound processing in the human auditory cortex and provide further insights into how contextual information influences computations in neuronal circuits.

  13. Estimating the efficiency of fish cross-species cDNA microarray hybridization.

    PubMed

    Cohen, Raphael; Chalifa-Caspi, Vered; Williams, Timothy D; Auslander, Meirav; George, Stephen G; Chipman, James K; Tom, Moshe

    2007-01-01

    Using an available cross-species cDNA microarray is advantageous for examining multigene expression patterns in non-model organisms, saving the need for construction of species-specific arrays. The aim of the present study was to estimate relative efficiency of cross-species hybridizations across bony fishes, using bioinformatics tools. The methodology may serve also as a model for similar evaluations in other taxa. The theoretical evaluation was done by substituting comparative whole-transcriptome sequence similarity information into the thermodynamic hybridization equation. Complementary DNA sequence assemblages of nine fish species belonging to common families or suborders and distributed across the bony fish taxonomic branch were selected for transcriptome-wise comparisons. Actual cross-species hybridizations among fish of different taxonomic distances were used to validate and eventually to calibrate the theoretically computed relative efficiencies.

  14. The hidden genomic landscape of acute myeloid leukemia: subclonal structure revealed by undetected mutations

    PubMed Central

    Bodini, Margherita; Ronchini, Chiara; Giacò, Luciano; Russo, Anna; Melloni, Giorgio E. M.; Luzi, Lucilla; Sardella, Domenico; Volorio, Sara; Hasan, Syed K.; Ottone, Tiziana; Lavorgna, Serena; Lo-Coco, Francesco; Candoni, Anna; Fanin, Renato; Toffoletti, Eleonora; Iacobucci, Ilaria; Martinelli, Giovanni; Cignetti, Alessandro; Tarella, Corrado; Bernard, Loris; Pelicci, Pier Giuseppe

    2015-01-01

    The analyses carried out using 2 different bioinformatics pipelines (SomaticSniper and MuTect) on the same set of genomic data from 133 acute myeloid leukemia (AML) patients, sequenced inside the Cancer Genome Atlas project, gave discrepant results. We subsequently tested these 2 variant-calling pipelines on 20 leukemia samples from our series (19 primary AMLs and 1 secondary AML). By validating many of the predicted somatic variants (variant allele frequencies ranging from 100% to 5%), we observed significantly different calling efficiencies. In particular, despite relatively high specificity, sensitivity was poor in both pipelines resulting in a high rate of false negatives. Our findings raise the possibility that landscapes of AML genomes might be more complex than previously reported and characterized by the presence of hundreds of genes mutated at low variant allele frequency, suggesting that the application of genome sequencing to the clinic requires a careful and critical evaluation. We think that improvements in technology and workflow standardization, through the generation of clear experimental and bioinformatics guidelines, are fundamental to translate the use of next-generation sequencing from research to the clinic and to transform genomic information into better diagnosis and outcomes for the patient. PMID:25499761

  15. High Resolution Melt analysis for mutation screening in PKD1 and PKD2

    PubMed Central

    2011-01-01

    Background Autosomal dominant polycystic kidney disease (ADPKD) is the most common hereditary kidney disorder. It is characterized by focal development and progressive enlargement of renal cysts leading to end-stage renal disease. PKD1 and PKD2 have been implicated in ADPKD pathogenesis but genetic features and the size of PKD1 make genetic diagnosis tedious. Methods We aim to prove that high resolution melt analysis (HRM), a recent technique in molecular biology, can facilitate molecular diagnosis of ADPKD. We screened for mutations in PKD1 and PKD2 with HRM in 37 unrelated patients with ADPKD. Results We identified 440 sequence variants in the 37 patients. One hundred and thirty-eight of these were distinct. We found 28 pathogenic mutations (25 in PKD1 and 3 in PKD2) in 28 different patients, a diagnosis rate of 75%, consistent with the mean diagnosis rate reported in the literature for direct sequencing. We describe 52 new sequence variants in PKD1 and two in PKD2. Conclusion HRM analysis is a sensitive and specific method for molecular diagnosis of ADPKD. HRM analysis is also inexpensive and time-saving. Thus, this method is efficient and might be used for mutation pre-screening in ADPKD genes. PMID:22008521

  16. Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins.

    PubMed

    Fujita, Toshitsugu; Yuno, Miyuki; Fujii, Hodaka

    2016-04-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) system is widely used for various biological applications, including genome editing. We developed engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) using CRISPR to isolate target genomic regions from cells for their biochemical characterization. In this study, we developed 'in vitro enChIP' using recombinant CRISPR ribonucleoproteins (RNPs) to isolate target genomic regions. in vitro enChIP has the great advantage over conventional enChIP of not requiring expression of CRISPR complexes in cells. We first showed that in vitro enChIP using recombinant CRISPR RNPs can be used to isolate target DNA from mixtures of purified DNA in a sequence-specific manner. In addition, we showed that this technology can be used to efficiently isolate target genomic regions, while retaining their intracellular molecular interactions, with negligible contamination from irrelevant genomic regions. Thus, in vitro enChIP technology is of potential use for sequence-specific isolation of DNA, as well as for identification of molecules interacting with genomic regions of interest in vivo in combination with downstream analysis. © 2016 The Authors. Genes to Cells published by Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.

  17. Detection and Tracking of NY-ESO-1-Specific CD8+ T Cells by High-Throughput T Cell Receptor β (TCRB) Gene Rearrangements Sequencing in a Peptide-Vaccinated Patient.

    PubMed

    Miyai, Manami; Eikawa, Shingo; Hosoi, Akihiro; Iino, Tamaki; Matsushita, Hirokazu; Isobe, Midori; Uenaka, Akiko; Udono, Heiichiro; Nakajima, Jun; Nakayama, Eiichi; Kakimi, Kazuhiro

    2015-01-01

    Comprehensive immunological evaluation is crucial for monitoring patients undergoing antigen-specific cancer immunotherapy. The identification and quantification of T cell responses is most important for the further development of such therapies. Using well-characterized clinical samples from a high responder patient (TK-f01) in an NY-ESO-1f peptide vaccine study, we performed high-throughput T cell receptor β-chain (TCRB) gene next generation sequencing (NGS) to monitor the frequency of NY-ESO-1-specific CD8+ T cells. We compared these results with those of conventional immunological assays, such as IFN-γ capture, tetramer binding and limiting dilution clonality assays. We sequenced human TCRB complementarity-determining region 3 (CDR3) rearrangements of two NY-ESO-1f-specific CD8+ T cell clones, 6-8L and 2F6, as well as PBMCs over the course of peptide vaccination. Clone 6-8L possessed the TCRB CDR3 gene TCRBV11-03*01 and BJ02-01*01 with amino acid sequence CASSLRGNEQFF, whereas 2F6 possessed TCRBV05-08*01 and BJ02-04*01 (CASSLVGTNIQYF). Using these two sequences as models, we evaluated the frequency of NY-ESO-1-specific CD8+ T cells in PBMCs ex vivo. The 6-8L CDR3 sequence was the second most frequent in PBMC and was present at high frequency (0.7133%) even prior to vaccination, and sustained over the course of vaccination. Despite a marked expansion of NY-ESO-1-specific CD8+ T cells detected from the first through 6th vaccination by tetramer staining and IFN-γ capture assays, as evaluated by CDR3 sequencing the frequency did not increase with increasing rounds of peptide vaccination. By clonal analysis using 12 day in vitro stimulation, the frequency of B*52:01-restricted NY-ESO-1f peptide-specific CD8+ T cells in PBMCs was estimated as only 0.0023%, far below the 0.7133% by NGS sequencing. Thus, assays requiring in vitro stimulation might be underestimating the frequency of clones with lower proliferation potential. 
High-throughput TCRB sequencing using NGS can potentially better estimate the actual frequency of antigen-specific T cells and thus provide more accurate patient monitoring.

  18. Detection and Tracking of NY-ESO-1-Specific CD8+ T Cells by High-Throughput T Cell Receptor β (TCRB) Gene Rearrangements Sequencing in a Peptide-Vaccinated Patient

    PubMed Central

    Miyai, Manami; Eikawa, Shingo; Hosoi, Akihiro; Iino, Tamaki; Matsushita, Hirokazu; Isobe, Midori; Uenaka, Akiko; Udono, Heiichiro; Nakajima, Jun; Nakayama, Eiichi; Kakimi, Kazuhiro

    2015-01-01

    Comprehensive immunological evaluation is crucial for monitoring patients undergoing antigen-specific cancer immunotherapy. The identification and quantification of T cell responses is most important for the further development of such therapies. Using well-characterized clinical samples from a high responder patient (TK-f01) in an NY-ESO-1f peptide vaccine study, we performed high-throughput T cell receptor β-chain (TCRB) gene next generation sequencing (NGS) to monitor the frequency of NY-ESO-1-specific CD8+ T cells. We compared these results with those of conventional immunological assays, such as IFN-γ capture, tetramer binding and limiting dilution clonality assays. We sequenced human TCRB complementarity-determining region 3 (CDR3) rearrangements of two NY-ESO-1f-specific CD8+ T cell clones, 6-8L and 2F6, as well as PBMCs over the course of peptide vaccination. Clone 6-8L possessed the TCRB CDR3 gene TCRBV11-03*01 and BJ02-01*01 with amino acid sequence CASSLRGNEQFF, whereas 2F6 possessed TCRBV05-08*01 and BJ02-04*01 (CASSLVGTNIQYF). Using these two sequences as models, we evaluated the frequency of NY-ESO-1-specific CD8+ T cells in PBMCs ex vivo. The 6-8L CDR3 sequence was the second most frequent in PBMC and was present at high frequency (0.7133%) even prior to vaccination, and sustained over the course of vaccination. Despite a marked expansion of NY-ESO-1-specific CD8+ T cells detected from the first through 6th vaccination by tetramer staining and IFN-γ capture assays, as evaluated by CDR3 sequencing the frequency did not increase with increasing rounds of peptide vaccination. By clonal analysis using 12 day in vitro stimulation, the frequency of B*52:01-restricted NY-ESO-1f peptide-specific CD8+ T cells in PBMCs was estimated as only 0.0023%, far below the 0.7133% by NGS sequencing. Thus, assays requiring in vitro stimulation might be underestimating the frequency of clones with lower proliferation potential. 
High-throughput TCRB sequencing using NGS can potentially better estimate the actual frequency of antigen-specific T cells and thus provide more accurate patient monitoring. PMID:26291626
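The clonotype frequencies quoted in this abstract are proportions of sequencing reads. A minimal sketch of that bookkeeping follows, assuming each read has already been reduced to a CDR3 amino acid string. The function name, the read counts, and the third "placeholder" clonotype are illustrative only and are not data from the study. The last lines compare the two frequency estimates reported in the abstract.

```python
from collections import Counter


def clonotype_frequencies(cdr3_reads):
    """Return (cdr3, read_count, percent_of_reads) tuples, most frequent first."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    return [(seq, n, 100.0 * n / total) for seq, n in counts.most_common()]


# Toy input: the two CDR3 sequences named in the abstract plus a made-up
# placeholder clonotype. The read counts are invented for illustration.
toy_reads = (
    ["CASSLRGNEQFF"] * 7        # clone 6-8L
    + ["CASSLVGTNIQYF"] * 2     # clone 2F6
    + ["PLACEHOLDER_CDR3"] * 1  # hypothetical unrelated clonotype
)

for seq, n, pct in clonotype_frequencies(toy_reads):
    print(f"{seq}\t{n}\t{pct:.1f}%")

# Ratio of the two estimates reported in the abstract for the same clone:
# 0.7133% by TCRB sequencing versus 0.0023% by in vitro clonal analysis.
fold_difference = 0.7133 / 0.0023
print(f"sequencing / clonal-analysis estimate: {fold_difference:.0f}-fold")
```

On the abstract's own figures the sequencing estimate is roughly 300-fold higher than the stimulation-based estimate, which is the gap the authors attribute to clones with low proliferation potential.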

  19. Development of an efficient entire-capsid-coding-region amplification method for direct detection of poliovirus from stool extracts.

    PubMed

    Arita, Minetaro; Kilpatrick, David R; Nakamura, Tomofumi; Burns, Cara C; Bukbuk, David; Oderinde, Soji B; Oberste, M Steven; Kew, Olen M; Pallansch, Mark A; Shimizu, Hiroyuki

    2015-01-01

    Laboratory diagnosis has played a critical role in the Global Polio Eradication Initiative since 1988, by isolating and identifying poliovirus (PV) from stool specimens by using cell culture as a highly sensitive system to detect PV. In the present study, we aimed to develop a molecular method to detect PV directly from stool extracts, with a high efficiency comparable to that of cell culture. We developed a method to efficiently amplify the entire capsid coding region of human enteroviruses (EVs) including PV. cDNAs of the entire capsid coding region (3.9 kb) were obtained from as few as 50 copies of PV genomes. PV was detected from the cDNAs with an improved PV-specific real-time reverse transcription-PCR system and nucleotide sequence analysis of the VP1 coding region. For assay validation, we analyzed 84 stool extracts that were positive for PV in cell culture and detected PV genomes from 100% of the extracts (84/84 samples) with this method in combination with a PV-specific extraction method. PV could be detected in 2/4 stool extract samples that were negative for PV in cell culture. In PV-positive samples, EV species C viruses were also detected with high frequency (27% [23/86 samples]). This method would be useful for direct detection of PV from stool extracts without using cell culture. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  20. A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety

    PubMed Central

    Cartwright, Dustin A.; Cestaro, Alessandro; Pruss, Dmitry; Pindo, Massimo; FitzGerald, Lisa M.; Vezzulli, Silvia; Reid, Julia; Malacarne, Giulia; Iliev, Diana; Coppola, Giuseppina; Wardell, Bryan; Micheletti, Diego; Macalma, Teresita; Facci, Marco; Mitchell, Jeff T.; Perazzolli, Michele; Eldredge, Glenn; Gatto, Pamela; Oyzerski, Rozan; Moretto, Marco; Gutin, Natalia; Stefanini, Marco; Chen, Yang; Segala, Cinzia; Davenport, Christine; Demattè, Lorenzo; Mraz, Amy; Battilana, Juri; Stormo, Keith; Costa, Fabrizio; Tao, Quanzhou; Si-Ammour, Azeddine; Harkins, Tim; Lackey, Angie; Perbost, Clotilde; Taillon, Bruce; Stella, Alessandra; Solovyev, Victor; Fawcett, Jeffrey A.; Sterck, Lieven; Vandepoele, Klaas; Grando, Stella M.; Toppo, Stefano; Moser, Claudio; Lanchbury, Jerry; Bogden, Robert; Skolnick, Mark; Sgaramella, Vittorio; Bhatnagar, Satish K.; Fontana, Paolo; Gutin, Alexander; Van de Peer, Yves; Salamini, Francesco; Viola, Roberto

    2007-01-01

    Background Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, which made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape. PMID:18094749

  1. Progressive engineering of a homing endonuclease genome editing reagent for the murine X-linked immunodeficiency locus

    PubMed Central

    Wang, Yupeng; Khan, Iram F.; Boissel, Sandrine; Jarjour, Jordan; Pangallo, Joseph; Thyme, Summer; Baker, David; Scharenberg, Andrew M.; Rawlings, David J.

    2014-01-01

    LAGLIDADG homing endonucleases (LHEs) are compact endonucleases with 20–22 bp recognition sites, and thus are ideal scaffolds for engineering site-specific DNA cleavage enzymes for genome editing applications. Here, we describe a general approach to LHE engineering that combines rational design with directed evolution, using a yeast surface display high-throughput cleavage selection. This approach was employed to alter the binding and cleavage specificity of the I-AniI LHE to recognize a mutation in the mouse Bruton tyrosine kinase (Btk) gene causative for mouse X-linked immunodeficiency (XID), a model of human X-linked agammaglobulinemia (XLA). The required re-targeting of I-AniI involved progressive resculpting of the DNA contact interface to accommodate nine base differences from the native cleavage sequence. The enzyme emerging from the progressive engineering process was specific for the XID mutant allele versus the wild-type (WT) allele, and exhibited activity equivalent to WT I-AniI in vitro and in cellulo reporter assays. Fusion of the enzyme to a site-specific DNA binding domain of transcription activator-like effector (TALE) resulted in a further enhancement of gene editing efficiency. These results illustrate the potential of LHE enzymes as specific and efficient tools for therapeutic genome engineering. PMID:24682825

  2. ALMA constraints on star-forming gas in a prototypical z = 1.5 clumpy galaxy: the dearth of CO(5-4) emission from UV-bright clumps

    NASA Astrophysics Data System (ADS)

    Cibinel, A.; Daddi, E.; Bournaud, F.; Sargent, M. T.; le Floc'h, E.; Magdis, G. E.; Pannella, M.; Rujopakarn, W.; Juneau, S.; Zanella, A.; Duc, P.-A.; Oesch, P. A.; Elbaz, D.; Jagannathan, P.; Nyland, K.; Wang, T.

    2017-08-01

    We present deep ALMA CO(5-4) observations of a main-sequence, clumpy galaxy at z = 1.5 in the HUDF. Thanks to the ~0.5 arcsec resolution of the ALMA data, we can link stellar population properties to the CO(5-4) emission on scales of a few kiloparsec. We detect strong CO(5-4) emission from the nuclear region of the galaxy, consistent with the observed L_IR-L'_CO(5-4) correlation and indicating ongoing nuclear star formation. The CO(5-4) gas component appears more concentrated than other star formation tracers or the dust distribution in this galaxy. We discuss possible implications of this difference in terms of star formation efficiency and mass build-up at the galaxy centre. Conversely, we do not detect any CO(5-4) emission from the UV-bright clumps. This might imply that clumps have a high star formation efficiency (although they do not display unusually high specific star formation rates) and are not entirely gas dominated, with gas fractions no larger than that of their host galaxy (~50 per cent). Stellar feedback and disc instability torques funnelling gas towards the galaxy centre could contribute to the relatively low gas content. Alternatively, clumps could fall in a more standard star formation efficiency regime if their actual star formation rates are lower than generally assumed. We find that clump star formation rates derived with several different, plausible methods can vary by up to an order of magnitude. The lowest estimates would be compatible with a CO(5-4) non-detection even for main-sequence like values of star formation efficiency and gas content.

  3. The immediate upstream region of the 5′-UTR from the AUG start codon has a pronounced effect on the translational efficiency in Arabidopsis thaliana

    PubMed Central

    Kim, Younghyun; Lee, Goeun; Jeon, Eunhyun; Sohn, Eun ju; Lee, Yongjik; Kang, Hyangju; Lee, Dong wook; Kim, Dae Heon; Hwang, Inhwan

    2014-01-01

    The nucleotide sequence around the translational initiation site is an important cis-acting element for post-transcriptional regulation. However, it has not been fully understood how the sequence context at the 5′-untranslated region (5′-UTR) affects the translational efficiency of individual mRNAs. In this study, we provide evidence that the 5′-UTRs of Arabidopsis genes showing a great difference in the nucleotide sequence vary greatly in translational efficiency with more than a 200-fold difference. Of the four types of nucleotides, the A residue was the most favourable nucleotide from positions −1 to −21 of the 5′-UTRs in Arabidopsis genes. In particular, the A residue in the 5′-UTR from positions −1 to −5 was required for a high-level translational efficiency. In contrast, the T residue in the 5′-UTR from positions −1 to −5 was the least favourable nucleotide in translational efficiency. Furthermore, the effect of the sequence context in the −1 to −21 region of the 5′-UTR was conserved in different plant species. Based on these observations, we propose that the sequence context immediately upstream of the AUG initiation codon plays a crucial role in determining the translational efficiency of plant genes. PMID:24084084
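The positional claims in this abstract (A favoured and T disfavoured at positions −1 to −5, with an effect extending to −21) amount to measuring base composition in a window just upstream of the AUG. A minimal sketch of that measurement follows. It assumes the 5′-UTR is supplied 5′ to 3′ so that its last character is position −1. The function name and the example sequence are hypothetical and are not taken from the study.

```python
def upstream_base_fraction(utr, base="A", nearest=-1, farthest=-5):
    """Fraction of `base` in positions nearest..farthest upstream of the AUG.

    `utr` is the 5'-UTR written 5'->3' and ending immediately before the
    start codon, so utr[-1] is position -1. RNA 'U' is treated as 'T'.
    """
    utr = utr.upper().replace("U", "T")
    base = base.upper().replace("U", "T")
    if not (farthest <= nearest <= -1):
        raise ValueError("expected farthest <= nearest <= -1")
    if len(utr) < -farthest:
        raise ValueError("UTR is shorter than the requested window")
    # Slice from position `farthest` up to and including position `nearest`.
    window = utr[len(utr) + farthest: len(utr) + nearest + 1]
    return window.count(base) / len(window)


# Hypothetical 10-nt UTR: an A-rich stretch directly upstream of the AUG.
example_utr = "GGCCTAAAAA"
print(upstream_base_fraction(example_utr, "A", -1, -5))   # A content at -1..-5
print(upstream_base_fraction(example_utr, "T", -1, -5))   # T content at -1..-5
print(upstream_base_fraction(example_utr, "A", -1, -10))  # A content at -1..-10
```

Passing `farthest=-21` gives the wider window discussed in the abstract, provided the UTR is at least 21 nt long.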

  4. Diagnosis of autosomal dominant polycystic kidney disease using efficient PKD1 and PKD2 targeted next-generation sequencing.

    PubMed

    Trujillano, Daniel; Bullich, Gemma; Ossowski, Stephan; Ballarín, José; Torra, Roser; Estivill, Xavier; Ars, Elisabet

    2014-09-01

    Molecular diagnostics of autosomal dominant polycystic kidney disease (ADPKD) relies on mutation screening of PKD1 and PKD2, which is complicated by extensive allelic heterogeneity and the presence of six highly homologous sequences of PKD1. To date, specific sequencing of PKD1 requires laborious long-range amplifications. The high cost and long turnaround time of PKD1 and PKD2 mutation analysis using conventional techniques limits its widespread application in clinical settings. We performed targeted next-generation sequencing (NGS) of PKD1 and PKD2. Pooled barcoded DNA patient libraries were enriched by in-solution hybridization with PKD1 and PKD2 capture probes. Bioinformatics analysis was performed using an in-house developed pipeline. We validated the assay in a cohort of 36 patients with previously known PKD1 and PKD2 mutations and five control individuals. Then, we used the same assay and bioinformatics analysis in a discovery cohort of 12 uncharacterized patients. We detected 35 out of 36 known definitely, highly likely, and likely pathogenic mutations in the validation cohort, including two large deletions. In the discovery cohort, we detected 11 different pathogenic mutations in 10 out of 12 patients. This study demonstrates that laborious long-range PCRs of the repeated PKD1 region can be avoided by in-solution enrichment of PKD1 and PKD2 and NGS. This strategy significantly reduces the cost and time for simultaneous PKD1 and PKD2 sequence analysis, facilitating routine genetic diagnostics of ADPKD.

  5. Measurements of weak interactions between truncated substrates and a hammerhead ribozyme by competitive kinetic analyses: implications for the design of new and efficient ribozymes with high sequence specificity

    PubMed Central

    Kasai, Yasuhiro; Shizuku, Hideki; Takagi, Yasuomi; Warashina, Masaki; Taira, Kazunari

    2002-01-01

    Exploitation of ribozymes in a practical setting requires high catalytic activity and strong specificity. The hammerhead ribozyme R32 has considerable potential in this regard since it has very high catalytic activity. In this study, we have examined how R32 recognizes and cleaves a specific substrate, focusing on the mechanism behind the specificity. Comparing rates of cleavage of a substrate in a mixture that included the correct substrate and various substrates with point mutations, we found that R32 cleaved the correct substrate specifically and at a high rate. To clarify the source of this strong specificity, we quantified the weak interactions between R32 and various truncated substrates, using truncated substrates as competitive inhibitors since they were not readily cleaved during kinetic measurements of cleavage of the correct substrate, S11. We found that the strong specificity of the cleavage reaction was due to a closed form of R32 with a hairpin structure. The self-complementary structure within R32 enabled the ribozyme to discriminate between the correct substrate and a mismatched substrate. Since this hairpin motif did not increase the Km (it did not inhibit the binding interaction) or decrease the kcat (it did not decrease the cleavage rate), this kind of hairpin structure might be useful for the design of new ribozymes with strong specificity and high activity. PMID:12034825

  6. High-efficiency/CRI/color stability warm white organic light-emitting diodes by incorporating ultrathin phosphorescence layers in a blue fluorescence layer

    NASA Astrophysics Data System (ADS)

    Miao, Yanqin; Wang, Kexiang; Zhao, Bo; Gao, Long; Tao, Peng; Liu, Xuguang; Hao, Yuying; Wang, Hua; Xu, Bingshe; Zhu, Furong

    2018-01-01

    By incorporating ultrathin (<0.1 nm) green, yellow, and red phosphorescence layers with different sequence arrangements in a blue fluorescence layer, four unique and simplified fluorescence/phosphorescence (F/P) hybrid, white organic light-emitting diodes (WOLEDs) were obtained. All four devices realize good warm white light emission, with high color rendering index (CRI) of >80, low correlated color temperature of <3600 K, and high color stability at a wide voltage range of 5 V-9 V. These hybrid WOLEDs also reveal high forward-viewing external quantum efficiencies (EQE) of 17.82%-19.34%, which are close to the theoretical value of 20%, indicating an almost complete exciton harvesting. In addition, the electroluminescence spectra of the hybrid WOLEDs can be easily improved by only changing the incorporating sequence of the ultrathin phosphorescence layers without device efficiency loss. For example, the hybrid WOLED with an incorporation sequence of ultrathin red/yellow/green phosphorescence layers exhibits an ultra-high CRI of 96 and a high EQE of 19.34%. To the best of our knowledge, this is the first WOLED with good tradeoff among device efficiency, CRI, and color stability. The introduction of ultrathin (<0.1 nm) phosphorescence layers can also greatly reduce the consumption of phosphorescent emitters as well as simplify device structures and fabrication process, thus leading to low cost. Such a finding is very meaningful for the potential commercialization of hybrid WOLEDs.

  7. Screening and identification of male-specific DNA fragments in common carps Cyprinus carpio using suppression subtractive hybridization.

    PubMed

    Chen, J J; Du, Q Y; Yue, Y Y; Dang, B J; Chang, Z J

    2010-08-01

    In this study, a sex subtractive genomic DNA library was constructed using suppression subtractive hybridization (SSH) between male and female Cyprinus carpio. Twenty-two clones with distinguishable hybridization signals were selected and sequenced. Specific primers were designed based on the sequence data. Those primers were then used to amplify the sex-specific fragments from the genomic DNA of male and female carp. The amplified fragments from two clones showed specificity to males but not to females, and were named Ccmf2 [387 base pairs (bp)] and Ccmf3 (183 bp), respectively. The sex-specific pattern was analysed in a total of 40 individuals from three other C. carpio stocks and grass carp Ctenopharyngodon idella using Ccmf2 and Ccmf3 as dot-blotting probes. The results revealed that molecular diversity exists on the Y chromosome of C. carpio. No hybridization signals, however, were detected from individuals of C. idella, suggesting that the two sequences are specific to C. carpio. No significant homologous sequences of Ccmf2 and Ccmf3 were found in GenBank. Therefore, the results were interpreted as indicating that Ccmf2 and Ccmf3 are two novel male-specific sequences, and both fragments could be used as markers to rapidly and accurately identify the genetic sex of part of C. carpio. This may provide a very efficient selective tool for practically breeding monosex female populations in aquacultural production.

  8. Complete genome sequence, metabolic model construction and phenotypic characterization of Geobacillus LC300, an extremely thermophilic, fast growing, xylose-utilizing bacterium.

    PubMed

    Cordova, Lauren T; Long, Christopher P; Venkataramanan, Keerthi P; Antoniewicz, Maciek R

    2015-11-01

    We have isolated a new extremely thermophilic fast-growing Geobacillus strain that can efficiently utilize xylose, glucose, mannose and galactose for cell growth. When grown aerobically at 72 °C, Geobacillus LC300 has a growth rate of 2.15 h(-1) on glucose and 1.52 h(-1) on xylose (doubling time less than 30 min). The corresponding specific glucose and xylose utilization rates are 5.55 g/g/h and 5.24 g/g/h, respectively. As such, Geobacillus LC300 grows 3-times faster than E. coli on glucose and xylose, and has a specific xylose utilization rate that is 3-times higher than the best metabolically engineered organism to date. To gain more insight into the metabolism of Geobacillus LC300 its genome was sequenced using PacBio's RS II single-molecule real-time (SMRT) sequencing platform and annotated using the RAST server. Based on the genome annotation and the measured biomass composition a core metabolic network model was constructed. To further demonstrate the biotechnological potential of this organism, Geobacillus LC300 was grown to high cell-densities in a fed-batch culture, where cells maintained a high xylose utilization rate under low dissolved oxygen concentrations. All of these characteristics make Geobacillus LC300 an attractive host for future metabolic engineering and biotechnology applications. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
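The parenthetical "doubling time less than 30 min" follows from the reported specific growth rates through the standard exponential-growth relation t_d = ln(2) / μ. The short sketch below checks that arithmetic. The function name is illustrative, and the only inputs are the two growth rates quoted in the abstract.

```python
import math


def doubling_time_minutes(growth_rate_per_hour):
    """Doubling time in minutes for exponential growth at rate mu (1/h)."""
    if growth_rate_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / growth_rate_per_hour * 60.0


# Specific growth rates reported in the abstract (per hour, 72 °C, aerobic).
reported_rates = {"glucose": 2.15, "xylose": 1.52}

for substrate, mu in reported_rates.items():
    print(f"{substrate}: {doubling_time_minutes(mu):.1f} min")
```

Both substrates come out under 30 minutes (about 19 min on glucose and 27 min on xylose), consistent with the abstract's statement.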

  9. Flanking sequence determination and specific PCR identification of transgenic wheat B102-1-2.

    PubMed

    Cao, Jijuan; Xu, Junyi; Zhao, Tongtong; Cao, Dongmei; Huang, Xin; Zhang, Piqiao; Luan, Fengxia

    2014-01-01

    The exogenous fragment sequence and flanking sequence between the exogenous fragment and recombinant chromosome of transgenic wheat B102-1-2 were successfully acquired using genome walking technology. The newly acquired exogenous fragment encoded the full-length sequence of transformed genes with transformed plasmid and corresponding functional genes including ubi, vector pBANF-bar, vector pUbiGUSPlus, vector HSP, reporter vector pUbiGUSPlus, promoter ubiquitin, and coli DH1. A specific polymerase chain reaction (PCR) identification method for transgenic wheat B102-1-2 was established on the basis of designed primers according to flanking sequence. This established specific PCR strategy was validated by using transgenic wheat, transgenic corn, transgenic soybean, transgenic rice, and non-transgenic wheat. A specifically amplified target band was observed only in transgenic wheat B102-1-2. Therefore, this method is characterized by high specificity, high reproducibility, rapid identification, and excellent accuracy for the identification of transgenic wheat B102-1-2.

  10. A pH-sensitive heparin-binding sequence from Baculovirus gp64 protein is important for binding to mammalian cells but not to Sf9 insect cells.

    PubMed

    Wu, Chunxiao; Wang, Shu

    2012-01-01

    Binding to heparan sulfate is essential for baculovirus transduction of mammalian cells. Our previous study shows that gp64, the major glycoprotein on the virus surface, binds to heparin in a pH-dependent way, with a stronger binding at pH 6.2 than at 7.4. Using fluorescently labeled peptides, we mapped the pH-dependent heparin-binding sequence of gp64 to a 22-amino-acid region between residues 271 and 292. Binding of this region to the cell surface was also pH dependent, and peptides containing this sequence could efficiently inhibit baculovirus transduction of mammalian cells at pH 6.2. When the heparin-binding peptide was immobilized onto the bead surface to mimic the high local concentration of gp64 on the virus surface, the peptide-coated magnetic beads could efficiently pull down cells expressing heparan sulfate but not cells pretreated with heparinase or cells not expressing heparan sulfate. Interestingly, although this heparin-binding function is essential for baculovirus transduction of mammalian cells, it is dispensable for infection of Sf9 insect cells. Virus infectivity on Sf9 cells was not reduced by the presence of heparin or the identified heparin-binding peptide, even though the peptide could bind to Sf9 cell surface and be efficiently internalized. Thus, our data suggest that, depending on the availability of the target molecules on the cell surface, baculoviruses can use two different methods, electrostatic interaction with heparan sulfate and more specific receptor binding, for cell attachment.

  11. Towards a molecular taxonomic key of the Aurantioideae subfamily using chloroplastic SNP diagnostic markers of the main clades genotyped by competitive allele-specific PCR.

    PubMed

    Oueslati, Amel; Ollitrault, Frederique; Baraket, Ghada; Salhi-Hannachi, Amel; Navarro, Luis; Ollitrault, Patrick

    2016-08-18

    Chloroplast DNA is a primary source of molecular variations for phylogenetic analysis of photosynthetic eukaryotes. However, the sequencing and analysis of multiple chloroplastic regions is difficult to apply to large collections or large samples of natural populations. The objective of our work was to demonstrate that a molecular taxonomic key based on an easy, scalable and low-cost genotyping method could be developed from a set of Single Nucleotide Polymorphisms (SNPs) diagnostic of well-established clades. It was applied to the Aurantioideae subfamily, the largest group of the Rutaceae family that includes the cultivated citrus species. The publicly available nucleotide sequences of eight plastid genomic regions were compared for 79 accessions of the Aurantioideae subfamily to search for SNPs revealing taxonomic differentiation at the inter-tribe, inter-subtribe, inter-genus and interspecific levels. Diagnostic SNPs (DSNPs) were found for 46 of the 54 clade levels analysed. Forty DSNPs were selected to develop KASPar markers and their taxonomic value was tested by genotyping 108 accessions of the Aurantioideae subfamily. Twenty-seven markers diagnostic of 24 clades were validated and they displayed a very high rate of transferability in the Aurantioideae subfamily (only 1.2 % of missing data on average). The UPGMA from the validated markers produced a cladistic organisation that was highly coherent with the previous phylogenetic analysis based on the sequence data of the eight plastid regions. In particular, the monophyletic origin of the "true citrus" genera plus Oxanthera was validated. However, some clarification remains necessary regarding the organisation of the other wild species of the Citreae tribe. 
We validated the concept that with well-established clades, DSNPs can be selected and efficiently transformed into competitive allele-specific PCR markers (KASPar method) allowing cost-effective highly efficient cladistic analysis in large collections at subfamily level. The robustness of this genotyping method is an additional decisive advantage for network collaborative research. The availability of WGS data for the main "true citrus" species should soon make it possible to develop a set of DSNP markers allowing very fine resolution of this very important horticultural group.

  12. The Origins of Transmembrane Ion Channels

    NASA Technical Reports Server (NTRS)

    Pohorille, Andrew; Wilson, Michael A.

    2012-01-01

    Even though membrane proteins that mediate transport of ions and small molecules across cell walls are among the largest and least understood biopolymers in contemporary cells, it is still possible to shed light on their origins and early evolution. The central observation is that transmembrane portions of most ion channels are simply bundles of α-helices. By combining results of experimental and computer simulation studies on synthetic models and natural channels, mostly of non-genomic origin, we show that the emergence of α-helical channels was protobiologically plausible, and did not require highly specific amino acid sequences. Despite their simple structure, such channels could possess properties that, at first sight, appear to require markedly larger complexity. Specifically, we explain how the antiamoebin channels, which are made of identical helices, 16 amino acids in length, achieve efficiency comparable to that of highly evolved channels. We further show that antiamoebin channels are extremely flexible, compared to modern, genetically coded channels. On the basis of our results, we propose that channels evolved further towards high structural complexity because they needed to acquire stable rigid structures and mechanisms for precise regulation rather than improve efficiency. In general, even though architectures of membrane proteins are not nearly as diverse as those of water-soluble proteins, they are sufficiently flexible to adapt readily to the functional demands arising during evolution.

  13. Rapid and reliable diagnostic method to detect Zika virus by real-time fluorescence reverse transcription loop-mediated isothermal amplification.

    PubMed

    Guo, Xu-Guang; Zhou, Yong-Zhuo; Li, Qin; Wang, Wei; Wen, Jin-Zhou; Zheng, Lei; Wang, Qian

    2018-04-18

    To detect Zika virus more rapidly and accurately, we developed a novel method that utilized a real-time fluorescence reverse transcription loop-mediated isothermal amplification (LAMP) technique. The NS5 gene was amplified by a set of six specific primers that recognized six distinct sequences. The amplification process, including 60 min of thermostatic reaction with Bst DNA polymerase following real-time fluorescence reverse transcriptase using genomic Zika virus standard strain (MR766), was conducted through fluorescent signaling. Among the six pairs of primers that we designate here, NS5 was the most efficient with a high sensitivity of up to 3.3 ng/μl and reproducible specificity on eight pathogen samples that were used as negative controls. The real-time fluorescence reverse transcription LAMP detection process can be completed within 35 min. Our study demonstrated that real-time fluorescence reverse transcription LAMP could be a highly beneficial and convenient clinical application for detecting Zika virus due to its high specificity and stability.

  14. Effect of DNA Extraction Methods on the Apparent Structure of Yak Rumen Microbial Communities as Revealed by 16S rDNA Sequencing.

    PubMed

    Chen, Ya-Bing; Lan, Dao-Liang; Tang, Cheng; Yang, Xiao-Nong; Li, Jian

    2015-01-01

    To more efficiently identify the microbial community of the yak rumen, the standardization of DNA extraction is key to ensure fidelity while studying environmental microbial communities. In this study, we systematically compared the efficiency of several extraction methods based on DNA yield, purity, and 16S rDNA sequencing to determine the optimal DNA extraction methods whose DNA products reflect complete bacterial communities. The results indicate that method 6 (hexadecyltrimethylammonium bromide-lysozyme-physical lysis by bead beating) is recommended for the DNA isolation of the rumen microbial community due to its high yield, operational taxonomic unit, bacterial diversity, and excellent cell-breaking capability. The results also indicate that the bead-beating step is necessary to effectively break down the cell walls of all of the microbes, especially Gram-positive bacteria. Another aim of this study was to preliminarily analyze the bacterial community via 16S rDNA sequencing. The microbial community spanned approximately 21 phyla, 35 classes, 75 families, and 112 genera. A comparative analysis showed some variations in the microbial community between yaks and cattle that may be attributed to diet and environmental differences. Interestingly, numerous uncultured or unclassified bacteria were found in yak rumen, suggesting that further research is required to determine the specific functional and ecological roles of these bacteria in yak rumen. In summary, the investigation of the optimal DNA extraction methods and the preliminary evaluation of the bacterial community composition of yak rumen support further identification of the specificity of the rumen microbial community in yak and the discovery of distinct gene resources.

  15. Identification of copy number variants in whole-genome data using Reference Coverage Profiles

    PubMed Central

    Glusman, Gustavo; Severson, Alissa; Dhankani, Varsha; Robinson, Max; Farrah, Terry; Mauldin, Denise E.; Stittrich, Anna B.; Ament, Seth A.; Roach, Jared C.; Brunkow, Mary E.; Bodian, Dale L.; Vockley, Joseph G.; Shmulevich, Ilya; Niederhuber, John E.; Hood, Leroy

    2015-01-01

    The identification of DNA copy numbers from short-read sequencing data remains a challenge for both technical and algorithmic reasons. The raw data for these analyses are measured in tens to hundreds of gigabytes per genome; transmitting, storing, and analyzing such large files is cumbersome, particularly for methods that analyze several samples simultaneously. We developed a very efficient representation of depth of coverage (150–1000× compression) that enables such analyses. Current methods for analyzing variants in whole-genome sequencing (WGS) data frequently miss copy number variants (CNVs), particularly hemizygous deletions in the 1–100 kb range. To fill this gap, we developed a method to identify CNVs in individual genomes, based on comparison to joint profiles pre-computed from a large set of genomes. We analyzed depth of coverage in over 6000 high quality (>40×) genomes. The depth of coverage has strong sequence-specific fluctuations only partially explained by global parameters like %GC. To account for these fluctuations, we constructed multi-genome profiles representing the observed or inferred diploid depth of coverage at each position along the genome. These Reference Coverage Profiles (RCPs) take into account the diverse technologies and pipeline versions used. Normalization of the scaled coverage to the RCP followed by hidden Markov model (HMM) segmentation enables efficient detection of CNVs and large deletions in individual genomes. Use of pre-computed multi-genome coverage profiles improves our ability to analyze each individual genome. We make available RCPs and tools for performing these analyses on personal genomes. We expect the increased sensitivity and specificity for individual genome analysis to be critical for achieving clinical-grade genome interpretation. PMID:25741365
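
    The normalization and segmentation steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three copy-number states, their expected ratios, the Gaussian emission model, and the parameters `sigma` and `stay` are all assumptions chosen for the example, and the per-bin reference depths stand in for a real Reference Coverage Profile.

```python
import math
import statistics

# Hypothetical copy-number states and the coverage ratio expected for each.
STATES = {"loss": 0.5, "normal": 1.0, "gain": 1.5}


def normalize_to_profile(sample_cov, reference_cov):
    """Scale the sample to the reference's median depth, then divide bin by bin.

    Assumes every reference bin has a depth greater than zero.
    """
    scale = statistics.median(reference_cov) / statistics.median(sample_cov)
    return [scale * s / r for s, r in zip(sample_cov, reference_cov)]


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution, used as the emission score."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi_segments(ratios, sigma=0.15, stay=0.99):
    """Return the most likely state name for each bin (Viterbi decoding)."""
    names = list(STATES)
    n = len(names)
    log_stay = math.log(stay)
    log_move = math.log((1 - stay) / (n - 1))

    def trans(i, j):
        return log_stay if i == j else log_move

    # Uniform prior over states for the first bin.
    score = [math.log(1 / n) + log_gauss(ratios[0], STATES[s], sigma) for s in names]
    back = []
    for x in ratios[1:]:
        new_score, ptr = [], []
        for j, s in enumerate(names):
            best_i = max(range(n), key=lambda i: score[i] + trans(i, j))
            new_score.append(score[best_i] + trans(best_i, j) + log_gauss(x, STATES[s], sigma))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [names[i] for i in path]
```

    Dividing by a per-bin reference rather than a single genome-wide mean is what removes the sequence-specific fluctuations; the HMM then only has to separate ratios near 0.5, 1.0 and 1.5.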

  16. mRNA localization to the mitochondrial surface allows the efficient translocation inside the organelle of a nuclear recoded ATP6 protein

    PubMed Central

    Kaltimbacher, Valérie; Bonnet, Crystel; Lecoeuvre, Gaëlle; Forster, Valérie; Sahel, José-Alain; Corral-Debrinski, Marisol

    2006-01-01

    As previously established in yeast, two sequences within mRNAs are responsible for their specific localization to the mitochondrial surface: the region coding for the mitochondrial targeting sequence (MTS) and the 3′UTR. This phenomenon is conserved in human cells. Therefore, we decided to use mRNA localization as a tool to target to mitochondria a protein that is not normally imported. For this purpose, we associated a nuclear recoded ATP6 gene with the mitochondrial targeting sequence and the 3′UTR of the nuclear SOD2 gene, whose mRNA exclusively localizes to the mitochondrial surface in HeLa cells. The ATP6 gene is naturally located in the organelle and encodes a highly hydrophobic protein of the respiratory chain complex V. In this study, we demonstrated that hybrid ATP6 mRNAs, like the endogenous SOD2 mRNA, localize to the mitochondrial surface in human cells. Remarkably, fusion proteins localize to mitochondria in vivo. Indeed, ATP6 precursors synthesized in the cytoplasm were imported into mitochondria in a highly efficient way, especially when both the MTS and the 3′UTR of the SOD2 gene were associated with the re-engineered ATP6 gene. Hence, these data indicate that mRNA targeting to the mitochondrial surface represents an attractive strategy for allowing the mitochondrial import of proteins originally encoded by the mitochondrial genome without any amino acid change in the protein that could interfere with its biologic activity. PMID:16751614

  17. LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.

    PubMed

    El-Metwally, Sara; Zakaria, Magdi; Hamza, Taher

    2016-11-01

    The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since the next-generation sequencing (NGS) technologies were invented. Accordingly, we will be able to generate more and more data at high speed and fixed cost, but lack the computational resources to store, process and analyze it. With error-prone high-throughput NGS reads and genomic repeats, the assembly graph contains a massive number of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine both the power of advanced computing techniques and innovative data structures to encode the assembly graph efficiently in computer memory. LightAssembler is a lightweight assembly algorithm designed to be executed on a desktop machine. It uses a pair of cache oblivious Bloom filters, one holding a uniform sample of [Formula: see text]-spaced sequenced [Formula: see text]-mers and the other holding [Formula: see text]-mers classified as likely correct, using a simple statistical test. LightAssembler contains a light implementation of the graph traversal and simplification modules that achieves comparable assembly accuracy and contiguity to other competing tools. Our method reduces the memory usage by [Formula: see text] compared to the resource-efficient assemblers using benchmark datasets from GAGE and Assemblathon projects. While LightAssembler can be considered as a gap-based sequence assembler, different gap sizes result in an almost constant assembly size and genome coverage. https://github.com/SaraEl-Metwally/LightAssembler. Contact: sarah_almetwally4@mans.edu.eg. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
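
    To make the data structure concrete, here is a minimal sketch of a Bloom filter used as a compact set of k-mers. It is a plain textbook Bloom filter, not the cache-oblivious variant the abstract names, and it is not taken from the LightAssembler code base; the class name, the `kmers` helper, and the choice of SHA-256 with double hashing are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash positions per item.

    Membership tests can give false positives but never false negatives.
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all positions from two 64-bit values.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def kmers(read, k):
    """Yield every overlapping substring of length k from a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))
```

    A caller would add each k-mer of a read with `bf.add(kmer)` and later test `kmer in bf`. The memory cost is fixed by `num_bits` regardless of how many k-mers are inserted, which is the property that makes the structure attractive for large read sets.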

  18. Repairing the sickle cell mutation. I. Specific covalent binding of a photoreactive third strand to the mutated base pair.

    PubMed

    Broitman, S; Amosova, O; Dolinnaya, N G; Fresco, J R

    1999-07-30

    A DNA third strand with a 3'-psoralen substituent was designed to form a triplex with the sequence downstream of the T.A mutant base pair of the human sickle cell beta-globin gene. Triplex-mediated psoralen modification of the mutant T residue was sought as an approach to gene repair. The 24-nucleotide purine-rich target sequence switches from one strand to the other and has four pyrimidine interruptions. Therefore, a third strand sequence favorable to two triplex motifs was used, one parallel and the other antiparallel to it. To cope with the pyrimidine interruptions, which weaken third strand binding, 5-methylcytosine and 5-propynyluracil were used in the third strand. Further, a six residue "hook" complementary to an overhang of a linear duplex target was added to the 5'-end of the third strand via a T(4) linker. In binding to the overhang by Watson-Crick pairing, the hook facilitates triplex formation. This third strand also binds specifically to the target within a supercoiled plasmid. The psoralen moiety at the 3'-end of the third strand forms photoadducts to the targeted T with high efficiency. Such monoadducts are known to preferentially trigger reversion of the mutation by DNA repair enzymes.

  19. Development of SCoT-Based SCAR Marker for Rapid Authentication of Taxus Media.

    PubMed

    Hao, Juan; Jiao, Kaili; Yu, Chenliang; Guo, Hong; Zhu, Yujia; Yang, Xiao; Zhang, Siyang; Zhang, Lei; Feng, Shangguo; Song, Yaobin; Dong, Ming; Wang, Huizhong; Shen, Chenjia

    2018-06-01

    Taxus media is an important species in the family Taxaceae with high medicinal and commercial value. Overexploitation and illegal trade have led T. media to a severe threat of extinction. In addition, T. media and other Taxus species have similar morphological traits and are easily misidentified, particularly during the seedling stage. The purpose of this study is to develop a species-specific marker for T. media. Through a screening of 36 start codon targeted (SCoT) polymorphism primers, among 15 individuals of 4 Taxus species (T. media, T. chinensis, T. cuspidata and T. fuana), a clear species-specific DNA fragment (amplified by primer SCoT3) for T. media was identified. After isolation and sequencing, a DNA sequence with 530 bp was obtained. Based on this DNA fragment, a primer pair for the sequence-characterized amplified region marker was designed and named MHSF/MHSR. PCR analysis with primer pair MHSF/MHSR revealed a clear amplified band for all individuals of T. media but not for T. chinensis, T. cuspidata and T. fuana. Therefore, this marker can be used as a quick, efficient and reliable tool to identify T. media among other related Taxus species. The results of this study will lay an important foundation for the protection and management of T. media as a natural resource.

  20. Location of the unique integration site on an Escherichia coli chromosome by bacteriophage lambda DNA in vivo.

    PubMed

    Tal, Asaf; Arbel-Goren, Rinat; Costantino, Nina; Court, Donald L; Stavans, Joel

    2014-05-20

    The search for specific sequences on long genomes is a key process in many biological contexts. How can specific target sequences be located with high efficiency, within physiologically relevant times? We addressed this question for viral integration, a fundamental mechanism of horizontal gene transfer driving prokaryotic evolution, using the infection of Escherichia coli bacteria with bacteriophage λ and following the establishment of a lysogenic state. Following the targeting process in individual live E. coli cells in real time revealed that λ DNA remains confined near the entry point of a cell following infection. The encounter between the 15-bp-long target sequence on the chromosome and the recombination site on the viral genome is facilitated by the directed motion of bacterial DNA generated during chromosome replication, in conjunction with constrained diffusion of phage DNA. Moving the native bacterial integration site to different locations on the genome and measuring the integration frequency in these strains reveals that the frequencies of the native site and a site symmetric to it relative to the origin are similar, whereas both are significantly higher than when the integration site is moved near the terminus, consistent with the replication-driven mechanism we propose. This novel search mechanism is yet another example of the exquisite coevolution of λ with its host.

  1. Global investigation of protein-protein interactions in yeast Saccharomyces cerevisiae using re-occurring short polypeptide sequences.

    PubMed

    Pitre, S; North, C; Alamgir, M; Jessulat, M; Chan, A; Luo, X; Green, J R; Dumontier, M; Dehne, F; Golshani, A

    2008-08-01

    Protein-protein interaction (PPI) maps provide insight into cellular biology and have received considerable attention in the post-genomic era. While large-scale experimental approaches have generated large collections of experimentally determined PPIs, technical limitations preclude certain PPIs from detection. Recently, we demonstrated that yeast PPIs can be computationally predicted using re-occurring short polypeptide sequences between known interacting protein pairs. However, the computational requirements and low specificity made this method unsuitable for large-scale investigations. Here, we report an improved approach, which exhibits a specificity of approximately 99.95% and executes 16,000 times faster. Importantly, we report the first all-to-all sequence-based computational screen of PPIs in yeast, Saccharomyces cerevisiae, in which we identify 29,589 high confidence interactions of approximately 2 × 10^7 possible pairs. Of these, 14,438 PPIs have not been previously reported and may represent novel interactions. In particular, these results reveal a richer set of membrane protein interactions, not readily amenable to experimental investigations. From the novel PPIs, a novel putative protein complex comprised largely of membrane proteins was revealed. In addition, two novel gene functions were predicted and experimentally confirmed to affect the efficiency of non-homologous end-joining, providing further support for the usefulness of the identified PPIs in biological investigations.
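
    The idea of scoring a candidate pair by short sequences that re-occur in known interacting pairs can be illustrated with a deliberately simplified sketch. It is not the authors' method: it uses exact matches of very short windows, where a real predictor would presumably use longer windows and a similarity measure, and the function names, the window length `w`, and the toy sequences in the usage note are all hypothetical.

```python
def windows(seq, w):
    """Return the set of all length-w substrings of a protein sequence."""
    return {seq[i:i + w] for i in range(len(seq) - w + 1)}


def cooccurrence_score(query_a, query_b, proteins, known_pairs, w=3):
    """Count known interacting pairs that echo the query pair.

    A known pair (x, y) counts once if one partner shares a window with
    query_a and the other partner shares a window with query_b.
    """
    wa = windows(query_a, w)
    wb = windows(query_b, w)
    score = 0
    for x, y in known_pairs:
        for p, q in ((x, y), (y, x)):
            if wa & windows(proteins[p], w) and wb & windows(proteins[q], w):
                score += 1
                break
    return score
```

    For example, with `proteins = {"P1": "MKTAYIAK", "P2": "GGSLLHEW"}` and `known_pairs = [("P1", "P2")]`, the query pair `("AAATAYAA", "CCLLHCC")` scores 1 because `TAY` occurs in P1 and `LLH` occurs in P2. A higher count would be read as stronger evidence for an interaction; where to set the threshold is what trades sensitivity against specificity.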

  2. Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA+ selection versus rRNA depletion.

    PubMed

    Zhao, Shanrong; Zhang, Ying; Gamini, Ramya; Zhang, Baohong; von Schack, David

    2018-03-19

    To allow efficient transcript/gene detection, highly abundant ribosomal RNAs (rRNA) are generally removed from total RNA either by positive polyA+ selection or by rRNA depletion (negative selection) before sequencing. Comparisons between the two methods have been carried out by various groups, but the assessments have relied largely on non-clinical samples. In this study, we evaluated these two RNA sequencing approaches using human blood and colon tissue samples. Our analyses showed that rRNA depletion captured more unique transcriptome features, whereas polyA+ selection outperformed rRNA depletion with higher exonic coverage and better accuracy of gene quantification. For blood- and colon-derived RNAs, we found that 220% and 50% more reads, respectively, would have to be sequenced to achieve the same level of exonic coverage in the rRNA depletion method compared with the polyA+ selection method. Therefore, in most cases we strongly recommend polyA+ selection over rRNA depletion for gene quantification in clinical RNA sequencing. Our evaluation revealed that a small number of lncRNAs and small RNAs made up a large fraction of the reads in the rRNA depletion RNA sequencing data. Thus, we recommend that these RNAs are specifically depleted to improve the sequencing depth of the remaining RNAs.
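
    The read-depth comparison above is simple arithmetic, shown here as a small worked sketch. The percentages (220% for blood, 50% for colon) come from the abstract; the baseline of 10 million polyA+ reads and the function name are hypothetical values chosen only to make the numbers concrete.

```python
def depletion_reads_needed(polya_reads, extra_percent):
    """Reads required with rRNA depletion to match polyA+ exonic coverage.

    "X% more reads" means the baseline multiplied by (100 + X) / 100.
    Integer arithmetic keeps the result exact.
    """
    return polya_reads * (100 + extra_percent) // 100


baseline = 10_000_000  # hypothetical polyA+ sequencing depth
blood = depletion_reads_needed(baseline, 220)   # 32,000,000 reads
colon = depletion_reads_needed(baseline, 50)    # 15,000,000 reads
```

    Under that assumed baseline, matching exonic coverage would take 3.2 times as many reads for blood and 1.5 times as many for colon.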

  3. Fluorescent signatures for variable DNA sequences

    PubMed Central

    Rice, John E.; Reis, Arthur H.; Rice, Lisa M.; Carver-Brown, Rachel K.; Wangh, Lawrence J.

    2012-01-01

    Life abounds with genetic variations writ in sequences that are often only a few hundred nucleotides long. Rapid detection of these variations for identification of genetic diseases, pathogens and organisms has become the mainstay of molecular science and medicine. This report describes a new, highly informative closed-tube polymerase chain reaction (PCR) strategy for analysis of both known and unknown sequence variations. It combines efficient quantitative amplification of single-stranded DNA targets through LATE-PCR with sets of Lights-On/Lights-Off probes that hybridize to their target sequences over a broad temperature range. Contiguous pairs of Lights-On/Lights-Off probes of the same fluorescent color are used to scan hundreds of nucleotides for the presence of mutations. Sets of probes in different colors can be combined in the same tube to analyze even longer single-stranded targets. Each set of hybridized Lights-On/Lights-Off probes generates a composite fluorescent contour, which is mathematically converted to a sequence-specific fluorescent signature. The versatility and broad utility of this new technology is illustrated in this report by characterization of variant sequences in three different DNA targets: the rpoB gene of Mycobacterium tuberculosis, a sequence in the mitochondrial cytochrome C oxidase subunit 1 gene of nematodes and the V3 hypervariable region of the bacterial 16S ribosomal RNA gene. We anticipate widespread use of these technologies for diagnostics, species identification and basic research. PMID:22879378

  4. Spore Heat Activation Requirements and Germination Responses Correlate with Sequences of Germinant Receptors and with the Presence of a Specific spoVA2mob Operon in Foodborne Strains of Bacillus subtilis.

    PubMed

    Krawczyk, Antonina O; de Jong, Anne; Omony, Jimmy; Holsappel, Siger; Wells-Bennik, Marjon H J; Kuipers, Oscar P; Eijlander, Robyn T

    2017-04-01

    Spore heat resistance, germination, and outgrowth are problematic bacterial properties compromising food safety and quality. Large interstrain variation in these properties makes prediction and control of spore behavior challenging. High-level heat resistance and slow germination of spores of some natural Bacillus subtilis isolates, encountered in foods, have been attributed to the occurrence of the spoVA2mob operon carried on the Tn1546 transposon. In this study, we further investigate the correlation between the presence of this operon in high-level-heat-resistant spores and their germination efficiencies before and after exposure to various sublethal heat treatments (heat activation, or HA), which are known to significantly improve spore responses to nutrient germinants. We show that high-level-heat-resistant spores harboring spoVA2mob required higher HA temperatures for efficient germination than spores lacking spoVA2mob. The optimal spore HA requirements additionally depended on the nutrients used to trigger germination, l-alanine (l-Ala), or a mixture of l-asparagine, d-glucose, d-fructose, and K+ (AGFK). The distinct HA requirements of these two spore germination pathways are likely related to differences in properties of specific germinant receptors. Moreover, spores that germinated inefficiently in AGFK contained specific changes in sequences of the GerB and GerK germinant receptors, which are involved in this germination response. In contrast, no relation was found between transcription levels of main germination genes and spore germination phenotypes. The findings presented in this study have great implications for practices in the food industry, where heat treatments are commonly used to inactivate pathogenic and spoilage microbes, including bacterial spore formers. 
IMPORTANCE This study describes a strong variation in spore germination capacities and requirements for a heat activation treatment, i.e., an exposure to sublethal heat that increases spore responsiveness to nutrient germination triggers, among 17 strains of B. subtilis, including 9 isolates from spoiled food products. Spores of industrial foodborne isolates exhibited, on average, less efficient and slower germination responses and required more severe heat activation than spores from other sources. High heat activation requirements and inefficient, slow germination correlated with elevated resistance of spores to heat and with specific genetic features, indicating a common genetic basis of these three phenotypic traits. Clearly, interstrain variation and numerous factors that shape spore germination behavior challenge standardization of methods to recover highly heat-resistant spores from the environment and have an impact on the efficacy of preservation techniques used by the food industry to control spores. Copyright © 2017 American Society for Microbiology.

  5. Spore Heat Activation Requirements and Germination Responses Correlate with Sequences of Germinant Receptors and with the Presence of a Specific spoVA2mob Operon in Foodborne Strains of Bacillus subtilis

    PubMed Central

    Krawczyk, Antonina O.; de Jong, Anne; Omony, Jimmy; Holsappel, Siger; Wells-Bennik, Marjon H. J.; Eijlander, Robyn T.

    2017-01-01

    ABSTRACT Spore heat resistance, germination, and outgrowth are problematic bacterial properties compromising food safety and quality. Large interstrain variation in these properties makes prediction and control of spore behavior challenging. High-level heat resistance and slow germination of spores of some natural Bacillus subtilis isolates, encountered in foods, have been attributed to the occurrence of the spoVA2mob operon carried on the Tn1546 transposon. In this study, we further investigate the correlation between the presence of this operon in high-level-heat-resistant spores and their germination efficiencies before and after exposure to various sublethal heat treatments (heat activation, or HA), which are known to significantly improve spore responses to nutrient germinants. We show that high-level-heat-resistant spores harboring spoVA2mob required higher HA temperatures for efficient germination than spores lacking spoVA2mob. The optimal spore HA requirements additionally depended on the nutrients used to trigger germination, l-alanine (l-Ala), or a mixture of l-asparagine, d-glucose, d-fructose, and K+ (AGFK). The distinct HA requirements of these two spore germination pathways are likely related to differences in properties of specific germinant receptors. Moreover, spores that germinated inefficiently in AGFK contained specific changes in sequences of the GerB and GerK germinant receptors, which are involved in this germination response. In contrast, no relation was found between transcription levels of main germination genes and spore germination phenotypes. The findings presented in this study have great implications for practices in the food industry, where heat treatments are commonly used to inactivate pathogenic and spoilage microbes, including bacterial spore formers. 
IMPORTANCE This study describes a strong variation in spore germination capacities and requirements for a heat activation treatment, i.e., an exposure to sublethal heat that increases spore responsiveness to nutrient germination triggers, among 17 strains of B. subtilis, including 9 isolates from spoiled food products. Spores of industrial foodborne isolates exhibited, on average, less efficient and slower germination responses and required more severe heat activation than spores from other sources. High heat activation requirements and inefficient, slow germination correlated with elevated resistance of spores to heat and with specific genetic features, indicating a common genetic basis of these three phenotypic traits. Clearly, interstrain variation and numerous factors that shape spore germination behavior challenge standardization of methods to recover highly heat-resistant spores from the environment and have an impact on the efficacy of preservation techniques used by the food industry to control spores. PMID:28130296

  6. An innovative SNP genotyping method adapting to multiple platforms and throughputs.

    PubMed

    Long, Y M; Chao, W S; Ma, G J; Xu, S S; Qi, L L

    2017-03-01

    An innovative genotyping method designated as semi-thermal asymmetric reverse PCR (STARP) was developed for genotyping individual SNPs with improved accuracy, flexible throughputs, low operational costs, and high platform compatibility. Multiplex chip-based technology for genome-scale genotyping of single nucleotide polymorphisms (SNPs) has made great progress in the past two decades. However, PCR-based genotyping of individual SNPs still remains problematic in accuracy, throughput, simplicity, and/or operational costs as well as the compatibility with multiple platforms. Here, we report a novel SNP genotyping method designated semi-thermal asymmetric reverse PCR (STARP). In this method, genotyping assay was performed under unique PCR conditions using two universal priming element-adjustable primers (PEA-primers) and one group of three locus-specific primers: two asymmetrically modified allele-specific primers (AMAS-primers) and their common reverse primer. The two AMAS-primers each were substituted one base in different positions at their 3' regions to significantly increase the amplification specificity of the two alleles and tailed at 5' ends to provide priming sites for PEA-primers. The two PEA-primers were developed for common use in all genotyping assays to stringently target the PCR fragments generated by the two AMAS-primers with similar PCR efficiencies and for flexible detection using either gel-free fluorescence signals or gel-based size separation. The state-of-the-art primer design and unique PCR conditions endowed STARP with all the major advantages of high accuracy, flexible throughputs, simple assay design, low operational costs, and platform compatibility. In addition to SNPs, STARP can also be employed in genotyping of indels (insertion-deletion polymorphisms). 
As vast variations in DNA sequences are being unearthed by many genome sequencing projects and genotyping by sequencing, STARP will have wide applications across all biological organisms in agriculture, medicine, and forensics.

  7. Definition of cis-acting elements regulating expression of the Drosophila melanogaster ninaE opsin gene by oligonucleotide-directed mutagenesis

    PubMed Central

    Mismer, D.; Rubin, G. M.

    1989-01-01

    We have analyzed the cis-acting regulatory sequences of the Rh1 (ninaE) gene in Drosophila melanogaster by P-element-mediated germline transformation of indicator genes transcribed from mutant ninaE promoter sequences. We have previously shown that a 200-bp region extending from -120 to +67 relative to the transcription start site is sufficient to obtain eye-specific expression from the ninaE promoter. In the present study, 22 different 4-13-bp sequences in the -120/+67 promoter region were altered by oligonucleotide-directed mutagenesis. Several of these sequences were found to be required for proper promoter function; two of these are conserved in the promoter of the homologous gene isolated from the related species Drosophila virilis. Alteration of a conserved 9-bp sequence results in aberrant, low level expression in the body. Alteration of a separate 11-bp sequence, found in the promoter regions of several photoreceptor-specific genes of Drosophila, results in an approximately 15-fold reduction in promoter efficiency but without apparent alteration of tissue-specificity. A protein factor capable of interacting with this 11-bp sequence has been detected by DNaseI footprinting in embryonic nuclear extracts. Finally, we have further characterized two separable enhancer sequences previously shown to be required for normal levels of expression from this promoter. PMID:2521839

  8. Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition

    PubMed Central

    Rollie, Clare; Schneider, Stefanie; Brinkmann, Anna Sophie; Bolt, Edward L; White, Malcolm F

    2015-01-01

    The adaptive prokaryotic immune system CRISPR-Cas provides RNA-mediated protection from invading genetic elements. The fundamental basis of the system is the ability to capture small pieces of foreign DNA for incorporation into the genome at the CRISPR locus, a process known as Adaptation, which is dependent on the Cas1 and Cas2 proteins. We demonstrate that Cas1 catalyses an efficient trans-esterification reaction on branched DNA substrates, which represents the reverse- or disintegration reaction. Cas1 from both Escherichia coli and Sulfolobus solfataricus display sequence specific activity, with a clear preference for the nucleotides flanking the integration site at the leader-repeat 1 boundary of the CRISPR locus. Cas2 is not required for this activity and does not influence the specificity. This suggests that the inherent sequence specificity of Cas1 is a major determinant of the adaptation process. DOI: http://dx.doi.org/10.7554/eLife.08716.001 PMID:26284603

  9. Structurally detailed coarse-grained model for Sec-facilitated co-translational protein translocation and membrane integration

    PubMed Central

    Miller, Thomas F.

    2017-01-01

    We present a coarse-grained simulation model that is capable of simulating the minute-timescale dynamics of protein translocation and membrane integration via the Sec translocon, while retaining sufficient chemical and structural detail to capture many of the sequence-specific interactions that drive these processes. The model includes accurate geometric representations of the ribosome and Sec translocon, obtained directly from experimental structures, and interactions parameterized from nearly 200 μs of residue-based coarse-grained molecular dynamics simulations. A protocol for mapping amino-acid sequences to coarse-grained beads enables the direct simulation of trajectories for the co-translational insertion of arbitrary polypeptide sequences into the Sec translocon. The model reproduces experimentally observed features of membrane protein integration, including the efficiency with which polypeptide domains integrate into the membrane, the variation in integration efficiency upon single amino-acid mutations, and the orientation of transmembrane domains. The central advantage of the model is that it connects sequence-level protein features to biological observables and timescales, enabling direct simulation for the mechanistic analysis of co-translational integration and for the engineering of membrane proteins with enhanced membrane integration efficiency. PMID:28328943

  10. Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.

    PubMed

    Wu, Jia Qian; Du, Jiang; Rozowsky, Joel; Zhang, Zhengdong; Urban, Alexander E; Euskirchen, Ghia; Weissman, Sherman; Gerstein, Mark; Snyder, Michael

    2008-01-03

    Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there is still much uncertainty regarding precisely what portion of the genome is transcribed, the exact structures of these novel transcripts, and the levels of the transcripts produced. We have interrogated the transcribed loci in 420 selected ENCyclopedia Of DNA Elements (ENCODE) regions using rapid amplification of cDNA ends (RACE) sequencing. We analyzed annotated known gene regions, but primarily we focused on novel transcriptionally active regions (TARs), which were previously identified by high-density oligonucleotide tiling arrays and on random regions that were not believed to be transcribed. We found RACE sequencing to be very sensitive and were able to detect low levels of transcripts in specific cell types that were not detectable by microarrays. We also observed many instances of sense-antisense transcripts; further analysis suggests that many of the antisense transcripts (but not all) may be artifacts generated from the reverse transcription reaction. Our results show that the majority of the novel TARs analyzed (60%) are connected to other novel TARs or known exons. Of previously unannotated random regions, 17% were shown to produce overlapping transcripts. Furthermore, it is estimated that 9% of the novel transcripts encode proteins. We conclude that RACE sequencing is an efficient, sensitive, and highly accurate method for characterization of the transcriptome of specific cell/tissue types. Using this method, it appears that much of the genome is represented in polyA+ RNA. Moreover, a fraction of the novel RNAs can encode protein and are likely to be functional.

  11. Highly selective detection of single-nucleotide polymorphisms using a quartz crystal microbalance biosensor based on the toehold-mediated strand displacement reaction.

    PubMed

    Wang, Dingzhong; Tang, Wei; Wu, Xiaojie; Wang, Xinyi; Chen, Gengjia; Chen, Qiang; Li, Na; Liu, Feng

    2012-08-21

    Toehold-mediated strand displacement reaction (SDR) is first introduced to develop a simple quartz crystal microbalance (QCM) biosensor without an enzyme or label at normal temperature for highly selective and sensitive detection of single-nucleotide polymorphism (SNP) in the p53 tumor suppressor gene. A hairpin capture probe with an external toehold is designed and immobilized on the gold electrode surface of QCM. A successive SDR is initiated by the target sequence hybridization with the toehold domain and ends with the unfolding of the capture probe. Finally, the open-loop capture probe hybridizes with the streptavidin-coupled reporter probe as an efficient mass amplifier to enhance the QCM signal. The proposed biosensor displays remarkable specificity to target the p53 gene fragment against single-base mutant sequences (e.g., the largest discrimination factor is 63 to C-C mismatch) and high sensitivity with the detection limit of 0.3 nM at 20 °C. As the crucial component of the fabricated biosensor for providing the high discrimination capability, the design rationale of the capture probe is further verified by fluorescence sensing and atomic force microscopy imaging. Additionally, a recovery of 84.1% is obtained when detecting the target sequence in spiked HeLa cells lysate, demonstrating the feasibility of employing this biosensor in detecting SNPs in biological samples.

  12. A novel anti-aldolase C antibody specifically interacts with residues 85-102 of the protein.

    PubMed

    Langellotti, Simona; Romano, Maurizio; Guarnaccia, Corrado; Granata, Vincenzo; Orrù, Stefania; Zagari, Adriana; Baralle, Francisco E; Salvatore, Francesco

    2014-01-01

    Aldolase C is a brain-specific glycolytic isozyme whose complete repertoire of functions is obscure. This lack of knowledge can be addressed using molecular tools that discriminate the protein from the homologous, ubiquitous paralog aldolase A. The anti-aldolase C antibodies currently available are polyclonal and not highly specific. We obtained the novel monoclonal antibody 9F against human aldolase C, characterized its isoform specificity and tested its performance. First, we investigated the specificity of 9F for aldolase C. Then, using bioinformatic tools coupled to molecular cloning and chemical synthesis approaches, we produced truncated human aldolase C fragments, and assessed 9F binding to these fragments by western blot and ELISA assays. This strategy revealed that residues 85-102 harbor the epitope-containing region recognized by 9F. The efficiency of 9F was demonstrated also for immunoprecipitation assays. Finally, surface plasmon resonance revealed that the protein has a high affinity toward the epitope-containing peptide. Taken together, our findings show that epitope recognition is sequence-driven and is independent of the three-dimensional structure. In conclusion, given its specific molecular interaction, 9F is a novel and powerful tool to investigate aldolase C's functions in the brain.

  13. Development of a high efficiency integration system and promoter library for rapid modification of Pseudomonas putida KT2440

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Elmore, Joshua R.; Furches, Anna; Wolff, Gara N.

    Pseudomonas putida strains are highly robust bacteria known for their ability to efficiently utilize a variety of carbon sources, including aliphatic and aromatic hydrocarbons. Recently, P. putida has been engineered to valorize the lignin stream of a lignocellulosic biomass pretreatment process. Nonetheless, when compared to platform organisms such as Escherichia coli, the toolkit for engineering P. putida is underdeveloped. Heterologous gene expression in particular is problematic. Plasmid instability and copy number variance provide challenges for replicative plasmids, while use of homologous recombination for insertion of DNA into the chromosome is slow and laborious. Furthermore, heterologous expression efforts to date typically rely on overexpression of exogenous pathways using a handful of poorly characterized promoters. In order to improve the P. putida toolkit, we developed a rapid genome integration system using the site-specific recombinase from bacteriophage Bxb1 to enable rapid, high efficiency integration of DNA into the P. putida chromosome. We also developed a library of synthetic promoters with various UP elements, -35 sequences, and -10 sequences, as well as different ribosomal binding sites. We tested these promoters using a fluorescent reporter gene, mNeonGreen, to characterize the strength of each promoter, and identified UP-element-promoter-ribosomal binding sites combinations capable of driving a ~150-fold range of protein expression levels. One additional integrating vector was developed that confers more robust kanamycin resistance when integrated at single copy into the chromosome. These genome integration and reporter systems are extensible for testing other genetic parts, such as examining terminator strength, and will allow rapid integration of heterologous pathways for metabolic engineering.

  14. Analysis of the enzymatic formation of citral in the glands of sweet basil.

    PubMed

    Iijima, Yoko; Wang, Guodong; Fridman, Eyal; Pichersky, Eran

    2006-04-15

    Basil glands of the Sweet Dani cultivar contain high levels of citral, a mixture of geranial and its cis-isomer neral, as well as low levels of geraniol and nerol. We have previously reported the identification of a cDNA from Sweet Dani that encodes an enzyme responsible for the formation of geraniol from geranyl diphosphate in the glands, and that these glands cannot synthesize nerol directly from geranyl diphosphate. Here, we report the identification of two basil cDNAs encoding NADP+-dependent dehydrogenases that can use geraniol as the substrate. One cDNA, designated CAD1, represents a gene whose expression is highly specific to gland cells of all three basil cultivars examined, regardless of their citral content, and encodes an enzyme with high sequence similarity to known cinnamyl alcohol dehydrogenases (CADs). The enzyme encoded by CAD1 reversibly oxidizes geraniol to produce geranial (which reversibly isomerizes to neral via keto-enol tautomerization) at half the efficiency compared with its activity with cinnamyl alcohol. CAD1 does not use nerol and neral as substrates. A second cDNA, designated GEDH1, encodes an enzyme with sequence similarity to CAD1 that is capable of reversibly oxidizing geraniol and nerol in equal efficiency, and prolonged incubation of geraniol with GEDH1 in vitro produces not only geranial and neral, but also nerol. GEDH1 is also active, although at a lower efficiency, with cinnamyl alcohol. However, GEDH1 is expressed at low levels in glands of all cultivars compared with its expression in leaves. These and additional data presented indicate that basil glands may contain additional dehydrogenases capable of oxidizing geraniol.

  15. Theory on the mechanism of site-specific DNA-protein interactions in the presence of traps

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-08-01

    The speed of site-specific binding of transcription factor (TF) proteins with genomic DNA seems to be strongly retarded by the randomly occurring sequence traps. Traps are those DNA sequences sharing significant similarity with the original specific binding sites (SBSs). It is an intriguing question how the naturally occurring TFs and their SBSs are designed to manage the retarding effects of such randomly occurring traps. We develop a simple random walk model on the site-specific binding of TFs with genomic DNA in the presence of sequence traps. Our dynamical model predicts that (a) the retarding effects of traps will be minimum when the traps are arranged around the SBS such that there is a negative correlation between the binding strength of TFs with traps and the distance of traps from the SBS and (b) the retarding effects of sequence traps can be appeased by the condensed conformational state of DNA. Our computational analysis results on the distribution of sequence traps around the putative binding sites of various TFs in mouse and human genome agree well with the theoretical predictions. We propose that the distribution of traps can be used as an additional metric to efficiently identify the SBSs of TFs on genomic DNA.

  16. High-resolution melting analysis for bird sexing: a successful approach to molecular sex identification using different biological samples.

    PubMed

    Morinha, Francisco; Travassos, Paulo; Seixas, Fernanda; Santos, Nuno; Sargo, Roberto; Sousa, Luís; Magalhães, Paula; Cabral, João A; Bastos, Estela

    2013-05-01

    High-resolution melting (HRM) analysis is a very attractive and flexible advanced post-PCR method with high sensitivity/specificity for simple, fast and cost-effective genotyping based on the detection of specific melting profiles of PCR products. Next generation real-time PCR systems, along with improved saturating DNA-binding dyes, enable the direct acquisition of HRM data after quantitative PCR. Melting behaviour is particularly influenced by the length, nucleotide sequence and GC content of the amplicons. This method is expanding rapidly in several research areas such as human genetics, reproductive biology, microbiology and ecology/conservation of wild populations. Here we have developed a successful HRM protocol for avian sex identification based on the amplification of sex-specific CHD1 fragments. The melting curve patterns allowed efficient sexual differentiation of 111 samples analysed (plucked feathers, muscle tissues, blood and oral cavity epithelial cells) of 14 bird species. In addition, we sequenced the amplified regions of the CHD1 gene and demonstrated the usefulness of this strategy for the genotype discrimination of various amplicons (CHD1Z and CHD1W), which have small size differences, ranging from 2 bp to 44 bp. The established methodology clearly revealed the advantages (e.g. closed-tube system, high sensitivity and rapidity) of a simple HRM assay for accurate sex differentiation of the species under study. The requirements, strengths and limitations of the method are addressed to provide a simple guide for its application in the field of molecular sexing of birds. The high sensitivity and resolution relative to previous real-time PCR methods makes HRM analysis an excellent approach for improving advanced molecular methods for bird sexing. © 2013 Blackwell Publishing Ltd.

  17. Measuring the labeling efficiency of pseudocontinuous arterial spin labeling.

    PubMed

    Chen, Zhensen; Zhang, Xingxing; Yuan, Chun; Zhao, Xihai; van Osch, Matthias J P

    2017-05-01

    Optimization and validation of a sequence for measuring the labeling efficiency of pseudocontinuous arterial spin labeling (pCASL) perfusion MRI. The proposed sequence consists of a labeling module and a single slice Look-Locker echo planar imaging readout. A model-based algorithm was used to calculate labeling efficiency from the signal acquired from the main brain-feeding arteries. Stability of the labeling efficiency measurement was evaluated with regard to the use of cardiac triggering, flow compensation and vein signal suppression. Accuracy of the measurement was assessed by comparing the measured labeling efficiency to mean brain pCASL signal intensity over a wide range of flip angles as applied in the pCASL labeling. Simulations show that the proposed algorithm can effectively calculate labeling efficiency when correcting for T1 relaxation of the blood spins. Use of cardiac triggering and vein signal suppression improved stability of the labeling efficiency measurement, while flow compensation resulted in little improvement. The measured labeling efficiency was found to be linearly (R = 0.973; P < 0.001) related to brain pCASL signal intensity over a wide range of pCASL flip angles. The optimized labeling efficiency sequence provides robust artery-specific labeling efficiency measurement within a short acquisition time (∼30 s), thereby enabling improved accuracy of pCASL CBF quantification. Magn Reson Med 77:1841-1852, 2017. © 2016 International Society for Magnetic Resonance in Medicine.

  18. Characterisation of a DNA sequence element that directs Dictyostelium stalk cell-specific gene expression.

    PubMed

    Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J

    2000-12-01

    The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter-region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element, that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant, therefore Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, the activity of this GA binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest, that in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.

  19. Groupwise registration of cardiac perfusion MRI sequences using normalized mutual information in high dimension

    NASA Astrophysics Data System (ADS)

    Hamrouni, Sameh; Rougon, Nicolas; Prêteux, Françoise

    2011-03-01

    In perfusion MRI (p-MRI) exams, short-axis (SA) image sequences are captured at multiple slice levels along the long-axis of the heart during the transit of a vascular contrast agent (Gd-DTPA) through the cardiac chambers and muscle. Compensating cardio-thoracic motions is a requirement for enabling computer-aided quantitative assessment of myocardial ischaemia from contrast-enhanced p-MRI sequences. The classical paradigm consists of registering each sequence frame on a reference image using some intensity-based matching criterion. In this paper, we introduce a novel unsupervised method for the spatio-temporal groupwise registration of cardiac p-MRI exams based on normalized mutual information (NMI) between high-dimensional feature distributions. Here, local contrast enhancement curves are used as a dense set of spatio-temporal features, and statistically matched through variational optimization to a target feature distribution derived from a registered reference template. The hard issue of probability density estimation in high-dimensional state spaces is bypassed by using consistent geometric entropy estimators, allowing NMI to be computed directly from feature samples. Specifically, a computationally efficient kth-nearest neighbor (kNN) estimation framework is retained, leading to closed-form expressions for the gradient flow of NMI over finite- and infinite-dimensional motion spaces. This approach is applied to the groupwise alignment of cardiac p-MRI exams using a free-form Deformation (FFD) model for cardio-thoracic motions. Experiments on simulated and natural datasets suggest its accuracy and robustness for registering p-MRI exams comprising more than 30 frames.

  20. Specific Increase of Protein Levels by Enhancing Translation Using Antisense Oligonucleotides Targeting Upstream Open Frames.

    PubMed

    Liang, Xue-Hai; Shen, Wen; Crooke, Stanley T

    2017-01-01

    A number of diseases are caused by low levels of key proteins; therefore, increasing the amount of specific proteins in human bodies is of therapeutic interest. Protein expression is downregulated by some structural or sequence elements present in the 5' UTR of mRNAs, such as upstream open reading frames (uORF). Translation initiation from uORF(s) reduces translation from the downstream primary ORF encoding the main protein product in the same mRNA, leading to a less efficient protein expression. Therefore, it is possible to use antisense oligonucleotides (ASOs) to specifically inhibit translation of the uORF by base-pairing with the uAUG region of the mRNA, redirecting translation machinery to initiate from the primary AUG site. Here we review the recent findings that translation of specific mRNAs can be enhanced using ASOs targeting uORF regions. Appropriately designed and optimized ASOs are highly specific, and they act in a sequence- and position-dependent manner, with very minor off-target effects. Protein levels can be increased using this approach in different types of human and mouse cells, and, importantly, also in mice. Since uORFs are present in around half of human mRNAs, the uORF-targeting ASOs may thus have valuable potential as research tools and as therapeutics to increase the levels of proteins for a variety of genes.

  1. Efficient gene targeting by homology-directed repair in rat zygotes using TALE nucleases.

    PubMed

    Remy, Séverine; Tesson, Laurent; Menoret, Séverine; Usal, Claire; De Cian, Anne; Thepenier, Virginie; Thinard, Reynald; Baron, Daniel; Charpentier, Marine; Renaud, Jean-Baptiste; Buelow, Roland; Cost, Gregory J; Giovannangeli, Carine; Fraichard, Alexandre; Concordet, Jean-Paul; Anegon, Ignacio

    2014-08-01

    The generation of genetically modified animals is important for both research and commercial purposes. The rat is an important model organism that until recently lacked efficient genetic engineering tools. Sequence-specific nucleases, such as ZFNs, TALE nucleases, and CRISPR/Cas9 have allowed the creation of rat knockout models. Genetic engineering by homology-directed repair (HDR) is utilized to create animals expressing transgenes in a controlled way and to introduce precise genetic modifications. We applied TALE nucleases and donor DNA microinjection into zygotes to generate HDR-modified rats with large new sequences introduced into three different loci with high efficiency (0.62%-5.13% of microinjected zygotes). Two of these loci (Rosa26 and Hprt1) are known to allow robust and reproducible transgene expression and were targeted for integration of a GFP expression cassette driven by the CAG promoter. GFP-expressing embryos and four Rosa26 GFP rat lines analyzed showed strong and widespread GFP expression in most cells of all analyzed tissues. The third targeted locus was Ighm, where we performed successful exon exchange of rat exon 2 for the human one. At all three loci we observed HDR only when using linear and not circular donor DNA. Mild hypothermic (30°C) culture of zygotes after microinjection increased HDR efficiency for some loci. Our study demonstrates that TALE nuclease and donor DNA microinjection into rat zygotes results in efficient and reproducible targeted donor integration by HDR. This allowed creation of genetically modified rats in a work-, cost-, and time-effective manner. © 2014 Remy et al.; Published by Cold Spring Harbor Laboratory Press.

  2. Efficient gene targeting by homology-directed repair in rat zygotes using TALE nucleases

    PubMed Central

    Remy, Séverine; Tesson, Laurent; Menoret, Séverine; Usal, Claire; De Cian, Anne; Thepenier, Virginie; Thinard, Reynald; Baron, Daniel; Charpentier, Marine; Renaud, Jean-Baptiste; Buelow, Roland; Cost, Gregory J.; Giovannangeli, Carine; Fraichard, Alexandre; Concordet, Jean-Paul; Anegon, Ignacio

    2014-01-01

    The generation of genetically modified animals is important for both research and commercial purposes. The rat is an important model organism that until recently lacked efficient genetic engineering tools. Sequence-specific nucleases, such as ZFNs, TALE nucleases, and CRISPR/Cas9 have allowed the creation of rat knockout models. Genetic engineering by homology-directed repair (HDR) is utilized to create animals expressing transgenes in a controlled way and to introduce precise genetic modifications. We applied TALE nucleases and donor DNA microinjection into zygotes to generate HDR-modified rats with large new sequences introduced into three different loci with high efficiency (0.62%–5.13% of microinjected zygotes). Two of these loci (Rosa26 and Hprt1) are known to allow robust and reproducible transgene expression and were targeted for integration of a GFP expression cassette driven by the CAG promoter. GFP-expressing embryos and four Rosa26 GFP rat lines analyzed showed strong and widespread GFP expression in most cells of all analyzed tissues. The third targeted locus was Ighm, where we performed successful exon exchange of rat exon 2 for the human one. At all three loci we observed HDR only when using linear and not circular donor DNA. Mild hypothermic (30°C) culture of zygotes after microinjection increased HDR efficiency for some loci. Our study demonstrates that TALE nuclease and donor DNA microinjection into rat zygotes results in efficient and reproducible targeted donor integration by HDR. This allowed creation of genetically modified rats in a work-, cost-, and time-effective manner. PMID:24989021

  3. Identification of high-specificity H-NS binding site in LEE5 promoter of enteropathogenic Escherichia coli (EPEC).

    PubMed

    Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E

    2014-07-01

    Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that mutations at around -138 were specifically defective in regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.

  4. Discontinuous pH gradient-mediated separation of TiO2-enriched phosphopeptides

    PubMed Central

    Park, Sung-Soo; Maudsley, Stuart

    2010-01-01

    Global profiling of phosphoproteomes has proven a great challenge due to the relatively low stoichiometry of protein phosphorylation and poor ionization efficiency in mass spectrometers. Effective, physiologically-relevant, phosphoproteome research relies on the efficient phosphopeptide enrichment from complex samples. Immobilized metal affinity chromatography and titanium dioxide chromatography (TOC) can greatly assist selective phosphopeptide enrichment. However, the complexity of resultant enriched samples is often still high, suggesting that further separation of enriched phosphopeptides is required. We have developed a pH-gradient elution technique for enhanced phosphopeptide identification in conjunction with TOC. Using this process, we have demonstrated its superiority to the traditional ‘one-pot’ strategies for differential protein identification. Our technique generated a highly specific separation of phosphopeptides by an applied pH-gradient between 9.2 and 11.3. The most efficient elution range for high-resolution phosphopeptide separation was between pH 9.2 and 9.4. High-resolution separation of multiply-phosphorylated peptides was primarily achieved using elution ranges > pH 9.4. Investigation of phosphopeptide sequences identified in each pH fraction indicated that phosphopeptides with phosphorylated residues proximal to acidic residues, including glutamic acid, aspartic acid, and other phosphorylated residues, were preferentially eluted at higher pH values. PMID:20946866

  5. Shine-Dalgarno sequence enhances the efficiency of lacZ repression by artificial anti-lac antisense RNAs in Escherichia coli.

    PubMed

    Stefan, Alessandra; Schwarz, Flavio; Bressanin, Daniela; Hochkoeppler, Alejandro

    2010-11-01

    Silencing of the lacZ gene in Escherichia coli was attempted by means of the expression of antisense RNAs (asRNAs) in vivo. A short fragment of lacZ was cloned into the pBAD expression vector, in reverse orientation, using the EcoRI and PstI restriction sites. This construct (pBAD-Zcal1) was used to transform E. coli cells, and the antisense transcription was induced simply by adding arabinose to the culture medium. We demonstrated that the Zcal1 asRNA effectively silenced lacZ using β-galactosidase activity determinations, SDS-PAGE, and Western blotting. Because the concentration of the lac mRNA was always high in cells that expressed Zcal1, we hypothesize that this antisense acts by inhibiting messenger translation. Similar analyses, performed with a series of site-specific Zcal1 mutants, showed that the Shine-Dalgarno sequence, which is conferred by the pBAD vector, is an essential requisite for silencing competence. Indeed, the presence of the intact Shine-Dalgarno sequence positively affects asRNA stability and, hence, silencing effectiveness. Our observations will contribute to the understanding of the main determinants of silencing as exerted by asRNAs as well as provide useful support for the design of robust and efficient prokaryotic gene silencers. Copyright © 2010 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  6. Analysis of B Cell Repertoire Dynamics Following Hepatitis B Vaccination in Humans, and Enrichment of Vaccine-specific Antibody Sequences.

    PubMed

    Galson, Jacob D; Trück, Johannes; Fowler, Anna; Clutterbuck, Elizabeth A; Münz, Márton; Cerundolo, Vincenzo; Reinhard, Claudia; van der Most, Robbert; Pollard, Andrew J; Lunter, Gerton; Kelly, Dominic F

    2015-12-01

    Generating a diverse B cell immunoglobulin repertoire is essential for protection against infection. The repertoire in humans can now be comprehensively measured by high-throughput sequencing. Using hepatitis B vaccination as a model, we determined how the total immunoglobulin sequence repertoire changes following antigen exposure in humans, and compared this to sequences from vaccine-specific sorted cells. Clonal sequence expansions were seen 7 days after vaccination, which correlated with vaccine-specific plasma cell numbers. These expansions caused an increase in mutation, and a decrease in diversity and complementarity-determining region 3 sequence length in the repertoire. We also saw an increase in sequence convergence between participants 14 and 21 days after vaccination, coinciding with an increase of vaccine-specific memory cells. These features allowed development of a model for in silico enrichment of vaccine-specific sequences from the total repertoire. Identifying antigen-specific sequences from total repertoire data could aid our understanding of B cell driven immunity, and be used for disease diagnostics and vaccine evaluation.

  7. Tuning the specificity of a Two-in-One Fab against three angiogenic antigens by fully utilizing the information of deep mutational scanning.

    PubMed

    Koenig, Patrick; Sanowar, Sarah; Lee, Chingwei V; Fuh, Germaine

    Monoclonal antibodies developed for therapeutic or diagnostic purposes need to demonstrate highly defined binding specificity profiles. Engineering of an antibody to enhance or reduce binding to related antigens is often needed to achieve the desired biologic activity without safety concerns. Here, we describe a deep sequencing-aided engineering strategy to fine-tune the specificity of an angiopoietin-2 (Ang2)/vascular endothelial growth factor (VEGF) dual action Fab, 5A12.1, for the treatment of age-related macular degeneration. This antibody utilizes overlapping complementarity-determining region (CDR) sites for dual Ang2/VEGF interaction with K_D in the sub-nanomolar range. However, it also exhibits significant (K_D of 4 nM) binding to angiopoietin-1, which has high sequence identity with Ang2. We generated a large phage-displayed library of 5A12.1 Fab variants with all possible single mutations in the 6 CDRs. By tracking the change of prevalence of each mutation during various selection conditions, we identified 35 mutations predicted to decrease the affinity for Ang1 while maintaining the affinity for Ang2 and VEGF. We confirmed the specificity profiles for 25 of these single mutations as Fab protein. Structural analysis showed that some of the Fab mutations cluster near a potential Ang1/2 epitope residue that differs in the 2 proteins, while others are up to 15 Å away from the antigen-binding site and likely influence the binding interaction remotely. The approach presented here provides a robust and efficient method for specificity engineering that does not require prior knowledge of the antigen-antibody interaction and can be broadly applied to antibody specificity engineering projects.
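
    The idea of "tracking the change of prevalence of each mutation" can be illustrated with a log2 enrichment score, a common way to summarize deep mutational scanning data. The abstract does not state which statistic the authors used, so the following is only a minimal sketch: the function names, the pseudocount, the thresholds, and all read counts are hypothetical.

```python
import math

def log2_enrichment(count_before, total_before, count_after, total_after,
                    pseudocount=0.5):
    """Log2 ratio of a variant's frequency after selection to before selection.

    The pseudocount (an assumption, not from the abstract) avoids division by
    zero when a variant disappears from a selected pool.
    """
    freq_before = (count_before + pseudocount) / (total_before + pseudocount)
    freq_after = (count_after + pseudocount) / (total_after + pseudocount)
    return math.log2(freq_after / freq_before)

def keeps_targets_loses_offtarget(target_scores, offtarget_score,
                                  tolerance=0.5, depletion=-1.0):
    """True if every on-target score stays near zero and the off-target
    score shows clear depletion. Both thresholds are illustrative."""
    return (all(score >= -tolerance for score in target_scores)
            and offtarget_score <= depletion)

# Hypothetical read counts for one single-site variant (100,000 reads per pool).
ang2 = log2_enrichment(200, 100_000, 210, 100_000)   # selection on Ang2
vegf = log2_enrichment(200, 100_000, 190, 100_000)   # selection on VEGF
ang1 = log2_enrichment(200, 100_000, 40, 100_000)    # selection on Ang1

print(f"Ang2 {ang2:+.2f}, VEGF {vegf:+.2f}, Ang1 {ang1:+.2f}")
print("candidate:", keeps_targets_loses_offtarget([ang2, vegf], ang1))
```

    A variant flagged this way would still need confirmation as purified Fab protein, as the abstract describes for 25 of the 35 predicted mutations.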

  8. Treatment of mature landfill leachate by internal micro-electrolysis integrated with coagulation: a comparative study on a novel sequencing batch reactor based on zero valent iron.

    PubMed

    Ying, Diwen; Peng, Juan; Xu, Xinyan; Li, Kan; Wang, Yalin; Jia, Jinping

    2012-08-30

    A comparative study of treating mature landfill leachate with various treatment processes was conducted to investigate whether the method of combined processes of internal micro-electrolysis (IME) without aeration and IME with full aeration in one reactor was an efficient treatment for mature landfill leachate. A specifically designed novel sequencing batch internal micro-electrolysis reactor (SIME) with the latest automation technology was employed in the experiment. Experimental data showed that combined processes obtained a high COD removal efficiency of 73.7 ± 1.3%, which was 15.2% and 24.8% higher than that of the IME with and without aeration, respectively. The SIME reactor also exhibited a COD removal efficiency of 86.1 ± 3.8% for mature landfill leachate in continuous operation, which is much higher (p<0.05) than that of conventional treatments of electrolysis (22.8-47.0%), coagulation-sedimentation (18.5-22.2%), and the Fenton process (19.9-40.2%), respectively. The innovative concept behind this excellent performance is a combination effect of reductive and oxidative processes of the IME, and the integration of electro-coagulation. Operating parameters, including the initial pH, Fe/C mass ratio, air flow rate, and addition of H2O2, were optimized. All results show that the SIME reactor is a promising and efficient technology in treating mature landfill leachate. Copyright © 2012 Elsevier B.V. All rights reserved.

  9. An efficient and high fidelity method for amplification, cloning and sequencing of complete tospovirus genomic RNA segments

    USDA-ARS's Scientific Manuscript database

    Amplification and sequencing of the complete M- and S-RNA segments of Tomato spotted wilt virus and Impatiens necrotic spot virus as a single fragment is useful for whole genome sequencing of tospoviruses co-infecting a single host plant. It avoids issues associated with overlapping amplicon-based ...

  10. Properties of the recombinant TNF-binding proteins from variola, monkeypox, and cowpox viruses are different.

    PubMed

    Gileva, Irina P; Nepomnyashchikh, Tatiana S; Antonets, Denis V; Lebedev, Leonid R; Kochneva, Galina V; Grazhdantseva, Antonina V; Shchelkunov, Sergei N

    2006-11-01

    Tumor necrosis factor (TNF), a potent proinflammatory and antiviral cytokine, is a critical extracellular immune regulator targeted by poxviruses through the activity of virus-encoded family of TNF-binding proteins (CrmB, CrmC, CrmD, and CrmE). The only TNF-binding protein from variola virus (VARV), the causative agent of smallpox, infecting exclusively humans, is CrmB. Here we have aligned the amino acid sequences of CrmB proteins from 10 VARV, 14 cowpox virus (CPXV), and 22 monkeypox virus (MPXV) strains. Sequence analyses demonstrated a high homology of these proteins. The regions homologous to cd00185 domain of the TNF receptor family, determining the specificity of ligand-receptor binding, were found in the sequences of CrmB proteins. In addition, a comparative analysis of the C-terminal SECRET domain sequences of CrmB proteins was performed. The differences in the amino acid sequences of these domains characteristic of each particular orthopoxvirus species were detected. It was assumed that the species-specific distinctions between the CrmB proteins might underlie the differences in these physicochemical and biological properties. The individual recombinant proteins VARV-CrmB, MPXV-CrmB, and CPXV-CrmB were synthesized in a baculovirus expression system in insect cells and isolated. Purified VARV-CrmB was detectable as a dimer with a molecular weight of 90 kDa, while MPXV- and CPXV-CrmBs, as monomers when fractioned by non-reducing SDS-PAGE. The CrmB proteins of VARV, MPXV, and CPXV differed in the efficiencies of inhibition of the cytotoxic effects of human, mouse, or rabbit TNFs in L929 mouse fibroblast cell line. Testing of CrmBs in the experimental model of LPS-induced shock using SPF BALB/c mice detected a pronounced protective effect of VARV-CrmB. Thus, our data demonstrated the difference in anti-TNF activities of VARV-, MPXV-, and CPXV-CrmBs and efficiency of VARV-CrmB rather than CPXV- or MPXV-CrmBs against LPS-induced mortality in mice.

  11. Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data.

    PubMed

    Bhaskar, Anand; Wang, Y X Rachel; Song, Yun S

    2015-02-01

    With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions. © 2015 Bhaskar et al.; Published by Cold Spring Harbor Laboratory Press.
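
    As background for the "expected frequency spectrum under the coalescent", the simplest analytic case is a constant-size population, where the expected number of segregating sites with i derived copies in a sample of n is theta / i. The sketch below shows only that textbook baseline. It is not the authors' piecewise-exponential method, and the function names and parameter values are illustrative assumptions.

```python
def expected_sfs_constant_size(n, theta):
    """Expected site frequency spectrum for a sample of n sequences under the
    standard neutral coalescent with constant population size.

    Entry i-1 is the expected number of sites whose derived allele appears
    in exactly i of the n sampled sequences (i = 1 .. n-1).
    """
    return [theta / i for i in range(1, n)]

def normalize(sfs):
    """Convert expected counts into proportions that sum to 1."""
    total = sum(sfs)
    return [x / total for x in sfs]

# Illustrative values only: 10 sampled sequences, theta = 5.
sfs = expected_sfs_constant_size(10, 5.0)
for i, (count, share) in enumerate(zip(sfs, normalize(sfs)), start=1):
    print(f"derived allele count {i}: expected sites {count:.3f}, share {share:.3f}")
```

    Departures of an observed spectrum from this 1/i shape, such as an excess of rare variants, are the kind of signal that demographic inference methods use to detect recent population growth.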

  12. Stability of local secondary structure determines selectivity of viral RNA chaperones.

    PubMed

    Bravo, Jack P K; Borodavka, Alexander; Barth, Anders; Calabrese, Antonio N; Mojzes, Peter; Cockburn, Joseph J B; Lamb, Don C; Tuma, Roman

    2018-05-18

    To maintain genome integrity, segmented double-stranded RNA viruses of the Reoviridae family must accurately select and package a complete set of up to a dozen distinct genomic RNAs. It is thought that the high fidelity segmented genome assembly involves multiple sequence-specific RNA-RNA interactions between single-stranded RNA segment precursors. These are mediated by virus-encoded non-structural proteins with RNA chaperone-like activities, such as rotavirus (RV) NSP2 and avian reovirus σNS. Here, we compared the abilities of NSP2 and σNS to mediate sequence-specific interactions between RV genomic segment precursors. Despite their similar activities, NSP2 successfully promotes inter-segment association, while σNS fails to do so. To understand the mechanisms underlying such selectivity in promoting inter-molecular duplex formation, we compared RNA-binding and helix-unwinding activities of both proteins. We demonstrate that octameric NSP2 binds structured RNAs with high affinity, resulting in efficient intramolecular RNA helix disruption. Hexameric σNS oligomerizes into an octamer that binds two RNAs, yet it exhibits only limited RNA-unwinding activity compared to NSP2. Thus, the formation of intersegment RNA-RNA interactions is governed by both helix-unwinding capacity of the chaperones and stability of RNA structure. We propose that this protein-mediated RNA selection mechanism may underpin the high fidelity assembly of multi-segmented RNA genomes in Reoviridae.

  13. Efficient molecular screening of Lynch syndrome by specific 3' promoter methylation of the MLH1 or BRAF mutation in colorectal cancer with high-frequency microsatellite instability.

    PubMed

    Nakagawa, Hitoshi; Nagasaka, Takeshi; Cullings, Harry M; Notohara, Kenji; Hoshijima, Naoko; Young, Joanne; Lynch, Henry T; Tanaka, Noriaki; Matsubara, Nagahide

    2009-06-01

    It is sometimes difficult to diagnose Lynch syndrome by the simple but strict clinical criteria, or even by the definitive genetic testing for causative germline mutation of mismatch repair genes. Thus, a practical and efficient screening strategy to select patients who are highly likely to have Lynch syndrome is exceedingly desirable. We performed a comprehensive study to evaluate the methylation status of the whole MLH1 promoter region by direct bisulfite sequencing of the entire MLH1 promoter region in Lynch and non-Lynch colorectal cancers (CRCs). Then, we established a convenient assay to detect methylation in key CpG islands responsible for the silencing of MLH1 expression. We studied the methylation status of MLH1 as well as the CpG island methylator phenotype (CIMP) and immunohistochemical analysis of mismatch repair proteins on 16 cases of Lynch CRC and 19 cases of sporadic CRCs with high-frequency microsatellite instability (MSI-H). Sensitivity to detect Lynch syndrome by MLH1 (CCAAT) methylation was 88% and the specificity was 84%. Positive likelihood ratio (PLR) was 5.5 and negative likelihood ratio (NLR) was 0.15. Sensitivity by mutational analysis of BRAF was 100%, specificity was 84%, PLR was 6.3 and NLR was zero. By CIMP analysis, sensitivity was 88%, specificity was 79%, PLR was 4.2, and NLR was 0.16. BRAF mutation or MLH1 methylation analysis combined with MSI testing could be a good alternative to screen Lynch syndrome patients in a cost effective manner. Although the assay for CIMP status also showed acceptable sensitivity and specificity, it may not be practical because of its rather complicated assay.
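
    The likelihood ratios quoted above follow from the standard definitions PLR = sensitivity / (1 - specificity) and NLR = (1 - sensitivity) / specificity. The sketch below reproduces them. The per-assay case counts are not given in the abstract; they are assumptions back-calculated from the rounded percentages and the stated group sizes (16 Lynch and 19 sporadic MSI-H cases).

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR) for sensitivity and specificity given as fractions."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr

LYNCH_CASES = 16      # stated in the abstract
SPORADIC_CASES = 19   # stated in the abstract

# (correctly classified Lynch cases, correctly classified sporadic cases)
# These counts are assumptions inferred from the rounded percentages.
assays = {
    "MLH1 methylation": (14, 16),
    "BRAF mutation": (16, 16),
    "CIMP": (14, 15),
}

for name, (true_pos, true_neg) in assays.items():
    sens = true_pos / LYNCH_CASES
    spec = true_neg / SPORADIC_CASES
    plr, nlr = likelihood_ratios(sens, spec)
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}, "
          f"PLR {plr:.1f}, NLR {nlr:.2f}")
```

    With these assumed counts the computed ratios round to the values reported in the abstract (5.5 and 0.15, 6.3 and 0, 4.2 and 0.16).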

  14. DNA sequence requirements for the accurate transcription of a protein-coding plastid gene in a plastid in vitro system from mustard (Sinapis alba L.)

    PubMed Central

    Link, Gerhard

    1984-01-01

    A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic '−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the 'TATA' box of genes transcribed by RNA polymerase II (or B). The region between the 'TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic '−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a 'promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540

  15. Elements in the transcriptional regulatory region flanking herpes simplex virus type 1 oriS stimulate origin function.

    PubMed

    Wong, S W; Schaffer, P A

    1991-05-01

    Like other DNA-containing viruses, the three origins of herpes simplex virus type 1 (HSV-1) DNA replication are flanked by sequences containing transcriptional regulatory elements. In a transient plasmid replication assay, deletion of sequences comprising the transcriptional regulatory elements of ICP4 and ICP22/47, which flank oriS, resulted in a greater than 80-fold decrease in origin function compared with a plasmid, pOS-822, which retains these sequences. In an effort to identify specific cis-acting elements responsible for this effect, we conducted systematic deletion analysis of the flanking region with plasmid pOS-822 and tested the resulting mutant plasmids for origin function. Stimulation by cis-acting elements was shown to be both distance and orientation dependent, as changes in either parameter resulted in a decrease in oriS function. Additional evidence for the stimulatory effect of flanking sequences on origin function was demonstrated by replacement of these sequences with the cytomegalovirus immediate-early promoter, resulting in nearly wild-type levels of oriS function. In competition experiments, cotransfection of cells with the test plasmid, pOS-822, and increasing molar concentrations of a competitor plasmid which contained the ICP4 and ICP22/47 transcriptional regulatory regions but lacked core origin sequences resulted in a significant reduction in the replication efficiency of pOS-822, demonstrating that factors which bind specifically to the oriS-flanking sequences are likely involved as auxiliary proteins in oriS function. Together, these studies demonstrate that trans-acting factors and the sites to which they bind play a critical role in the efficiency of HSV-1 DNA replication from oriS in transient-replication assays.

  16. Evaluation of efficiency of nested multiplex allele-specific PCR assay for detection of multidrug resistant tuberculosis directly from sputum samples.

    PubMed

    Mistri, S K; Sultana, M; Kamal, S M M; Alam, M M; Irin, F; Nessa, J; Ahsan, C R; Yasmin, M

    2016-05-01

    For effective control of tuberculosis, rapid detection of multidrug resistant tuberculosis (MDR-TB) is necessary. Therefore, we developed a modified nested multiplex allele-specific polymerase chain reaction (MAS-PCR) method that enables rapid MDR-TB detection directly from sputum samples. The efficacy of this method was evaluated using 79 sputum samples collected from suspected tuberculosis patients. The performance of the nested MAS-PCR method was compared with other MDR-TB detection methods such as drug susceptibility testing (DST) and DNA sequencing. As rifampicin (RIF) resistance corresponds to MDR-TB in more than 90% of cases, only the presence of RIF-associated mutations in the rpoB gene was determined by DNA sequencing and nested MAS-PCR to detect MDR-TB. The concordance between nested MAS-PCR and DNA sequencing results was found to be 96·3%. When compared with DST, the sensitivity and specificity of nested MAS-PCR for RIF-resistance detection were determined to be 92·9 and 100% respectively. For developing and high-TB burden countries, molecular-based tests have been recommended by the World Health Organization for rapid detection of MDR-TB. The results of this study indicate that the nested MAS-PCR assay might be a practical and relatively cost effective molecular method for rapid detection of MDR-TB from suspected sputum samples in resource-poor settings in developing countries. © 2016 The Society for Applied Microbiology.

  17. Stumbling across the Same Phage: Comparative Genomics of Widespread Temperate Phages Infecting the Fish Pathogen Vibrio anguillarum

    PubMed Central

    Kalatzis, Panos G.; Rørbo, Nanna; Castillo, Daniel; Mauritzen, Jesper Juel; Jørgensen, Jóhanna; Kokkari, Constantina; Zhang, Faxing; Katharios, Pantelis; Middelboe, Mathias

    2017-01-01

    Nineteen Vibrio anguillarum-specific temperate bacteriophages isolated across Europe and Chile from aquaculture and environmental sites were genome sequenced and analyzed for host range, morphology and life cycle characteristics. The phages were classified as Siphoviridae with genome sizes between 46,006 and 54,201 bp. All 19 phages showed high genetic similarity, and 13 phages were genetically identical. Apart from sporadically distributed single nucleotide polymorphisms (SNPs), genetic diversifications were located in three variable regions (VR1, VR2 and VR3) in six of the phage genomes. Identification of specific genes, such as N6-adenine methyltransferase and lambda like repressor, as well as the presence of a tRNAArg, suggested a both mutualistic and parasitic interaction between phages and hosts. During short term phage exposure experiments, 28% of a V. anguillarum host population was lysogenized by the temperate phages and a genomic analysis of a collection of 31 virulent V. anguillarum showed that the isolated phages were present as prophages in >50% of the strains covering large geographical distances. Further, phage sequences were widely distributed among CRISPR-Cas arrays of publicly available sequenced Vibrios. The observed distribution of these specific temperate Vibriophages across large geographical scales may be explained by efficient dispersal of phages and bacteria in the marine environment combined with a mutualistic interaction between temperate phages and their hosts which selects for co-existence rather than arms race dynamics. PMID:28531104

  18. Process in manufacturing high efficiency AlGaAs/GaAs solar cells by MO-CVD

    NASA Technical Reports Server (NTRS)

    Yeh, Y. C. M.; Chang, K. I.; Tandon, J.

    1984-01-01

    Manufacturing technology for mass producing high efficiency GaAs solar cells is discussed. Progress in using a high throughput MO-CVD reactor to produce high efficiency GaAs solar cells is described. Thickness and doping concentration uniformity of metal organic chemical vapor deposition (MO-CVD) GaAs and AlGaAs layer growth are discussed. In addition, new tooling designs are given which increase the throughput of solar cell processing. To date, 2 cm x 2 cm AlGaAs/GaAs solar cells with efficiency up to 16.5% were produced. In order to meet throughput goals for mass producing GaAs solar cells, a large MO-CVD system (Cambridge Instrument Model MR-200) with a susceptor which was initially capable of processing 20 wafers (up to 75 mm diameter) during a single growth run was installed. In the MR-200, the sequencing of the gases and the heating power are controlled by a microprocessor-based programmable control console. Hence, operator errors can be reduced, leading to a more reproducible production sequence.

  19. Project 1: Microbial Genomes: A Genomic Approach to Understanding the Evolution of Virulence. Project 2: From Genomes to Life: Drosophilia Development in Space and Time

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robert DeSalle

    2004-09-10

    This project seeks to use the genomes of two close relatives, A. actinomycetemcomitans and H. aphrophilus, to understand the evolutionary changes that take place in a genome to make it more or less virulent. Our primary specific aim of this project was to sequence, annotate, and analyze the genomes of Actinobacillus actinomycetemcomitans (CU1000, serotype f) and Haemophilus aphrophilus. With these genome sequences we have then compared the whole genome sequences to each other and to the current Aa (HK1651 www.genome.ou.edu) genome project sequence along with other fully sequenced Pasteurellaceae to determine inter and intra species differences that may account for the differences and similarities in disease. We also propose to create and curate a comprehensive database where sequence information and analysis for the Pasteurellaceae (family that includes the genera Actinobacillus and Haemophilus) are readily accessible. And finally we have proposed to develop phylogenetic techniques that can be used to efficiently and accurately examine the evolution of genomes. Below we report on progress we have made on these major specific aims. Progress on the specific aims is reported below under two major headings: experimental approaches, and bioinformatics and systematic biology approaches.

  20. Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach

    PubMed Central

    Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.

    2007-01-01

    We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853

  1. Development of high repetition rate nitric oxide planar laser induced fluorescence imaging

    NASA Astrophysics Data System (ADS)

    Jiang, Naibo

    This thesis documents the development of a MHz repetition rate pulse burst laser system. Second harmonic and third harmonic efficiencies are improved by adding a Phase Conjugate Mirror to the system. High energy fundamental, second harmonic, and third harmonic burst sequences consisting of 1-12 pulses separated in time by between 4 and 12 microseconds are now routinely obtained. The reported burst envelopes are quite uniform. We have also demonstrated the ability to generate ultra-high frequency sequences of broadly wavelength tunable, high intensity laser pulses using a home built injection seeded Optical Parametric Oscillator (OPO), pumped by the second and third harmonic output of the pulse burst laser. Typical OPO output burst sequences consist of 6-10 pulses, separated in time by between 6 and 10 microseconds. With third harmonic pumping of the OPO system, we studied four configurations: a two-crystal Singly Resonant OPO (SRO) cavity, a three-crystal OPO cavity, a single pass two-crystal Doubly Resonant OPO (DRO) cavity, and a double pass two-crystal OPO cavity. The double pass two-crystal OPO cavity gives the best operation in burst mode. For single pass OPO, the average total OPO conversion efficiency is approximately 25%. For double pass OPO, the average total OPO conversion efficiency is approximately 35%. As preliminary work, we studied 532 nm pumping of a single crystal OPO cavity. With single pulse pumping, the conversion efficiency can reach 30%. For both 355 nm and 532 nm pumping of the OPO, we have demonstrated injection seeding. The OPO output linewidth is significantly narrowed. Some preliminary etalon traces are also reported. By mixing the OPO signal output at 622 nm with residual third harmonic at 355 nm, we obtained 226 nm burst sequences with average pulse energy of ~0.2 mJ. Injection seeding of the OPO increases the energy achieved by a factor of ~2. 226 nm burst sequences with reasonably uniform burst envelopes are reported.
Using the system, we have obtained, for the first time by any known optical method, Planar Laser Induced Fluorescence (PLIF) image sequences at ultrahigh (≥100 kHz) frame rates; in particular, NO PLIF image sequences have been obtained in a Mach 2 jet. We also studied the possibility of utilizing a 250 kHz pulsed Nd:YVO4 laser as the master oscillator. Burst sequences of 10 pulses with 10 μs spacing and a reasonably uniform burst envelope have been obtained. The total energy of the burst sequence is ~2.5 J.
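
    Two numbers in this abstract can be checked with simple relations: sum-frequency mixing conserves photon energy, so 1/λ_out = 1/λ_1 + 1/λ_2, and a pulse spacing of Δt corresponds to a repetition rate of 1/Δt. The sketch below is only a worked check of the quoted wavelengths and spacings; the function names are illustrative.

```python
def sum_frequency_wavelength_nm(lambda1_nm, lambda2_nm):
    """Output wavelength of sum-frequency mixing, from 1/l_out = 1/l_1 + 1/l_2."""
    return 1.0 / (1.0 / lambda1_nm + 1.0 / lambda2_nm)

def repetition_rate_khz(spacing_us):
    """Repetition rate in kHz for pulses separated by spacing_us microseconds."""
    return 1000.0 / spacing_us

# OPO signal at 622 nm mixed with the residual third harmonic at 355 nm.
print(f"mixing output: {sum_frequency_wavelength_nm(622.0, 355.0):.1f} nm")

# Pulse spacings quoted in the abstract.
for spacing in (4.0, 10.0, 12.0):
    print(f"{spacing:g} us spacing -> {repetition_rate_khz(spacing):.1f} kHz")
```

    The mixing result of about 226 nm matches the wavelength reported for the NO PLIF excitation, and a 10 microsecond spacing corresponds to the 100 kHz lower bound quoted for the frame rate.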

  2. Isolation of two new retrotransposon sequences and development of molecular and cytological markers for Dasypyrum villosum (L.).

    PubMed

    Zhang, Jie; Jiang, Yun; Xuan, Pu; Guo, Yuanlin; Deng, Guangbing; Yu, Maoqun; Long, Hai

    2017-10-01

    Dasypyrum villosum is a valuable genetic resource for wheat improvement. With the aim to efficiently monitor the D. villosum chromatin introduced into common wheat, two novel retrotransposon sequences were isolated by RAPD, and were successfully converted to D. villosum-specific SCAR markers. In addition, we constructed a chromosomal karyotype of D. villosum. Our results revealed that different accessions of D. villosum showed slightly different signal patterns, indicating that distribution of repeats did not diverge significantly among D. villosum accessions. The two SCAR markers and FISH karyotype of D. villosum could be used for efficient and precise identification of D. villosum chromatin in wheat breeding.

  3. The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

    PubMed

    Khoe, Clairine V; Chung, Long H; Murray, Vincent

    2018-06-01

    The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
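
    The consensus tetranucleotides above use the IUPAC code Y for a pyrimidine (T or C). A minimal sketch of how such a consensus could be located in a sequence is given below; the asterisk marking the damage position is dropped, overlapping matches are reported, and the helper names are illustrative rather than taken from the study.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",    # pyrimidine
    "R": "[AG]",    # purine
    "N": "[ACGT]",  # any base
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(sequence, consensus):
    """Return (0-based start, matched text) for every match, including
    overlapping ones, by wrapping the pattern in a lookahead."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# The two highest-intensity sites reported in the abstract.
print(find_sites("GTTCCC", "YTCY"))   # 6-4PP consensus 5'-YTC*Y
print(find_sites("TTTTCG", "YTTC"))   # CPD consensus 5'-YTT*C
```

    Each reported hexanucleotide contains exactly one match to its consensus, starting at the second base.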

  4. Using high-throughput barcode sequencing to efficiently map connectomes.

    PubMed

    Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M

    2017-07-07

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision (a 'connectome') is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence (an RNA 'barcode'), which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Buffer layer enhanced stability of sodium-ion storage

    NASA Astrophysics Data System (ADS)

    Wang, Xusheng; Yang, Zhanhai; Wang, Chao; Chen, Dong; Li, Rui; Zhang, Xinxiang; Chen, Jitao; Xue, Mianqi

    2017-11-01

    Se-Se buffer layers are introduced into tin sequences as SnSe2 single crystal to enhance the cycling stability for long-term sodium-ion storage by blazing a trail of self-defence strategy to structural pulverization especially at high current density. Specifically, under half-cell test, the SnSe2 electrodes could yield a high discharge capacity of 345 mAh g⁻¹ after 300 cycles at 1 A g⁻¹ and a high discharge capacity of 300 mAh g⁻¹ after 2100 cycles at 5 A g⁻¹ with stable coulombic efficiency and no capacity fading. Even with the ultrafast sodium-ion storage at 10 A g⁻¹, the cycling stability still makes a positive response and a high discharge capacity of 221 mAh g⁻¹ is demonstrated after 2700 cycles without capacity fading. The full-cell test for the SnSe2 electrodes also demonstrates the superior cycling stability. The flexible and tough Se-Se buffer layers are favourable to accommodate the sodium-ion intercalation process, and the autogenous Na2Se layers could confine the structural pulverization of further sodiated tin sequences by the slip along the Na2Se-NaxSn interfaces.

  6. Arrays of probes for positional sequencing by hybridization

    DOEpatents

    Cantor, Charles R [Boston, MA]; Prezetakiewiczr, Marek [East Boston, MA]; Smith, Cassandra L [Boston, MA]; Sano, Takeshi [Waltham, MA]

    2008-01-15

    This invention is directed to methods and reagents useful for sequencing nucleic acid targets utilizing sequencing by hybridization technology comprising probes, arrays of probes and methods whereby sequence information is obtained rapidly and efficiently in discrete packages. That information can be used for the detection, identification, purification and complete or partial sequencing of a particular target nucleic acid. When coupled with a ligation step, these methods can be performed under a single set of hybridization conditions. The invention also relates to the replication of probe arrays and methods for making and replicating arrays of probes which are useful for the large scale manufacture of diagnostic aids used to screen biological samples for specific target sequences. Arrays created using PCR technology may comprise probes with 5'- and/or 3'-overhangs.

  7. Development of a genotyping microarray for Usher syndrome.

    PubMed

    Cremers, Frans P M; Kimberling, William J; Külm, Maigi; de Brouwer, Arjan P; van Wijk, Erwin; te Brinke, Heleen; Cremers, Cor W R J; Hoefsloot, Lies H; Banfi, Sandro; Simonelli, Francesca; Fleischhauer, Johannes C; Berger, Wolfgang; Kelley, Phil M; Haralambous, Elene; Bitner-Glindzicz, Maria; Webster, Andrew R; Saihan, Zubin; De Baere, Elfride; Leroy, Bart P; Silvestri, Giuliana; McKay, Gareth J; Koenekoop, Robert K; Millan, Jose M; Rosenberg, Thomas; Joensuu, Tarja; Sankila, Eeva-Marja; Weil, Dominique; Weston, Mike D; Wissinger, Bernd; Kremer, Hannie

    2007-02-01

    Usher syndrome, a combination of retinitis pigmentosa (RP) and sensorineural hearing loss with or without vestibular dysfunction, displays a high degree of clinical and genetic heterogeneity. Three clinical subtypes can be distinguished, based on the age of onset and severity of the hearing impairment, and the presence or absence of vestibular abnormalities. Thus far, eight genes have been implicated in the syndrome, together comprising 347 protein-coding exons. To improve DNA diagnostics for patients with Usher syndrome, we developed a genotyping microarray based on the arrayed primer extension (APEX) method. Allele-specific oligonucleotides corresponding to all 298 Usher syndrome-associated sequence variants known to date, 76 of which are novel, were arrayed. Approximately half of these variants were validated using original patient DNAs, which yielded an accuracy of >98%. The efficiency of the Usher genotyping microarray was tested using DNAs from 370 unrelated European and American patients with Usher syndrome. Sequence variants were identified in 64/140 (46%) patients with Usher syndrome type I, 45/189 (24%) patients with Usher syndrome type II, 6/21 (29%) patients with Usher syndrome type III and 6/20 (30%) patients with atypical Usher syndrome. The chip also identified two novel sequence variants, c.400C>T (p.R134X) in PCDH15 and c.1606T>C (p.C536S) in USH2A. The Usher genotyping microarray is a versatile and affordable screening tool for Usher syndrome. Its efficiency will improve with the addition of novel sequence variants with minimal extra costs, making it a very useful first-pass screening tool.
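
    The per-subtype detection rates quoted in this abstract can be re-derived from the reported patient counts. The following is a minimal sketch in Python; the names `counts` and `detection_rates` are illustrative and do not come from the study.

```python
# Counts as reported in the abstract:
# subtype -> (patients with a variant identified, patients tested)
counts = {
    "type I": (64, 140),
    "type II": (45, 189),
    "type III": (6, 21),
    "atypical": (6, 20),
}

def detection_rates(table):
    """Return the detection rate per subtype, rounded to a whole percent."""
    return {name: round(100 * found / tested)
            for name, (found, tested) in table.items()}

rates = detection_rates(counts)
# The four subgroup sizes should add up to the full cohort.
total_tested = sum(tested for _, tested in counts.values())
print(rates, total_tested)
```

    The subgroup sizes sum to the 370 patients stated for the full cohort, which is a quick consistency check on the reported figures.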

  8. Development of a genotyping microarray for Usher syndrome

    PubMed Central

    Cremers, Frans P M; Kimberling, William J; Külm, Maigi; de Brouwer, Arjan P; van Wijk, Erwin; te Brinke, Heleen; Cremers, Cor W R J; Hoefsloot, Lies H; Banfi, Sandro; Simonelli, Francesca; Fleischhauer, Johannes C; Berger, Wolfgang; Kelley, Phil M; Haralambous, Elene; Bitner‐Glindzicz, Maria; Webster, Andrew R; Saihan, Zubin; De Baere, Elfride; Leroy, Bart P; Silvestri, Giuliana; McKay, Gareth J; Koenekoop, Robert K; Millan, Jose M; Rosenberg, Thomas; Joensuu, Tarja; Sankila, Eeva‐Marja; Weil, Dominique; Weston, Mike D; Wissinger, Bernd; Kremer, Hannie

    2007-01-01

    Background: Usher syndrome, a combination of retinitis pigmentosa (RP) and sensorineural hearing loss with or without vestibular dysfunction, displays a high degree of clinical and genetic heterogeneity. Three clinical subtypes can be distinguished, based on the age of onset and severity of the hearing impairment, and the presence or absence of vestibular abnormalities. Thus far, eight genes have been implicated in the syndrome, together comprising 347 protein‐coding exons. Methods: To improve DNA diagnostics for patients with Usher syndrome, we developed a genotyping microarray based on the arrayed primer extension (APEX) method. Allele‐specific oligonucleotides corresponding to all 298 Usher syndrome‐associated sequence variants known to date, 76 of which are novel, were arrayed. Results: Approximately half of these variants were validated using original patient DNAs, which yielded an accuracy of >98%. The efficiency of the Usher genotyping microarray was tested using DNAs from 370 unrelated European and American patients with Usher syndrome. Sequence variants were identified in 64/140 (46%) patients with Usher syndrome type I, 45/189 (24%) patients with Usher syndrome type II, 6/21 (29%) patients with Usher syndrome type III and 6/20 (30%) patients with atypical Usher syndrome. The chip also identified two novel sequence variants, c.400C>T (p.R134X) in PCDH15 and c.1606T>C (p.C536S) in USH2A. Conclusion: The Usher genotyping microarray is a versatile and affordable screening tool for Usher syndrome. Its efficiency will improve with the addition of novel sequence variants with minimal extra costs, making it a very useful first‐pass screening tool. PMID:16963483

  9. High-throughput sequencing of TCR repertoires in multiple sclerosis reveals intrathecal enrichment of EBV-reactive CD8+ T cells.

    PubMed

    Lossius, Andreas; Johansen, Jorunn N; Vartdal, Frode; Robins, Harlan; Jūratė Šaltytė, Benth; Holmøy, Trygve; Olweus, Johanna

    2014-11-01

    Epstein-Barr virus (EBV) has long been suggested as a pathogen in multiple sclerosis (MS). Here, we used high-throughput sequencing to determine the diversity, compartmentalization, persistence, and EBV-reactivity of the T-cell receptor (TCR) repertoires in MS. TCR-β genes were sequenced in paired samples of cerebrospinal fluid (CSF) and blood from patients with MS and controls with other inflammatory neurological diseases. The TCR repertoires were highly diverse in both compartments and patient groups. Expanded T-cell clones, represented by TCR-β sequences >0.1%, were of different identity in CSF and blood of MS patients, and persisted for more than a year. Reference TCR-β libraries generated from peripheral blood T cells reactive against autologous EBV-transformed B cells were highly enriched for public EBV-specific sequences and were used to quantify EBV-reactive TCR-β sequences in CSF. TCR-β sequences of EBV-reactive CD8+ T cells, including several public EBV-specific sequences, were intrathecally enriched in MS patients only, whereas those of EBV-reactive CD4+ T cells were also enriched in CSF of controls. These data provide evidence for a clonally diverse, yet compartmentalized and persistent, intrathecal T-cell response in MS. The presented strategy links TCR sequence to intrathecal T-cell specificity, demonstrating enrichment of EBV-reactive CD8+ T cells in MS. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
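
    The abstract defines expanded clones as TCR-β sequences that make up more than 0.1% of a repertoire. A minimal sketch of that thresholding step in Python follows; the function name, the toy read list and the placeholder clone labels are all hypothetical and only illustrate the frequency cut-off, not the authors' pipeline.

```python
from collections import Counter

def expanded_clones(reads, threshold=0.001):
    """Return {sequence: frequency} for sequences above the frequency threshold.

    `reads` is a list with one entry per sequenced TCR-beta read;
    the default threshold of 0.001 corresponds to 0.1% of all reads.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items() if n / total > threshold}

# Toy repertoire of 2000 reads with made-up labels standing in for sequences:
# two clones seen several times plus 1992 singletons.
reads = ["CLONE_A"] * 5 + ["CLONE_B"] * 3 + [f"SINGLETON_{i}" for i in range(1992)]
expanded = expanded_clones(reads)
print(sorted(expanded))
```

    In this toy input each singleton has a frequency of 0.05%, so only the two repeated labels pass the 0.1% cut-off.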

  10. The punctum fixum-punctum mobile model: a neuromuscular principle for efficient movement generation?

    PubMed

    von Laßberg, Christoph; Rapp, Walter

    2015-01-01

    According to the "punctum fixum-punctum mobile model" that was introduced in prior studies, for generation of the most effective intentional acceleration of a body part the intersegmental neuromuscular onset succession has to spread successively from the rotation axis (punctum fixum) toward the body part that shall be accelerated (punctum mobile). The aim of the present study was to investigate whether this principle is, indeed, fundamental for any kind of efficient rotational accelerations in general, independent of the kind of movements, type of rotational axis, the current body position, or movement direction. Neuromuscular onset succession was captured by surface electromyography of relevant muscles of the anterior and posterior muscle chain in 16 high-level gymnasts during intentional accelerating movement phases while performing 18 different gymnastics elements (in various body positions to forward and backward, performed on high bar, parallel bars, rings and trampoline), as well as during non-sport specific pivot movements around the longitudinal axis. The succession patterns to generate the acceleration phases during these movements were described and statistically evaluated based on the onset time difference between the muscles of the corresponding muscle chain. In all the analyzed movement phases, the results clearly support the hypothesized succession pattern from punctum fixum to punctum mobile. This principle was further underlined by the finding that the succession patterns do change their direction running through the body when the rotational axis (punctum fixum) has been changed (e.g., high bar or rings [hands] vs. floor or trampoline [feet]). The findings improve our understanding of intersegmental neuromuscular coordination patterns to generate intentional movements most efficiently. This could help to develop more specific methods to facilitate such patterns in particular contexts, thus allowing for shorter motor learning procedures of context-specific key movement sequences in different disciplines of sports, as well as during non-sport specific movements.

  11. The Punctum Fixum-Punctum Mobile Model: A Neuromuscular Principle for Efficient Movement Generation?

    PubMed Central

    von Laßberg, Christoph; Rapp, Walter

    2015-01-01

    According to the “punctum fixum–punctum mobile model” that was introduced in prior studies, for generation of the most effective intentional acceleration of a body part the intersegmental neuromuscular onset succession has to spread successively from the rotation axis (punctum fixum) toward the body part that shall be accelerated (punctum mobile). The aim of the present study was to investigate whether this principle is, indeed, fundamental for any kind of efficient rotational accelerations in general, independent of the kind of movements, type of rotational axis, the current body position, or movement direction. Neuromuscular onset succession was captured by surface electromyography of relevant muscles of the anterior and posterior muscle chain in 16 high-level gymnasts during intentional accelerating movement phases while performing 18 different gymnastics elements (in various body positions to forward and backward, performed on high bar, parallel bars, rings and trampoline), as well as during non-sport specific pivot movements around the longitudinal axis. The succession patterns to generate the acceleration phases during these movements were described and statistically evaluated based on the onset time difference between the muscles of the corresponding muscle chain. In all the analyzed movement phases, the results clearly support the hypothesized succession pattern from punctum fixum to punctum mobile. This principle was further underlined by the finding that the succession patterns do change their direction running through the body when the rotational axis (punctum fixum) has been changed (e.g., high bar or rings [hands] vs. floor or trampoline [feet]). The findings improve our understanding of intersegmental neuromuscular coordination patterns to generate intentional movements most efficiently. This could help to develop more specific methods to facilitate such patterns in particular contexts, thus allowing for shorter motor learning procedures of context-specific key movement sequences in different disciplines of sports, as well as during non-sport specific movements. PMID:25822498

  12. Herpes simplex virus DNA packaging sequences adopt novel structures that are specifically recognized by a component of the cleavage and packaging machinery.

    PubMed

    Adelman, K; Salmon, B; Baines, J D

    2001-03-13

    The product of the herpes simplex virus type 1 U(L)28 gene is essential for cleavage of concatemeric viral DNA into genome-length units and packaging of this DNA into viral procapsids. To address the role of U(L)28 in this process, purified U(L)28 protein was assayed for the ability to recognize conserved herpesvirus DNA packaging sequences. We report that DNA fragments containing the pac1 DNA packaging motif can be induced by heat treatment to adopt novel DNA conformations that migrate faster than the corresponding duplex in nondenaturing gels. Surprisingly, these novel DNA structures are high-affinity substrates for U(L)28 protein binding, whereas double-stranded DNA of identical sequence composition is not recognized by U(L)28 protein. We demonstrate that only one strand of the pac1 motif is responsible for the formation of novel DNA structures that are bound tightly and specifically by U(L)28 protein. To determine the relevance of the observed U(L)28 protein-pac1 interaction to the cleavage and packaging process, we have analyzed the binding affinity of U(L)28 protein for pac1 mutants previously shown to be deficient in cleavage and packaging in vivo. Each of the pac1 mutants exhibited a decrease in DNA binding by U(L)28 protein that correlated directly with the reported reduction in cleavage and packaging efficiency, thereby supporting a role for the U(L)28 protein-pac1 interaction in vivo. These data therefore suggest that the formation of novel DNA structures by the pac1 motif confers added specificity on recognition of DNA packaging sequences by the U(L)28-encoded component of the herpesvirus cleavage and packaging machinery.

  13. Lack of Heterologous Cross-reactivity toward HLA-A*02:01 Restricted Viral Epitopes Is Underpinned by Distinct αβT Cell Receptor Signatures.

    PubMed

    Grant, Emma J; Josephs, Tracy M; Valkenburg, Sophie A; Wooldridge, Linda; Hellard, Margaret; Rossjohn, Jamie; Bharadwaj, Mandvi; Kedzierska, Katherine; Gras, Stephanie

    2016-11-18

    αβT cell receptor (TCR) genetic diversity is outnumbered by the quantity of pathogenic epitopes to be recognized. To provide efficient protective anti-viral immunity, a single TCR ideally needs to cross-react with a multitude of pathogenic epitopes. However, the frequency, extent, and mechanisms of TCR cross-reactivity remain unclear, with conflicting results on anti-viral T cell cross-reactivity observed in humans. Namely, both the presence and lack of T cell cross-reactivity have been reported with HLA-A*02:01-restricted epitopes from the Epstein-Barr and influenza viruses (BMLF-1 and M1₅₈, respectively) or with the hepatitis C and influenza viruses (NS3₁₀₇₃ and NA₂₃₁, respectively). Given the high sequence similarity of these paired viral epitopes (56 and 88%, respectively), the ubiquitous nature of the three viruses, and the high frequency of the HLA-A*02:01 allele, we selected these epitopes to establish the extent of T cell cross-reactivity. We combined ex vivo and in vitro functional assays, single-cell αβTCR repertoire sequencing, and structural analysis of these four epitopes in complex with HLA-A*02:01 to determine whether they could lead to heterologous T cell cross-reactivity. Our data show that sequence similarity does not translate to structural mimicry of the paired epitopes in complexes with HLA-A*02:01, resulting in induction of distinct αβTCR repertoires. The differences in epitope architecture might be an obstacle for TCR recognition, explaining the lack of T cell cross-reactivity observed. In conclusion, sequence similarity does not necessarily result in structural mimicry, and despite the need for cross-reactivity, antigen-specific TCR repertoires can remain highly specific. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  14. Development and application of a PCR assay to detect chicken and turkey parvoviruses in commercial poultry flocks in the United States.

    USDA-ARS's Scientific Manuscript database

    Comparative sequence analysis of six independent chicken and turkey parvovirus nonstructural (NS) genes revealed specific genomic regions with 100% nucleotide sequence identity. A PCR assay with primers targeting these conserved genome sequences proved to be highly specific and sensitive to detect p...

  15. Rapid bursts and slow declines: on the possible evolutionary trajectories of enzymes.

    PubMed

    Newton, Matilda S; Arcus, Vickery L; Patrick, Wayne M

    2015-06-06

    The evolution of enzymes is often viewed as following a smooth and steady trajectory, from barely functional primordial catalysts to the highly active and specific enzymes that we observe today. In this review, we summarize experimental data that suggest a different reality. Modern examples, such as the emergence of enzymes that hydrolyse human-made pesticides, demonstrate that evolution can be extraordinarily rapid. Experiments to infer and resurrect ancient sequences suggest that some of the first organisms present on the Earth are likely to have possessed highly active enzymes. Reconciling these observations, we argue that rapid bursts of strong selection for increased catalytic efficiency are interspersed with much longer periods in which the catalytic power of an enzyme erodes, through neutral drift and selection for other properties such as cellular energy efficiency or regulation. Thus, many enzymes may have already passed their catalytic peaks. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  16. Digestion-ligation-only Hi-C is an efficient and cost-effective method for chromosome conformation capture.

    PubMed

    Lin, Da; Hong, Ping; Zhang, Siheng; Xu, Weize; Jamal, Muhammad; Yan, Keji; Lei, Yingying; Li, Liang; Ruan, Yijun; Fu, Zhen F; Li, Guoliang; Cao, Gang

    2018-05-01

    Chromosome conformation capture (3C) technologies can be used to investigate 3D genomic structures. However, high background noise, high costs, and a lack of straightforward noise evaluation in current methods impede the advancement of 3D genomic research. Here we developed a simple digestion-ligation-only Hi-C (DLO Hi-C) technology to explore the 3D landscape of the genome. This method requires only two rounds of digestion and ligation, without the need for biotin labeling and pulldown. Non-ligated DNA was efficiently removed in a cost-effective step by purifying specific linker-ligated DNA fragments. Notably, random ligation could be quickly evaluated in an early quality-control step before sequencing. Moreover, an in situ version of DLO Hi-C using a four-cutter restriction enzyme has been developed. We applied DLO Hi-C to delineate the genomic architecture of THP-1 and K562 cells and uncovered chromosomal translocations. This technology may facilitate investigation of genomic organization, gene regulation, and (meta)genome assembly.

  17. Using DNA origami nanostructures to determine absolute cross sections for UV photon-induced DNA strand breakage.

    PubMed

    Vogel, Stefanie; Rackwitz, Jenny; Schürman, Robin; Prinz, Julia; Milosavljević, Aleksandar R; Réfrégiers, Matthieu; Giuliani, Alexandre; Bald, Ilko

    2015-11-19

    We have characterized ultraviolet (UV) photon-induced DNA strand break processes by determination of absolute cross sections for photoabsorption and for sequence-specific DNA single strand breakage induced by photons in an energy range from 6.50 to 8.94 eV. These represent the lowest-energy photons able to induce DNA strand breaks. Oligonucleotide targets are immobilized on a UV transparent substrate in controlled quantities through attachment to DNA origami templates. Photon-induced dissociation of single DNA strands is visualized and quantified using atomic force microscopy. The obtained quantum yields for strand breakage vary between 0.06 and 0.5, indicating highly efficient DNA strand breakage by UV photons, which is clearly dependent on the photon energy. Above the ionization threshold strand breakage becomes clearly the dominant form of DNA radiation damage, which is then also dependent on the nucleotide sequence.

  18. CCTop: An Intuitive, Flexible and Reliable CRISPR/Cas9 Target Prediction Tool

    PubMed Central

    del Sol Keyer, Maria; Wittbrodt, Joachim; Mateo, Juan L.

    2015-01-01

    Engineering of the CRISPR/Cas9 system has opened a plethora of new opportunities for site-directed mutagenesis and targeted genome modification. Fundamental to this is a stretch of twenty nucleotides at the 5’ end of a guide RNA that provides specificity to the bound Cas9 endonuclease. Since a sequence of twenty nucleotides can occur multiple times in a given genome and some mismatches seem to be accepted by the CRISPR/Cas9 complex, an efficient and reliable in silico selection and evaluation of the targeting site is a key prerequisite for experimental success. Here we present the CRISPR/Cas9 target online predictor (CCTop, http://crispr.cos.uni-heidelberg.de) to overcome limitations of already available tools. CCTop provides an intuitive user interface with reasonable default parameters that can easily be tuned by the user. From a given query sequence, CCTop identifies and ranks all candidate sgRNA target sites according to their off-target quality and displays full documentation. CCTop was experimentally validated for gene inactivation, non-homologous end-joining as well as homology directed repair. Thus, CCTop provides the bench biologist with a tool for the rapid and efficient identification of high quality target sites. PMID:25909470
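
    The first step described in this abstract, listing candidate target sites in a query sequence, can be illustrated with a short Python sketch. It is not CCTop's algorithm: it assumes an NGG PAM immediately 3' of a 20-nucleotide target, scans the forward strand only, and performs no off-target ranking. The function name `find_targets` is hypothetical.

```python
import re

def find_targets(seq, guide_len=20):
    """List (position, target, PAM) for every forward-strand candidate site.

    Assumption: a site is `guide_len` nucleotides followed directly by an
    NGG PAM. The reverse strand and off-target scoring are not handled here.
    """
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Toy query: 20 A's followed by the PAM "TGG" and two trailing bases.
query = "A" * 20 + "TGG" + "CC"
sites = find_targets(query)
print(sites)
```

    A real tool would additionally scan the reverse complement and score each candidate against the whole genome, which is the ranking step the abstract emphasizes.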

  19. Laser assisted microdissection, an efficient technique to understand tissue specific gene expression patterns and functional genomics in plants.

    PubMed

    Gautam, Vibhav; Sarkar, Ananda K

    2015-04-01

    Laser assisted microdissection (LAM) is an advanced technology used to perform tissue or cell-specific expression profiling of genes and proteins, owing to its ability to isolate the desired tissue or cell type from a heterogeneous population. Due to the specificity and high efficiency acquired during its pioneering use in medical science, the LAM technique has quickly been adopted for use in many areas of biological research. Today, it has become a potent tool to address a wide range of questions in diverse fields of plant biology. Beginning with comparative transcriptome analysis of different tissues such as reproductive parts, meristems, lateral organs, roots etc., LAM has also been extensively used in plant-pathogen interaction studies, proteomics, and metabolomics. In combination with next generation sequencing and proteomics analysis, LAM has opened up promising opportunities in the area of large scale functional studies in plants. Ever since the advent of this technique, significant improvements have been achieved in terms of its instrumentation and method, which have made LAM a more efficient tool applicable in wider research areas. Here, we discuss the advancement of the LAM technique with special emphasis on its methodology and highlight its scope in modern research areas of plant biology. Although we put emphasis on the use of LAM in transcriptome studies, its most common application, we also discuss its recent application and scope in proteome and metabolome studies.

  20. Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

    PubMed

    Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

    2016-07-19

    Computing alignments between two or more sequences is a common operation in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi.
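
    For readers unfamiliar with the Smith-Waterman algorithm named in this abstract, the sketch below computes a local alignment score serially in Python. It is a plain reference version, not the paper's parallel implementation; the linear gap penalty and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses the standard recurrence with a linear gap penalty and keeps only
    two rows of the dynamic-programming matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

    Each inner-loop iteration is one cell update, so aligning sequences of lengths m and n costs m × n updates; the GCUPS figure in the abstract counts billions of such cell updates per second.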

  1. SapTrap, a Toolkit for High-Throughput CRISPR/Cas9 Gene Modification in Caenorhabditis elegans.

    PubMed

    Schwartz, Matthew L; Jorgensen, Erik M

    2016-04-01

    In principle, clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 allows genetic tags to be inserted at any locus. However, throughput is limited by the laborious construction of repair templates and guide RNA constructs and by the identification of modified strains. We have developed a reagent toolkit and plasmid assembly pipeline, called "SapTrap," that streamlines the production of targeting vectors for tag insertion, as well as the selection of modified Caenorhabditis elegans strains. SapTrap is a high-efficiency modular plasmid assembly pipeline that produces single plasmid targeting vectors, each of which encodes both a guide RNA transcript and a repair template for a particular tagging event. The plasmid is generated in a single tube by cutting modular components with the restriction enzyme SapI, which are then "trapped" in a fixed order by ligation to generate the targeting vector. A library of donor plasmids supplies a variety of protein tags, a selectable marker, and regulatory sequences that allow cell-specific tagging at either the N or the C termini. All site-specific sequences, such as guide RNA targeting sequences and homology arms, are supplied as annealed synthetic oligonucleotides, eliminating the need for PCR or molecular cloning during plasmid assembly. Each tag includes an embedded Cbr-unc-119 selectable marker that is positioned to allow concurrent expression of both the tag and the marker. We demonstrate that SapTrap targeting vectors direct insertion of 3- to 4-kb tags at six different loci in 10-37% of injected animals. Thus SapTrap vectors introduce the possibility for high-throughput generation of CRISPR/Cas9 genome modifications. Copyright © 2016 by the Genetics Society of America.

  2. General approach to reversing ketol-acid reductoisomerase cofactor dependence from NADPH to NADH

    DOE PAGES

    Brinkmann-Chen, Sabine; Flock, Tilman; Cahn, Jackson K. B.; ...

    2013-06-17

    To date, efforts to switch the cofactor specificity of oxidoreductases from nicotinamide adenine dinucleotide phosphate (NADPH) to nicotinamide adenine dinucleotide (NADH) have been made on a case-by-case basis with varying degrees of success. Here we present a straightforward recipe for altering the cofactor specificity of a class of NADPH-dependent oxidoreductases, the ketol-acid reductoisomerases (KARIs). Combining previous results for an engineered NADH-dependent variant of Escherichia coli KARI with available KARI crystal structures and a comprehensive KARI-sequence alignment, we identified key cofactor specificity determinants and used this information to construct five KARIs with reversed cofactor preference. Additional directed evolution generated two enzymes having NADH-dependent catalytic efficiencies that are greater than the wild-type enzymes with NADPH. As a result, high-resolution structures of a wild-type/variant pair reveal the molecular basis of the cofactor switch.

  3. Combination of Competitive Quantitative PCR and Constant-Denaturant Capillary Electrophoresis for High-Resolution Detection and Enumeration of Microbial Cells

    PubMed Central

    Lim, Eelin L.; Tomita, Aoy V.; Thilly, William G.; Polz, Martin F.

    2001-01-01

    A novel quantitative PCR (QPCR) approach, which combines competitive PCR with constant-denaturant capillary electrophoresis (CDCE), was adapted for enumerating microbial cells in environmental samples using the marine nanoflagellate Cafeteria roenbergensis as a model organism. Competitive PCR has been used successfully for quantification of DNA in environmental samples. However, this technique is labor intensive, and its accuracy is dependent on an internal competitor, which must possess the same amplification efficiency as the target yet can be easily discriminated from the target DNA. The use of CDCE circumvented these problems, as its high resolution permitted the use of an internal competitor which differed from the target DNA fragment by a single base and thus ensured that both sequences could be amplified with equal efficiency. The sensitivity of CDCE also enabled specific and precise detection of sequences over a broad range of concentrations. The combined competitive QPCR and CDCE approach accurately enumerated C. roenbergensis cells in eutrophic, coastal seawater at abundances ranging from approximately 10 to 10⁴ cells ml⁻¹. The QPCR cell estimates were confirmed by fluorescent in situ hybridization counts, but estimates of samples with <50 cells ml⁻¹ by QPCR were less variable. This novel approach extends the usefulness of competitive QPCR by demonstrating its ability to reliably enumerate microorganisms at a range of environmentally relevant cell concentrations in complex aquatic samples. PMID:11525983
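
    The abstract stresses that the competitor must amplify with the same efficiency as the target. The Python sketch below shows why under an idealized exponential model: if both templates share one per-cycle efficiency, their ratio is preserved through amplification, so the starting target amount follows from the known competitor amount and the measured product ratio. The function names and all numbers are hypothetical and not taken from the study.

```python
def amplify(copies, efficiency, cycles):
    """Idealized PCR: each cycle multiplies the template by (1 + efficiency)."""
    return copies * (1 + efficiency) ** cycles

def estimate_target(competitor_copies, target_signal, competitor_signal):
    """Estimate starting target copies from the final product ratio.

    Valid only under the equal-efficiency assumption described above.
    """
    return competitor_copies * target_signal / competitor_signal

# Made-up example: 500 target copies spiked with 1000 competitor copies.
target_product = amplify(500, efficiency=0.9, cycles=30)
competitor_product = amplify(1000, efficiency=0.9, cycles=30)
estimate = estimate_target(1000, target_product, competitor_product)
print(round(estimate))

# If the efficiencies differ even slightly, the estimate is biased.
biased_product = amplify(500, efficiency=0.85, cycles=30)
biased_estimate = estimate_target(1000, biased_product, competitor_product)
print(round(biased_estimate))
```

    The second estimate falls well below the true 500 copies, which illustrates why a competitor differing from the target by a single base is attractive.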

  4. Toward fish and seafood traceability: anchovy species determination in fish products by molecular markers and support through a public domain database.

    PubMed

    Jérôme, Marc; Martinsohn, Jann Thorsten; Ortega, Delphine; Carreau, Philippe; Verrez-Bagnis, Véronique; Mouchel, Olivier

    2008-05-28

    Traceability in the fish food sector plays an increasingly important role for consumer protection and confidence building. This is reflected by the introduction of legislation and rules covering traceability on national and international levels. Although traceability through labeling is well established and supported by respective regulations, monitoring and enforcement of these rules are still hampered by the lack of efficient diagnostic tools. We describe protocols using a direct sequencing method based on 212-274-bp diagnostic sequences derived from species-specific mitochondrial DNA cytochrome b, 16S rRNA, and cytochrome oxidase subunit I sequences which can efficiently be applied to unambiguously determine even closely related fish species in processed food products labeled "anchovy". Traceability of anchovy-labeled products is supported by the public online database AnchovyID (http://anchovyid.jrc.ec.europa.eu), which provided data obtained during our study and tools for analytical purposes.

  5. SeqCompress: an algorithm for biological sequence compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan

    2014-10-01

    The growth of Next Generation Sequencing technologies presents significant research challenges, specifically the design of bioinformatics tools that handle massive amounts of data efficiently. Biological sequence data storage cost has become a noticeable proportion of the total cost of generation and analysis. In particular, the increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, which may go beyond the limit of storage capacity. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm, SeqCompress, that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses a statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that the proposed algorithm has better compression gain compared to other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
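
    The abstract pairs a statistical model with arithmetic coding. An arithmetic coder's output size approaches the entropy of the model that drives it, so a rough size estimate can be computed without implementing the coder. The Python sketch below does this for the simplest possible model, independent base frequencies (order 0). This is an assumption for illustration only and is not the model used by SeqCompress; the function name is hypothetical.

```python
import math
from collections import Counter

def order0_bits(seq):
    """Shannon-entropy estimate, in bits, of coding `seq` with an order-0 model.

    Each symbol costs -log2(p), where p is its empirical frequency in `seq`.
    """
    counts = Counter(seq)
    n = len(seq)
    return -sum(c * math.log2(c / n) for c in counts.values())

uniform = "ACGT" * 100          # all four bases equally frequent
skewed = "AAAAAAAC" * 50        # heavily biased composition
print(order0_bits(uniform), 2 * len(uniform))
print(order0_bits(skewed), 2 * len(skewed))
```

    With uniform base composition the estimate equals the fixed 2 bits per base baseline, so any gain has to come from skewed composition or from context modelling, which is where specialized DNA compressors differ.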

  6. Baculoviral delivery of CRISPR/Cas9 facilitates efficient genome editing in human cells

    PubMed Central

    Hindriksen, Sanne; Bramer, Arne J.; Truong, My Anh; Vromans, Martijn J. M.; Post, Jasmin B.; Verlaan-Klink, Ingrid; Snippert, Hugo J.; Lens, Susanne M. A.

    2017-01-01

    The CRISPR/Cas9 system is a highly effective tool for genome editing. Key to robust genome editing is the efficient delivery of the CRISPR/Cas9 machinery. Viral delivery systems are efficient vehicles for the transduction of foreign genes but commonly used viral vectors suffer from a limited capacity in the genetic information they can carry. Baculovirus however is capable of carrying large exogenous DNA fragments. Here we investigate the use of baculoviral vectors as a delivery vehicle for CRISPR/Cas9 based genome-editing tools. We demonstrate transduction of a panel of cell lines with Cas9 and an sgRNA sequence, which results in efficient knockout of all four targeted subunits of the chromosomal passenger complex (CPC). We further show that introduction of a homology directed repair template into the same CRISPR/Cas9 baculovirus facilitates introduction of specific point mutations and endogenous gene tags. Tagging of the CPC recruitment factor Haspin with the fluorescent reporter YFP allowed us to study its native localization as well as recruitment to the cohesin subunit Pds5B. PMID:28640891

  7. An att site-based recombination reporter system for genome engineering and synthetic DNA assembly.

    PubMed

    Bland, Michael J; Ducos-Galand, Magaly; Val, Marie-Eve; Mazel, Didier

    2017-07-14

    Direct manipulation of the genome is a widespread technique for genetic studies and synthetic biology applications. The tyrosine and serine site-specific recombination systems of bacteriophages HK022 and ΦC31 are widely used for stable directional exchange and relocation of DNA sequences, making them valuable tools in these contexts. We have developed site-specific recombination tools that allow the direct selection of recombination events by embedding the attB site from each system within the β-lactamase resistance coding sequence (bla). The HK and ΦC31 tools were developed by placing the attB sites from each system into the signal peptide cleavage site coding sequence of bla. All possible open reading frames (ORFs) were inserted and tested for recombination efficiency and bla activity. Efficient recombination was observed for all tested ORFs (3 for HK, 6 for ΦC31) as shown through a cointegrate formation assay. The bla gene with the embedded attB site was functional for eight of the nine constructs tested. The HK/ΦC31 att-bla system offers a simple way to directly select recombination events, thus enhancing the use of site-specific recombination systems for carrying out precise, large-scale DNA manipulation, and adding useful tools to the genetics toolbox. We further show the power and flexibility of bla to be used as a reporter for recombination.

  8. Distinctive and Complementary MS2 Fragmentation Characteristics for Identification of Sulfated Sialylated N-Glycopeptides by nanoLC-MS/MS Workflow

    NASA Astrophysics Data System (ADS)

    Kuo, Chu-Wei; Guu, Shih-Yun; Khoo, Kay-Hooi

    2018-04-01

    High sensitivity identification of sulfated glycans carried on specific sites of glycoproteins is an important requisite for investigation of molecular recognition events involved in diverse biological processes. However, resolving site-specific glycosylation of sulfated glycopeptides by direct LC-MS2 sequencing is technically most challenging. Other than the usual limiting factors such as lower abundance and ionization efficiency compared to analysis of non-glycosylated peptides, confident identification of sulfated glycopeptides among the more abundant non-sulfated glycopeptides requires additional considerations in the selective enrichment and detection strategies. Metal oxide has been applied to enrich phosphopeptides and sialylated glycopeptides, but its use to capture sulfated glycopeptides has not been investigated. Likewise, various complementary MS2 fragmentation modes have yet to be tested against sialylated and non-sialylated sulfoglycopeptides due to limited appropriate sample availability. In this study, we have investigated the feasibility of sequencing tryptic sulfated N-glycopeptide and its MS2 fragmentation characteristics by first optimizing the enrichment methods to allow efficient LC-MS detection and MS2 analysis by a combination of CID, HCD, ETD, and EThcD on hybrid and tribrid Orbitrap instruments. Characteristic sulfated glyco-oxonium ions and direct loss of sulfite from precursors were detected as evidence of sulfate modification. It is anticipated that the technical advances demonstrated in this study would allow a feasible extension of our sulfoglycomic analysis to sulfoglycoproteomics.

  9. Seamless Genetic Conversion of SMN2 to SMN1 via CRISPR/Cpf1 and Single-Stranded Oligodeoxynucleotides in Spinal Muscular Atrophy Patient-Specific Induced Pluripotent Stem Cells.

    PubMed

    Zhou, Miaojin; Hu, Zhiqing; Qiu, Liyan; Zhou, Tao; Feng, Mai; Hu, Qian; Zeng, Baitao; Li, Zhuo; Sun, Qianru; Wu, Yong; Liu, Xionghao; Wu, Lingqian; Liang, Desheng

    2018-05-09

    Spinal muscular atrophy (SMA) is a kind of neuromuscular disease characterized by progressive motor neuron loss in the spinal cord. It is caused by mutations in the survival motor neuron 1 (SMN1) gene. SMN1 has a paralogous gene, survival motor neuron 2 (SMN2), in humans that is present in almost all SMA patients. The generation and genetic correction of SMA patient-specific induced pluripotent stem cells (iPSCs) is a viable, autologous therapeutic strategy for the disease. Here, c-Myc-free and non-integrating iPSCs were generated from the urine cells of an SMA patient using an episomal iPSC reprogramming vector, and a unique crRNA was designed that does not have similar sequences (≤3 mismatches) anywhere in the human reference genome. In situ gene conversion of the SMN2 gene to an SMN1-like gene in SMA-iPSCs was achieved using CRISPR/Cpf1 and single-stranded oligodeoxynucleotide with a high efficiency of 4/36. Seamlessly gene-converted iPSC lines contained no exogenous sequences and retained a normal karyotype. Significantly, the SMN expression and gems localization were rescued in the gene-converted iPSCs and their derived motor neurons. This is the first report of an efficient gene conversion mediated by Cpf1 homology-directed repair in human cells and may provide a universal gene therapeutic approach for most SMA patients.

  10. Phylogeny of nodulation genes and symbiotic diversity of Acacia senegal (L.) Willd. and A. seyal (Del.) Mesorhizobium strains from different regions of Senegal.

    PubMed

    Bakhoum, Niokhor; Galiana, Antoine; Le Roux, Christine; Kane, Aboubacry; Duponnois, Robin; Ndoye, Fatou; Fall, Dioumacor; Noba, Kandioura; Sylla, Samba Ndao; Diouf, Diégane

    2015-04-01

    Acacia senegal and Acacia seyal are small, deciduous legume trees, most highly valued for nitrogen fixation and for the production of gum arabic, a commodity of international trade since ancient times. Symbiotic nitrogen fixation by legumes represents the main natural input of atmospheric N2 into ecosystems which may ultimately benefit all organisms. We analyzed the nod and nif symbiotic genes and symbiotic properties of root-nodulating bacteria isolated from A. senegal and A. seyal in Senegal. The symbiotic genes of rhizobial strains from the two Acacia species were close to those of Mesorhizobium plurifarium and grouped separately in the phylogenetic trees. Phylogeny of rhizobial nitrogen fixation gene nifH was similar to those of nodulation genes (nodA and nodC). All A. senegal rhizobial strains showed identical nodA, nodC, and nifH gene sequences. By contrast, A. seyal rhizobial strains exhibited different symbiotic gene sequences. Efficiency tests demonstrated that inoculation of both Acacia species significantly affected nodulation, total dry weight, acetylene reduction activity (ARA), and specific acetylene reduction activity (SARA) of plants. However, these cross-inoculation tests did not show any specificity of Mesorhizobium strains toward a given Acacia host species in terms of infectivity and efficiency as stated by principal component analysis (PCA). This study demonstrates that large-scale inoculation of A. senegal and A. seyal in the framework of reafforestation programs requires a preliminary step of rhizobial strain selection for both Acacia species.

  11. Efficient induction of CD25- iTreg by co-immunization requires strongly antigenic epitopes for T cells.

    PubMed

    Geng, Shuang; Yu, Yang; Kang, Youmin; Pavlakis, George; Jin, Huali; Li, Jinyao; Hu, Yanxin; Hu, Weibin; Wang, Shuang; Wang, Bin

    2011-05-05

    We previously showed that co-immunization with a protein antigen and a DNA vaccine coding for the same antigen induces CD40 low IL-10 high tolerogenic DCs, which in turn stimulates the expansion of antigen-specific CD4+CD25-Foxp3+ regulatory T cells (CD25- iTreg). However, it was unclear how to choose the antigen sequence to maximize tolerogenic antigen presentation and, consequently, CD25- iTreg induction. In the present study, we demonstrated the requirement of highly antigenic epitopes for CD25- iTreg induction. Firstly, we showed that the induction of CD25- iTreg by tolerogenic DC can be blocked by anti-MHC-II antibody. Next, both the number and the suppressive activity of CD25- iTreg correlated positively with the overt antigenicity of an epitope to activate T cells. Finally, in a mouse model of dermatitis, highly antigenic epitopes derived from a flea allergen not only induced more CD25- iTreg, but also more effectively prevented allergenic reaction to the allergen than did weakly antigenic epitopes. Our data thus indicate that efficient induction of CD25- iTreg requires highly antigenic peptide epitopes. This finding suggests that highly antigenic epitopes should be used for efficient induction of CD25- iTreg for clinical applications such as flea allergic dermatitis.

  12. Optimizing high performance computing workflow for protein functional annotation.

    PubMed

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-09-10

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.

  13. Optimizing high performance computing workflow for protein functional annotation

    PubMed Central

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-01-01

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296

  14. Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

    NASA Astrophysics Data System (ADS)

    Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

    2017-07-01

    The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.

  15. The study of human Y chromosome variation through ancient DNA.

    PubMed

    Kivisild, Toomas

    2017-05-01

    High throughput sequencing methods have completely transformed the study of human Y chromosome variation by offering a genome-scale view on genetic variation retrieved from ancient human remains in the context of a growing number of high coverage whole Y chromosome sequence data from living populations from across the world. The ancient Y chromosome sequences are providing us with the first exciting glimpses into the past variation of the male-specific compartment of the genome and the opportunity to evaluate models based on previously made inferences from patterns of genetic variation in living populations. Analyses of the ancient Y chromosome sequences are challenging not only because of issues generally related to ancient DNA work, such as DNA damage-induced mutations and low content of endogenous DNA in most human remains, but also because of specific properties of the Y chromosome, such as its highly repetitive nature and high homology with the X chromosome. Shotgun sequencing of uniquely mapping regions of the Y chromosomes to sufficiently high coverage is still challenging and costly in poorly preserved samples. To increase the coverage of specific target SNPs, capture-based methods have been developed and used in recent years to generate Y chromosome sequence data from hundreds of prehistoric skeletal remains. Besides the prospect of testing directly how much genetic change in a given time period has accompanied changes in material culture, the sequencing of ancient Y chromosomes also allows us to better understand the rate at which mutations accumulate and get fixed over time. This review considers genome-scale evidence on ancient Y chromosome diversity that has recently started to accumulate in geographic areas favourable to DNA preservation. More specifically, the review focuses on examples of regional continuity and change of the Y chromosome haplogroups in North Eurasia and in the New World.

  16. Selection of a DNA barcode for Nectriaceae from fungal whole-genomes.

    PubMed

    Zeng, Zhaoqing; Zhao, Peng; Luo, Jing; Zhuang, Wenying; Yu, Zhihe

    2012-01-01

    A DNA barcode is a short segment of sequence that is able to distinguish species. A barcode must ideally contain enough variation to distinguish every individual species and be easily obtained. Fungi of Nectriaceae are economically important and show high species diversity. To establish a standard DNA barcode for this group of fungi, the genomes of Neurospora crassa and 30 other filamentous fungi were compared. The expect value was treated as a criterion to recognize homologous sequences. Four candidate markers, Hsp90, AAC, CDC48, and EF3, were tested for their feasibility as barcodes in the identification of 34 well-established species belonging to 13 genera of Nectriaceae. Two hundred and fifteen sequences were analyzed. Intra- and inter-specific variations and the success rate of PCR amplification and sequencing were considered as important criteria for estimation of the candidate markers. Ultimately, the partial EF3 gene met the requirements for a good DNA barcode: No overlap was found between the intra- and inter-specific pairwise distances. The smallest inter-specific distance of EF3 gene was 3.19%, while the largest intra-specific distance was 1.79%. In addition, there was a high success rate in PCR and sequencing for this gene (96.3%). CDC48 showed sufficiently high sequence variation among species, but the PCR and sequencing success rate was 84% using a single pair of primers. Although the Hsp90 and AAC genes had higher PCR and sequencing success rates (96.3% and 97.5%, respectively), overlapping occurred between the intra- and inter-specific variations, which could lead to misidentification. Therefore, we propose the EF3 gene as a possible DNA barcode for the nectriaceous fungi.
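
    The "no overlap" criterion in the abstract above is often called a barcode gap: the largest intra-specific distance must be smaller than the smallest inter-specific distance. A minimal sketch of that check follows. The function names are hypothetical, the uncorrected p-distance is an assumption (the abstract does not say which distance measure was used), and only the two EF3 extremes (1.79% and 3.19%) come from the abstract; the other values in the usage lines are placeholders.

```python
def p_distance_percent(a: str, b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences,
    as a percentage of differing sites. Assumes equal length and
    no gap handling."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * diffs / len(a)


def barcode_gap(intra: list, inter: list) -> float:
    """Smallest inter-specific minus largest intra-specific distance.
    A positive value means the two distributions do not overlap."""
    return min(inter) - max(intra)


def has_barcode_gap(intra: list, inter: list) -> bool:
    return barcode_gap(intra, inter) > 0


if __name__ == "__main__":
    # 1.79 and 3.19 are the EF3 extremes reported in the abstract;
    # 0.0 and 5.0 are illustrative placeholders.
    intra = [0.0, 1.79]
    inter = [3.19, 5.0]
    print(has_barcode_gap(intra, inter), barcode_gap(intra, inter))
```

    Markers such as Hsp90 and AAC, where the two ranges overlap, would return a non-positive gap under this check.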

  17. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences.

    PubMed

    An, Ji-Yong; Meng, Fan-Rong; You, Zhu-Hong; Fang, Yu-Hong; Zhao, Yu-Jun; Zhang, Ming

    2016-01-01

    We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM) model and Local Phase Quantization (LPQ) to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using a Principal Component Analysis (PCA), and using a Relevance Vector Machine (RVM) based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.

  18. Signal amplification of padlock probes by rolling circle replication.

    PubMed Central

    Banér, J; Nilsson, M; Mendel-Hartvig, M; Landegren, U

    1998-01-01

    Circularizing oligonucleotide probes (padlock probes) have the potential to detect sets of gene sequences with high specificity and excellent selectivity for sequence variants, but sensitivity of detection has been limiting. By using a rolling circle replication (RCR) mechanism, circularized but not unreacted probes can yield a powerful signal amplification. We demonstrate here that in order for the reaction to proceed efficiently, the probes must be released from the topological link that forms with target molecules upon hybridization and ligation. If the target strand has a nearby free 3' end, then the probe-target hybrids can be displaced by the polymerase used for replication. The displaced probe can then slip off the target strand and a rolling circle amplification is initiated. Alternatively, the target sequence itself can prime an RCR after its non-base paired 3' end has been removed by exonucleolytic activity. We found the Phi29 DNA polymerase to be superior to the Klenow fragment in displacing the target DNA strand, and it maintained the polymerization reaction for at least 12 h, yielding an extension product that represents several thousand-fold the length of the padlock probe. PMID:9801302

  19. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM)

    PubMed Central

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-01-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325

  20. Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    PubMed Central

    Freilich, Shiri; Goldovsky, Leon; Gottlieb, Assaf; Blanc, Eric; Tsoka, Sophia; Ouzounis, Christos A

    2009-01-01

    Background Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples. PMID:19860884

  1. An Optimal Bahadur-Efficient Method in Detection of Sparse Signals with Applications to Pathway Analysis in Sequencing Association Studies.

    PubMed

    Dai, Hongying; Wu, Guodong; Wu, Michael; Zhi, Degui

    2016-01-01

    Next-generation sequencing data pose a severe curse of dimensionality, complicating traditional "single marker-single trait" analysis. We propose a two-stage combined p-value method for pathway analysis. The first stage is at the gene level, where we integrate effects within a gene using the Sequence Kernel Association Test (SKAT). The second stage is at the pathway level, where we perform a correlated Lancaster procedure to detect joint effects from multiple genes within a pathway. We show that the Lancaster procedure is optimal in Bahadur efficiency among all combined p-value methods. The Bahadur efficiency,[Formula: see text], compares sample sizes among different statistical tests when signals become sparse in sequencing data, i.e. ε →0. The optimal Bahadur efficiency ensures that the Lancaster procedure asymptotically requires a minimal sample size to detect sparse signals ([Formula: see text]). The Lancaster procedure can also be applied to meta-analysis. Extensive empirical assessments of exome sequencing data show that the proposed method outperforms Gene Set Enrichment Analysis (GSEA). We applied the competitive Lancaster procedure to meta-analysis data generated by the Global Lipids Genetics Consortium to identify pathways significantly associated with high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, triglycerides, and total cholesterol.
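
    As a rough illustration of what a combined p-value method does at the pathway level, the sketch below implements Fisher's method, which is the special case of Lancaster's procedure in which every p-value is given a weight of 2 degrees of freedom. It assumes independent p-values, so it is not the correlated Lancaster procedure described in the abstract, and the function names are hypothetical. The chi-square tail probability is computed with the closed form that holds for even degrees of freedom, so no external library is needed.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Survival function of a chi-square variable with even df:
    exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combined_pvalue(pvalues: list) -> float:
    """Combine independent gene-level p-values into one p-value.

    Statistic: -2 * sum(ln p_i), chi-square with 2k df under the null.
    Lancaster's procedure generalizes this by allowing unequal weights.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, 2 * len(pvalues))


if __name__ == "__main__":
    # Hypothetical gene-level p-values within one pathway.
    print(fisher_combined_pvalue([0.05, 0.05]))
```

    With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the implementation.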

  2. An integrated CRISPR Bombyx mori genome editing system with improved efficiency and expanded target sites.

    PubMed

    Ma, Sanyuan; Liu, Yue; Liu, Yuanyuan; Chang, Jiasong; Zhang, Tong; Wang, Xiaogang; Shi, Run; Lu, Wei; Xia, Xiaojuan; Zhao, Ping; Xia, Qingyou

    2017-04-01

    Genome editing has enabled unprecedented new opportunities for targeted genomic engineering of a wide variety of organisms, ranging from microbes, plants, and animals to even human embryos. The serial establishment and rapid application of genome editing tools have significantly accelerated Bombyx mori (B. mori) research during the past years. However, the only CRISPR system in B. mori was the commonly used SpCas9, which only recognizes target sites containing an NGG PAM sequence. In the present study, we first improved the efficiency of our previously established SpCas9 system by 3.5-fold. The improved high efficiency was also observed at several loci in both BmNs cells and B. mori embryos. Then, to expand the target sites, we showed that two newly discovered CRISPR systems, SaCas9 and AsCpf1, could also induce highly efficient site-specific genome editing in BmNs cells, and constructed an integrated CRISPR system. Genome-wide analysis of targetable sites was further conducted and showed that the integrated system covers 69,144,399 sites in the B. mori genome, and that one site could be found in every 6.5 bp. The efficiency and resolution of this CRISPR platform will probably accelerate both fundamental research and applied studies in B. mori, and perhaps other insects. Copyright © 2017 Elsevier Ltd. All rights reserved.
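
    The genome-wide count of targetable sites mentioned above comes down to scanning both strands for PAM motifs. The following is a minimal sketch for the SpCas9 NGG PAM only; it is not the authors' pipeline. The 20-nt protospacer length is an assumption (a common choice for SpCas9 that the abstract does not state), and the function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_ngg_sites(seq: str, protospacer_len: int = 20) -> int:
    """Count candidate SpCas9 sites on both strands.

    A site is a protospacer of protospacer_len bases immediately
    followed (5'->3') by an NGG PAM. The lookahead lets overlapping
    sites be counted; bases other than A/C/G/T never match.
    """
    pattern = re.compile(r"(?=[ACGT]{%d}[ACGT]GG)" % protospacer_len)
    total = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        total += sum(1 for _ in pattern.finditer(strand))
    return total


if __name__ == "__main__":
    toy = "A" * 20 + "TGG"   # one forward-strand site, none on reverse
    print(count_ngg_sites(toy))
```

    Dividing the scanned length by the resulting count gives an average spacing between sites, which is the kind of figure the abstract reports as one site per 6.5 bp for the integrated system.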

  3. Novel Peptide Sequence (“IQ-tag”) with High Affinity for NIR Fluorochromes Allows Protein and Cell Specific Labeling for In Vivo Imaging

    PubMed Central

    McCarthy, Jason R.; Weissleder, Ralph

    2007-01-01

    Background Probes that allow site-specific protein labeling have become critical tools for visualizing biological processes. Methods Here we used phage display to identify a novel peptide sequence with nanomolar affinity for near infrared (NIR) (benz)indolium fluorochromes. The developed peptide sequence (“IQ-tag”) allows detection of NIR dyes in a wide range of assays including ELISA, flow cytometry, high throughput screens, microscopy, and optical in vivo imaging. Significance The described method is expected to have broad utility in numerous applications, namely site-specific protein imaging, target identification, cell tracking, and drug development. PMID:17653285

  4. Development and characterization of high-efficiency, high-specific impulse xenon Hall thrusters

    NASA Astrophysics Data System (ADS)

    Hofer, Richard Robert

    This dissertation presents research aimed at extending the efficient operation of 1600 s specific impulse Hall thruster technology to the 2000–3000 s range. While recent studies of commercially developed Hall thrusters demonstrated greater than 4000 s specific impulse, maximum efficiency occurred at less than 3000 s. It was hypothesized that the efficiency maximum resulted as a consequence of modern magnetic field designs, optimized for 1600 s, which were unsuitable at high-specific impulse. Motivated by the industry efforts and mission studies, the aim of this research was to develop and characterize xenon Hall thrusters capable of both high-specific impulse and high-efficiency operation. The research was divided into development and characterization phases. During the development phase, the laboratory-model NASA-173M Hall thrusters were designed with plasma lens magnetic field topographies and their performance and plasma characteristics were evaluated. Experiments with the NASA-173M version 1 (v1) validated the plasma lens design by showing how changing the magnetic field topography at high-specific impulse improved efficiency. Experiments with the NASA-173M version 2 (v2) showed there was a minimum current density and optimum magnetic field topography at which efficiency monotonically increased with voltage. Between 300–1000 V, total specific impulse and total efficiency of the NASA-173Mv2 operating at 10 mg/s ranged from 1600–3400 s and 51–61%, respectively. Comparison of the thrusters showed that efficiency can be optimized for specific impulse by varying the plasma lens design. During the characterization phase, additional plasma properties of the NASA-173Mv2 were measured and a performance model was derived accounting for a multiply-charged, partially-ionized plasma. Results from the model based on experimental data showed how efficient operation at high-specific impulse was enabled through regulation of the electron current with the magnetic field. The decrease of efficiency due to multiply-charged ions was minor. Efficiency was largely determined by the current utilization, which suggested maximum Hall thruster efficiency has yet to be reached. The electron Hall parameter was approximately constant with voltage, decreasing from an average of 210 at 300 V to an average of 160 between 400–900 V, which confirmed efficient operation can be realized only over a limited range of Hall parameters.
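
    The specific impulse and efficiency figures quoted above can be related to thrust and power through the standard textbook definitions, Isp = T / (mdot * g0) and efficiency = T^2 / (2 * mdot * P). The following is a minimal sketch of that arithmetic, not the performance model derived in the dissertation. It assumes that the quoted 10 mg/s can be used as the total propellant flow and that the low and high ends of the two quoted ranges belong together; the abstract confirms neither, and all function names are illustrative.

```python
# Sketch only: textbook rocket-performance relations, NOT the dissertation's model.
# Assumption: 10 mg/s is treated as the total mass flow rate.
# Assumption: (1600 s, 51%) and (3400 s, 61%) are paired endpoints of the quoted ranges.

G0 = 9.80665  # standard gravity in m/s^2


def thrust_from_isp(isp_s: float, mdot_kg_s: float) -> float:
    """Thrust in newtons from specific impulse (s) and mass flow rate (kg/s)."""
    return isp_s * G0 * mdot_kg_s


def total_efficiency(thrust_n: float, mdot_kg_s: float, power_w: float) -> float:
    """Efficiency = jet power / input power = T^2 / (2 * mdot * P)."""
    return thrust_n ** 2 / (2.0 * mdot_kg_s * power_w)


def power_from_efficiency(thrust_n: float, mdot_kg_s: float, efficiency: float) -> float:
    """Input power in watts implied by a given thrust, flow rate and efficiency."""
    return thrust_n ** 2 / (2.0 * mdot_kg_s * efficiency)


if __name__ == "__main__":
    mdot = 10e-6  # 10 mg/s expressed in kg/s
    for isp, eta in [(1600.0, 0.51), (3400.0, 0.61)]:
        thrust = thrust_from_isp(isp, mdot)
        power = power_from_efficiency(thrust, mdot, eta)
        print(f"Isp={isp:.0f} s  thrust={thrust * 1e3:.1f} mN  implied power={power:.0f} W")
```

    Because thrust scales linearly with specific impulse at fixed flow rate while jet power scales with its square, holding efficiency roughly constant while doubling specific impulse requires about four times the input power.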

  5. Development of loop-mediated isothermal amplification (LAMP) assays for the rapid detection of allergic peanut in processed food.

    PubMed

    Sheu, Shyang-Chwen; Tsou, Po-Chuan; Lien, Yi-Yang; Lee, Meng-Shiou

    2018-08-15

    Peanut is widely used in many cuisines around the world. However, peanut is also one of the most important food allergens causing anaphylactic reactions. The best way to prevent an allergic reaction is to avoid the food allergen, or food containing an allergenic ingredient such as peanut, before consumption. Thus, efficient and precise detection of the allergenic ingredient, peanut or related products, is essential for protecting consumers' health and interests. In this study, a loop-mediated isothermal amplification (LAMP) assay was developed for the detection of allergenic peanut using specifically designed primer sets. Two sets of specific LAMP primers, targeting respectively the internal transcribed spacer 1 (ITS1) of the nuclear ribosomal DNA and the Ara h1 gene sequence of Arachis hypogaea (peanut), were used to evaluate the application of LAMP for detecting peanut in processed food. The results demonstrated that identification of peanut using the newly designed primers for the ITS1 sequence is more sensitive than using the primers for the Ara h1 gene sequence in the LAMP assay. In addition, the sensitivity of LAMP for detecting peanut is higher than that of the traditional PCR method. These LAMP primer sets showed high specificity for the identification of peanut and had no cross-reaction with other nut species, including walnut, hazelnut, almond, cashew and macadamia nut. Moreover, when as little as 0.1% peanut was mixed with other nut ingredients at different ratios, no cross-reactivity was evident in the LAMP assay. Finally, when genomic DNA extracted from boiled and steamed peanut was used as template, the detection of peanut by LAMP was unaffected and reproducible. With the LAMP assay established here, not only can peanut ingredients be detected, but commercial foods containing peanut can also be identified. This assay should be useful for the rapid detection of peanut in practical food markets. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases

    PubMed Central

    Elhai, Jeff

    2015-01-01

    The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found at high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, alternate HIPs. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences. PMID:25789551
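
    The quantities this abstract relies on (palindromic motifs, motif frequency, and the position of single-nucleotide deviations from HIP1) are easy to make concrete. The sketch below is illustrative only and is not the analysis pipeline used in the study; the function names and the toy sequences used to exercise them are hypothetical.

```python
# Illustrative sketch, not the study's pipeline. All names are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome reads the same as its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def count_overlapping(genome: str, motif: str) -> int:
    """Count occurrences of motif in genome, allowing overlaps."""
    genome, motif = genome.upper(), motif.upper()
    count = 0
    start = genome.find(motif)
    while start != -1:
        count += 1
        start = genome.find(motif, start + 1)
    return count


def one_nt_deviation_positions(genome: str, motif: str = "GCGATCGC") -> list:
    """For each motif position, count windows that differ from the motif
    at exactly that one position and match everywhere else."""
    genome, motif = genome.upper(), motif.upper()
    tallies = [0] * len(motif)
    for i in range(len(genome) - len(motif) + 1):
        window = genome[i:i + len(motif)]
        mismatches = [k for k, (a, b) in enumerate(zip(window, motif)) if a != b]
        if len(mismatches) == 1:
            tallies[mismatches[0]] += 1
    return tallies


if __name__ == "__main__":
    for motif in ("GCGATCGC", "CGATCG", "GATC"):
        print(motif, "palindromic:", is_palindromic(motif))
```

    A tally from `one_nt_deviation_positions` that is concentrated at the first and last indices would correspond to the bias described above, since those are the two positions that lie outside the internal CGATCG site.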

  7. Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases.

    PubMed

    Elhai, Jeff

    2015-03-17

    The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found at high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, alternate HIPs. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences.

  8. Malleable architecture generator for FPGA computing

    NASA Astrophysics Data System (ADS)

    Gokhale, Maya; Kaba, James; Marks, Aaron; Kim, Jang

    1996-10-01

    The malleable architecture generator (MARGE) is a tool set that translates high-level parallel C to configuration bit streams for field-programmable logic based computing systems. MARGE creates an application-specific instruction set and generates the custom hardware components required to perform exactly those computations specified by the C program. In contrast to traditional fixed-instruction processors, MARGE's dynamic instruction set creation provides for efficient use of hardware resources. MARGE processes intermediate code in which each operation is annotated by the bit lengths of the operands. Each basic block (sequence of straight line code) is mapped into a single custom instruction which contains all the operations and logic inherent in the block. A synthesis phase maps the operations comprising the instructions into register transfer level structural components and control logic which have been optimized to exploit functional parallelism and function unit reuse. As a final stage, commercial technology-specific tools are used to generate configuration bit streams for the desired target hardware. Technology-specific pre-placed, pre-routed macro blocks are utilized to implement as much of the hardware as possible. MARGE currently supports the Xilinx-based Splash-2 reconfigurable accelerator and National Semiconductor's CLAy-based parallel accelerator, MAPA. The MARGE approach has been demonstrated on systolic applications such as DNA sequence comparison.

  9. ZOOM Lite: next-generation sequencing data mapping and visualization software

    PubMed Central

    Zhang, Zefeng; Lin, Hao; Ma, Bin

    2010-01-01

    High-throughput next-generation sequencing technologies pose increasing demands on the efficiency, accuracy and usability of data analysis software. In this article, we present ZOOM Lite, a software package for efficient read mapping and result visualization. With a kernel capable of mapping tens of millions of Illumina or AB SOLiD sequencing reads efficiently and accurately, and an intuitive graphical user interface, ZOOM Lite integrates read mapping and result visualization into an easy-to-use pipeline on a desktop PC. The software handles both single-end and paired-end reads, and can output either the unique mapping result or the top N mapping results for each read. Additionally, the software takes a variety of input file formats and outputs to several commonly used result formats. The software is freely available at http://bioinfor.com/zoom/lite/. PMID:20530531

  10. High-Resolution Whole-Genome Sequencing Reveals That Specific Chromatin Domains from Most Human Chromosomes Associate with Nucleoli

    PubMed Central

    van Koningsbruggen, Silvana; Gierliński, Marek; Schofield, Pietá; Martin, David; Barton, Geoffrey J.; Ariyurek, Yavuz; den Dunnen, Johan T.

    2010-01-01

    The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope. PMID:20826608

  11. High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli.

    PubMed

    van Koningsbruggen, Silvana; Gierlinski, Marek; Schofield, Pietá; Martin, David; Barton, Geoffrey J; Ariyurek, Yavuz; den Dunnen, Johan T; Lamond, Angus I

    2010-11-01

    The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope.

  12. Rapid Analysis of Protein Farnesyltransferase Substrate Specificity Using Peptide Libraries and Isoprenoid Diphosphate Analogues

    PubMed Central

    2015-01-01

    Protein farnesyltransferase (PFTase) catalyzes the farnesylation of proteins with a carboxy-terminal tetrapeptide sequence denoted as a Ca1a2X box. To explore the specificity of this enzyme, an important therapeutic target, solid-phase peptide synthesis in concert with a peptide inversion strategy was used to prepare two libraries, each containing 380 peptides. The libraries were screened using an alkyne-containing isoprenoid analogue followed by click chemistry with biotin azide and subsequent visualization with streptavidin-AP. Screening of the CVa2X and CCa2X libraries with Rattus norvegicus PFTase revealed reaction by many known recognition sequences as well as numerous unknown ones. Some of the latter occur in the genomes of bacteria and viruses and may be important for pathogenesis, suggesting new targets for therapeutic intervention. Screening of the CVa2X library with alkyne-functionalized isoprenoid substrates showed that those prepared from C10 or C15 precursors gave similar results, whereas the analogue synthesized from a C5 unit gave a different pattern of reactivity. Lastly, the substrate specificities of PFTases from three organisms (R. norvegicus, Saccharomyces cerevisiae, and Candida albicans) were compared using CVa2X libraries. R. norvegicus PFTase was found to share more peptide substrates with S. cerevisiae PFTase than with C. albicans PFTase. In general, this method is a highly efficient strategy for rapidly probing the specificity of this important enzyme. PMID:24841702

  13. Efficient recovery of the functional IP10-scFv fusion protein from inclusion bodies with an on-column refolding system.

    PubMed

    Guo, Jun-Qing; Li, Qing-Mei; Zhou, Ji-Yong; Zhang, Gai-Ping; Yang, Yan-Yan; Xing, Guang-Xu; Zhao, Dong; You, Shang-You; Zhang, Chu-Yu

    2006-01-01

    A functional IP10-scFv fusion protein retaining the antibody specificity for acidic isoferritin (AIF) and chemokine function was produced at a high level in Escherichia coli (E. coli). The IP10-scFv gene from the recombinant plasmid pc3IP104c9 was subcloned into pET28a, fused in frame to an N-terminal His-tag sequence, and overexpressed in E. coli BL21(DE3). With an on-column refolding procedure based on Ni-chelating chromatography, the active fusion protein was recovered efficiently from inclusion bodies with a refolding yield of approximately 45%, as confirmed by spectrophotometry. The activity of refolded IP10-scFv was determined through sodium dodecyl sulfate-polyacrylamide gel electrophoresis, Western blotting and enzyme-linked immunosorbent assay. The results showed that the fusion protein retains the specific binding activity to AIF, with an affinity constant of 4.48 × 10⁻⁸ M, as well as the chemokine function of IP-10. The overall yield of IP10-scFv with bioactivity in E. coli flask culture was more than 40 mg/L.

  14. A multicolor panel of TALE-KRAB based transcriptional repressor vectors enabling knockdown of multiple gene targets

    PubMed Central

    Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu

    2014-01-01

    Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system when combined together with bone marrow transplantation could quickly knock down c-kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways. PMID:25475013

  15. A multicolor panel of TALE-KRAB based transcriptional repressor vectors enabling knockdown of multiple gene targets.

    PubMed

    Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu

    2014-12-05

    Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system when combined together with bone marrow transplantation could quickly knock down c-kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways.

  16. The CRISPR/Cas9 system produces specific and homozygous targeted gene editing in rice in one generation.

    PubMed

    Zhang, Hui; Zhang, Jinshan; Wei, Pengliang; Zhang, Botao; Gou, Feng; Feng, Zhengyan; Mao, Yanfei; Yang, Lan; Zhang, Heng; Xu, Nanfei; Zhu, Jian-Kang

    2014-08-01

    The CRISPR/Cas9 system has been demonstrated to efficiently induce targeted gene editing in a variety of organisms including plants. Recent work showed that CRISPR/Cas9-induced gene mutations in Arabidopsis were mostly somatic mutations in the early generation, although some mutations could be stably inherited in later generations. However, it remains unclear whether this system will work similarly in crops such as rice. In this study, we tested in two rice subspecies 11 target genes for their amenability to CRISPR/Cas9-induced editing and determined the patterns, specificity and heritability of the gene modifications. Analysis of the genotypes and frequency of edited genes in the first generation of transformed plants (T0) showed that the CRISPR/Cas9 system was highly efficient in rice, with target genes edited in nearly half of the transformed embryogenic cells before their first cell division. Homozygotes of edited target genes were readily found in T0 plants. The gene mutations were passed to the next generation (T1) following classic Mendelian law, without any detectable new mutation or reversion. Even with extensive searches including whole genome resequencing, we could not find any evidence of large-scale off-targeting in rice for any of the many targets tested in this study. By specifically sequencing the putative off-target sites of a large number of T0 plants, low-frequency mutations were found in only one off-target site where the sequence had 1-bp difference from the intended target. Overall, the data in this study point to the CRISPR/Cas9 system being a powerful tool in crop genome engineering. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  17. Engineered external guide sequences are highly effective in inhibiting gene expression and replication of hepatitis B virus in cultured cells.

    PubMed

    Zhang, Zhigang; Vu, Gia-Phong; Gong, Hao; Xia, Chuan; Chen, Yuan-Chuan; Liu, Fenyong; Wu, Jianguo; Lu, Sangwei

    2013-01-01

    External guide sequences (EGSs) are RNA molecules that consist of a sequence complementary to a target mRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, for specific degradation of the target mRNA. We have previously used an in vitro selection procedure to generate EGS variants that efficiently induce human RNase P to cleave a target mRNA in vitro. In this study, we constructed EGSs from a variant to target the overlapping region of the S mRNA, pre-S/L mRNA, and pregenomic RNA (pgRNA) of hepatitis B virus (HBV), which are essential for viral replication and infection. The EGS variant was about 50-fold more efficient in inducing human RNase P to cleave the mRNA in vitro than the EGS derived from a natural tRNA. Following Salmonella-mediated gene delivery, the EGSs were expressed in cultured HBV-carrying cells. A reduction of about 97% and 75% in the level of HBV RNAs and proteins and an inhibition of about 6,000- and 130-fold in the levels of capsid-associated HBV DNA were observed in cells treated with Salmonella vectors carrying the expression cassette for the variant and the tRNA-derived EGS, respectively. Our study provides direct evidence that the EGS variant is more effective in blocking HBV gene expression and DNA replication than the tRNA-derived EGS. Furthermore, these results demonstrate the feasibility of developing Salmonella-mediated gene delivery of highly active EGS RNA variants as a novel approach for gene-targeting applications such as anti-HBV therapy.

  18. Global mapping of DNA conformational flexibility on Saccharomyces cerevisiae.

    PubMed

    Menconi, Giulia; Bedini, Andrea; Barale, Roberto; Sbrana, Isabella

    2015-04-01

    In this study we provide the first comprehensive map of DNA conformational flexibility in the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication or repair. Specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome termed fragile sites, where genes map and carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to be the determinants of fragile gene proneness to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs mapping on their 3'UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different extension shape the central structure of peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Transcripts are characterized by decreased half-life: this is considered characteristic of genes involved in regulatory systems with high turnover; consistently, their function affects biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived by an expansion of the canonical TAYRTA polyadenylation efficiency element. The flexible (TA)n repeat amplification could be the outcome of an evolutionary neofunctionalization leading to a differential 3'-end processing and expression regulation in genes with particular functions. Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites.
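
    The two sequence features named in this abstract, (TA)n repeats and the TAYRTA efficiency element, can be located with simple pattern matching. The sketch below does that and nothing more: it does not implement the StabFlex flexibility calculation, the `min_units` threshold is an arbitrary illustrative choice, and the function names are hypothetical. In IUPAC notation Y stands for C or T and R for A or G.

```python
# Illustrative sketch only; this is NOT the StabFlex algorithm.
# min_units is an arbitrary threshold chosen for the example.
import re

# TAYRTA in IUPAC codes: Y = C or T, R = A or G.
# The lookahead lets overlapping occurrences be reported.
TAYRTA = re.compile(r"(?=(TA[CT][AG]TA))")


def find_ta_repeats(seq: str, min_units: int = 3) -> list:
    """Return (start, end, units) for each maximal run of at least min_units 'TA' units."""
    pattern = re.compile(r"(?:TA){%d,}" % min_units)
    return [
        (m.start(), m.end(), (m.end() - m.start()) // 2)
        for m in pattern.finditer(seq.upper())
    ]


def find_tayrta(seq: str) -> list:
    """Return start positions (0-based) of every TAYRTA match, overlaps included."""
    return [m.start() for m in TAYRTA.finditer(seq.upper())]


if __name__ == "__main__":
    example = "GGTATATATACC"  # toy sequence, not real yeast data
    print(find_ta_repeats(example))
    print(find_tayrta(example))
```

    A (TA)n run of three or more units necessarily contains TATATA, which itself matches TAYRTA, so the two searches overlap on such runs; that overlap is consistent with the abstract's suggestion that the repeats may derive from an expansion of the element.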

  19. Global Mapping of DNA Conformational Flexibility on Saccharomyces cerevisiae

    PubMed Central

    Menconi, Giulia; Bedini, Andrea; Barale, Roberto; Sbrana, Isabella

    2015-01-01

    In this study we provide the first comprehensive map of DNA conformational flexibility in the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication or repair. Specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome termed fragile sites, where genes map and carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to be the determinants of fragile gene proneness to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs mapping on their 3’UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different extension shape the central structure of peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Transcripts are characterized by decreased half-life: this is considered characteristic of genes involved in regulatory systems with high turnover; consistently, their function affects biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived by an expansion of the canonical TAYRTA polyadenylation efficiency element. The flexible (TA)n repeat amplification could be the outcome of an evolutionary neofunctionalization leading to a differential 3’-end processing and expression regulation in genes with particular functions. Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites. PMID:25860149

  20. Current siRNA Targets in Atherosclerosis and Aortic Aneurysm

    PubMed Central

    Pradhan-Nabzdyk, Leena; Huang, Chenyu; Logerfo, Frank W.; Nabzdyk, Christoph S.

    2014-01-01

    Atherosclerosis (ATH) and aortic aneurysms (AA) remain challenging chronic diseases that confer high morbidity and mortality despite advances in medical, interventional, and surgical care. RNA interference represents a promising technology that may be utilized to silence genes contributing to ATH and AA. Despite positive results in preclinical and some clinical feasibility studies, challenges such as target/sequence validation, tissue specificity, transfection efficiency, and mitigation of unwanted off-target effects remain to be addressed. This review discusses the most current targets and some novel approaches in siRNA delivery. Because of the large number of investigated targets, only studies published between 2010 and 2014 were included. PMID:24882715
