Method for rapid base sequencing in DNA and RNA with two base labeling
Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.
1995-04-11
A method is described for rapid base sequencing in DNA and RNA with two-base labeling, employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand and excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual tagged bases is detected in the order of cleavage from the strand. 4 figures.
Method for rapid base sequencing in DNA and RNA with two base labeling
Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.
1995-01-01
Method for rapid base sequencing in DNA and RNA with two-base labeling, employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand and excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual tagged bases is detected in the order of cleavage from the strand.
Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.
Tan, M K
1991-08-01
A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.
Method for rapid base sequencing in DNA and RNA
Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.
1987-10-07
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.
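As a rough illustration of the final reconstruction step, the sketch below assumes each detection event has already been reduced to an arrival time and a dye channel. The channel names and the `DYE_TO_BASE` mapping are hypothetical and are not taken from the patent.

```python
# Hypothetical mapping from dye channel to base; a real assay defines its own.
DYE_TO_BASE = {"dye_1": "A", "dye_2": "C", "dye_3": "G", "dye_4": "T"}


def reconstruct_sequence(events):
    """Rebuild the cleavage-order base sequence from (arrival_time, channel) events."""
    # The flow stream keeps cleaved bases in order, so arrival time gives the order.
    ordered = sorted(events, key=lambda event: event[0])
    return "".join(DYE_TO_BASE[channel] for _, channel in ordered)


events = [(0.30, "dye_3"), (0.10, "dye_1"), (0.20, "dye_2"), (0.40, "dye_4")]
sequence = reconstruct_sequence(events)
print(sequence)
```

The sequence is read in the order of cleavage, that is, starting from the end of the fragment on which the exonuclease acts.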
Method for rapid base sequencing in DNA and RNA
Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.
1990-10-09
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.
Method for rapid base sequencing in DNA and RNA
Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.
1990-01-01
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.
Phytophthora-ID.org: A sequence-based Phytophthora identification tool
N.J. Grünwald; F.N. Martin; M.M. Larsen; C.M. Sullivan; C.M. Press; M.D. Coffey; E.M. Hansen; J.L. Parke
2010-01-01
Contemporary species identification relies strongly on sequence-based identification, yet resources for identification of many fungal and oomycete pathogens are rare. We developed two web-based, searchable databases for rapid identification of Phytophthora spp. based on sequencing of the internal transcribed spacer (ITS) or the cytochrome oxidase...
Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation
Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392
Rapid identification of sequences for orphan enzymes to power accurate protein annotation.
Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology: "orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.
Transcriptome-Based Differentiation of Closely-Related Miscanthus Lines
Chouvarine, Philippe; Cooksey, Amanda M.; McCarthy, Fiona M.; ...
2012-01-10
Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations.
Hocum, Jonah D; Battrell, Logan R; Maynard, Ryan; Adair, Jennifer E; Beard, Brian C; Rawlings, David J; Kiem, Hans-Peter; Miller, Daniel G; Trobridge, Grant D
2015-07-07
Analyzing the integration profile of retroviral vectors is a vital step in determining their potential genotoxic effects and developing safer vectors for therapeutic use. Identifying retroviral vector integration sites is also important for retroviral mutagenesis screens. We developed VISA, a vector integration site analysis server, to analyze next-generation sequencing data for retroviral vector integration sites. Sequence reads that contain a provirus are mapped to the human genome, sequence reads that cannot be localized to a unique location in the genome are filtered out, and then unique retroviral vector integration sites are determined based on the alignment scores of the remaining sequence reads. VISA offers a simple web interface to upload sequence files and results are returned in a concise tabular format to allow rapid analysis of retroviral vector integration sites.
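The filtering and site-calling steps described above can be sketched roughly as follows. This is a minimal illustration, not VISA's actual code: the input layout, the `min_score_gap` threshold, and the rule "best score must beat the runner-up" are all assumptions made for the example.

```python
from collections import Counter


def unique_integration_sites(read_alignments, min_score_gap=1):
    """Collapse per-read alignments into uniquely placed integration sites.

    read_alignments maps a read id to a list of (chrom, pos, strand, score)
    candidate alignments (hypothetical layout). A read is kept only if its
    best alignment beats the runner-up by at least min_score_gap.
    """
    sites = Counter()
    for hits in read_alignments.values():
        ranked = sorted(hits, key=lambda hit: hit[3], reverse=True)
        if not ranked:
            continue
        if len(ranked) > 1 and ranked[0][3] - ranked[1][3] < min_score_gap:
            continue  # cannot be localized to a unique genomic location
        chrom, pos, strand, _ = ranked[0]
        sites[(chrom, pos, strand)] += 1
    return sites


reads = {
    "r1": [("chr1", 100, "+", 60)],
    "r2": [("chr1", 100, "+", 58), ("chr5", 900, "-", 40)],
    "r3": [("chr2", 50, "-", 55), ("chr7", 10, "+", 55)],  # ambiguous, dropped
}
for site, n_reads in unique_integration_sites(reads).items():
    print(site, n_reads)
```

Counting supporting reads per site gives the kind of concise table the abstract mentions, with one row per unique integration site.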
Rapid Conversion of Traditional Introductory Physics Sequences to an Activity-Based Format
ERIC Educational Resources Information Center
Yoder, Garett; Cook, Jerry
2014-01-01
The Department of Physics at EKU [Eastern Kentucky University] with support from the National Science Foundation's Course Curriculum and Laboratory Improvement Program has successfully converted our entire introductory physics sequence, both algebra-based and calculus-based courses, to an activity-based format where laboratory activities,…
Brichtová, Eva; Šenkyřík, J
2017-05-01
A low radiation burden is essential during diagnostic procedures in pediatric patients due to their high tissue sensitivity. Using MR examination instead of the routinely used CT reduces the radiation exposure and the risk of adverse stochastic effects. Our retrospective study evaluated the possibility of using ultrafast single-shot (SSh) sequences and turbo spin echo (TSE) sequences in rapid MR brain imaging in pediatric patients with hydrocephalus and a programmable ventriculoperitoneal drainage system. SSh sequences seem to be suitable for examining pediatric patients due to the speed of using this technique, but significant susceptibility artifacts due to the programmable drainage valve degrade the image quality. Therefore, a rapid MR examination protocol based on TSE sequences, less sensitive to artifacts due to ferromagnetic components, has been developed. Of 61 pediatric patients who were examined using MR and the SSh sequence protocol, a group of 15 patients with hydrocephalus and a programmable drainage system also underwent TSE sequence MR imaging. The susceptibility artifact volume in both rapid MR protocols was evaluated using a semiautomatic volumetry system. A statistically significant decrease in the susceptibility artifact volume has been demonstrated in TSE sequence imaging in comparison with SSh sequences. Using TSE sequences reduced the influence of artifacts from the programmable valve, and the image quality in all cases was rated as excellent. In all patients, rapid MR examinations were performed without any need for intravenous sedation or general anesthesia. Our study results strongly suggest the superiority of the TSE sequence MR protocol compared to the SSh sequence protocol in pediatric patients with a programmable ventriculoperitoneal drainage system due to a significant reduction of susceptibility artifact volume. Both rapid sequence MR protocols provide quick and satisfactory brain imaging with no ionizing radiation and a reduced need for intravenous or general anesthesia.
van Gijlswijk, R P; Wiegant, J; Vervenne, R; Lasan, R; Tanke, H J; Raap, A K
1996-01-01
We present a sensitive and rapid fluorescence in situ hybridization (FISH) strategy for detecting chromosome-specific repeat sequences. It uses horseradish peroxidase (HRP)-labeled oligonucleotide sequences in combination with fluorescent tyramide-based detection. After in situ hybridization, the HRP conjugated to the oligonucleotide probe is used to deposit fluorescently labeled tyramide molecules at the site of hybridization. The method features full chemical synthesis of probes, strong FISH signals, and short processing periods, as well as multicolor capabilities.
Centrifuge: rapid and sensitive classification of metagenomic sequences
Song, Li; Breitwieser, Florian P.
2016-01-01
Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space. PMID:27852649
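To make the indexing idea concrete, here is a toy FM-index built from the Burrows-Wheeler transform, with backward search used to count exact matches of a read fragment. It is a teaching sketch only: it builds the suffix array naively and stores a full occurrence table, whereas Centrifuge's own index is compressed and far more elaborate. All function names are made up for this example.

```python
def build_fm_index(text):
    """Build a toy FM-index (C table and occurrence table) for text."""
    text += "$"  # sentinel, sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first[c]: number of characters in text that sort strictly before c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return {"n": len(text), "first": first, "occ": occ}


def count_matches(pattern, index):
    """Count exact occurrences of pattern using backward search."""
    lo, hi = 0, index["n"]
    for ch in reversed(pattern):
        if ch not in index["first"]:
            return 0
        lo = index["first"][ch] + index["occ"][ch][lo]
        hi = index["first"][ch] + index["occ"][ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGA")
print(count_matches("ACG", index), count_matches("TT", index))
```

Backward search touches the index once per pattern character, independent of genome length, which is why this family of indexes suits classifying millions of short reads.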
SPHINX--an algorithm for taxonomic binning of metagenomic sequences.
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S
2011-01-01
Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms are significantly higher. However, being alignment-based, the latter class of algorithms requires an enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes
Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.
2012-01-01
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
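A naïve Bayesian sequence classifier of this general kind can be sketched with k-mer presence features. The example below is a simplified illustration and not the RDP tool itself: the word size `k=3`, the 0.5 pseudocount, and the toy training sequences are assumptions chosen to keep the example small.

```python
import math
from collections import defaultdict


def kmers(seq, k):
    """Return the set of overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labelled_sequences, k=3):
    """Count, per taxon, how many training sequences contain each k-mer."""
    per_taxon = defaultdict(list)
    for seq, taxon in labelled_sequences:
        per_taxon[taxon].append(kmers(seq, k))
    model = {}
    for taxon, word_sets in per_taxon.items():
        counts = defaultdict(int)
        for words in word_sets:
            for word in words:
                counts[word] += 1
        model[taxon] = (len(word_sets), dict(counts))
    return model


def classify(seq, model, k=3):
    """Return the taxon with the highest summed log-probability of the query k-mers."""
    words = kmers(seq, k)
    scores = {}
    for taxon, (n_seqs, counts) in model.items():
        # Smoothed probability that a sequence from this taxon contains the word.
        scores[taxon] = sum(
            math.log((counts.get(word, 0) + 0.5) / (n_seqs + 1)) for word in words
        )
    return max(scores, key=scores.get)


training = [("AAAAAAAACCCC", "taxon_A"), ("GGGGGGGGTTTT", "taxon_B")]
model = train(training)
print(classify("AAAACC", model))
```

Because scoring needs only dictionary lookups per k-mer rather than an alignment, an approach like this is typically much faster than a similarity search, which is consistent with the speed-up the abstract reports.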
Centrifuge: rapid and sensitive classification of metagenomic sequences.
Kim, Daehwan; Song, Li; Breitwieser, Florian P; Salzberg, Steven L
2016-12-01
Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space. © 2016 Kim et al.; Published by Cold Spring Harbor Laboratory Press.
Molecular beacon sequence design algorithm.
Monroe, W Todd; Haselton, Frederick R
2003-01-01
A method based on Web-based tools is presented to design optimally functioning molecular beacons. Molecular beacons, fluorogenic hybridization probes, are a powerful tool for the rapid and specific detection of a particular nucleic acid sequence. However, their synthesis costs can be considerable. Since molecular beacon performance is based on its sequence, it is imperative to rationally design an optimal sequence before synthesis. The algorithm presented here uses simple Microsoft Excel formulas and macros to rank candidate sequences. This analysis is carried out using mfold structural predictions along with other free Web-based tools. For smaller laboratories where molecular beacons are not the focus of research, the public domain algorithm described here may be usefully employed to aid in molecular beacon design.
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Huber, David J.; Martin, Kevin
2017-05-01
This paper describes a technique in which we improve upon the prior performance of the Rapid Serial Visual Presentation (RSVP) EEG paradigm for image classification through the insertion of visual attention distracters and overall sequence reordering based upon the expected ratio of rare to common "events" in the environment and operational context. Inserting distracter images maintains the ratio of common events to rare events at an ideal level, maximizing the rare event detection via P300 EEG response to the RSVP stimuli. The method has two steps: first, we compute the optimal number of distracters needed for an RSVP stimulus based on the desired sequence length and expected number of targets and insert the distracters into the RSVP sequence, and then we reorder the RSVP sequence to maximize P300 detection. We show that by reducing the ratio of target events to nontarget events using this method, we can allow RSVP sequences with more targets without sacrificing area under the ROC curve (Az).
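The first step, choosing how many distracters to insert, can be illustrated with simple arithmetic. The rule used here (at most one target per `images_per_target` images) is an assumption made for the example; the abstract does not state the actual formula or the ideal ratio.

```python
def distracters_needed(sequence_length, expected_targets, images_per_target):
    """Number of distracter images to add so targets stay sufficiently rare.

    Assumes (hypothetically) that the ideal ratio is one target per
    images_per_target images in the final sequence.
    """
    required_length = expected_targets * images_per_target
    return max(0, required_length - sequence_length)


# 30 images with 5 expected targets, aiming for 1 target per 10 images.
extra = distracters_needed(sequence_length=30, expected_targets=5, images_per_target=10)
print(extra)
```

If the sequence is already long enough for the targets to be rare, the function returns zero and no distracters are inserted.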
Indexcov: fast coverage quality control for whole-genome sequencing.
Pedersen, Brent S; Collins, Ryan L; Talkowski, Michael E; Quinlan, Aaron R
2017-11-01
The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large-scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license. © The Authors 2017. Published by Oxford University Press.
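The core idea of comparing consecutive index entries can be sketched as below. In a BAM `.bai` file the linear index stores one virtual file offset per 16,384-base window, and the upper bits of that offset give the position of the compressed block in the file. The sketch assumes the offsets for one chromosome have already been parsed into a list (the parsing itself is omitted), and the median scaling is an illustrative choice rather than a description of indexcov's exact normalization.

```python
from statistics import median

TILE_SIZE = 16384  # bases covered by each linear-index entry in a BAM index


def scaled_depth(virtual_offsets):
    """Estimate relative depth per tile from consecutive linear-index offsets.

    virtual_offsets: one integer per tile for a single chromosome, as stored
    in the index (assumed to be parsed already).
    """
    file_offsets = [v >> 16 for v in virtual_offsets]  # drop within-block part
    sizes = [b - a for a, b in zip(file_offsets, file_offsets[1:])]
    positive = [s for s in sizes if s > 0]
    if not positive:
        return []
    typical = median(positive)
    # More bytes between neighbouring entries means more alignments in that tile.
    return [s / typical for s in sizes]


offsets = [o << 16 for o in (0, 100, 200, 300, 500, 600)]
for tile, depth in enumerate(scaled_depth(offsets)):
    print(tile * TILE_SIZE, depth)
```

A tile whose scaled value sits near 2 or near 0.5 across a whole chromosome would be the sort of signal used to flag large-scale anomalies or to infer sex from the X and Y chromosomes.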
Molecular Identification and Databases in Fusarium
USDA-ARS's Scientific Manuscript database
DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
Genome Improvement at JGI-HAGSC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.
Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.
Frickmann, Hagen; Zautner, Andreas E.
2014-01-01
Atypical and multidrug resistance, especially ESBL and carbapenemase expressing Enterobacteriaceae, is globally spreading. Therefore, it becomes increasingly difficult to achieve therapeutic success by calculated antibiotic therapy. Consequently, rapid antibiotic resistance testing is essential. Various molecular and mass spectrometry-based approaches have been introduced in diagnostic microbiology to speed up the provision of reliable resistance data. PCR- and sequencing-based approaches are the most expensive but the most frequently applied modes of testing, suitable for the detection of resistance genes even from primary material. Next generation sequencing, based either on assessment of allelic single nucleotide polymorphisms or on the detection of nonubiquitous resistance mechanisms, might allow for sequence-based bacterial resistance testing comparable to viral resistance testing in the long term. Fluorescence in situ hybridization (FISH), based on specific binding of fluorescence-labeled oligonucleotide probes, provides a less expensive molecular bridging technique. It is particularly useful for detection of resistance mechanisms based on mutations in ribosomal RNA. Approaches based on MALDI-TOF-MS, alone or in combination with molecular techniques like PCR/electrospray ionization MS or minisequencing, provide the fastest resistance results from pure colonies or even primary samples with a growing number of protocols. This review details the various approaches of rapid resistance testing, their pros and cons, and their potential use for the diagnostic laboratory. PMID:25343142
Yang, Lei; Naylor, Gavin J P
2016-01-01
We determined the complete mitochondrial genome sequence (16,760 bp) of the peacock skate Pavoraja nitida using a long-PCR based next generation sequencing method. It has 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 1 control region in the typical vertebrate arrangement. Primers, protocols, and procedures used to obtain this mitogenome are provided. We anticipate that this approach will facilitate rapid collection of mitogenome sequences for studies on phylogenetic relationships, population genetics, and conservation of cartilaginous fishes.
Transcriptome-based differentiation of closely-related Miscanthus lines.
Chouvarine, Philippe; Cooksey, Amanda M; McCarthy, Fiona M; Ray, David A; Baldwin, Brian S; Burgess, Shane C; Peterson, Daniel G
2012-01-01
Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations. A SNP comparative analysis of rhizome-derived cDNA sequences was successfully utilized to distinguish three Miscanthus × giganteus cultivars from each other and from other Miscanthus species. Moreover, the resulting phylogenetic tree generated from SNP frequency data parallels the known breeding history of the plants examined. Some of the giant miscanthus plants exhibit considerable sequence divergence. Here we describe an analysis of Miscanthus in which high-throughput exome sequencing was utilized to differentiate between closely related genotypes despite the current lack of a reference genome sequence. We functionally annotated the exome sequences and provide resources to support Miscanthus systems biology. In addition, we demonstrate the use of the commercial high-performance cloud computing to do computational GO annotation.
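One simple way to turn SNP calls into a measure that separates cultivars is a pairwise mismatch fraction over shared sites. The sketch below is only an illustration of that idea under assumed inputs: the dictionary layout, the site keys, and the cultivar calls are hypothetical and do not come from the study.

```python
def snp_distance(calls_a, calls_b):
    """Fraction of shared SNP sites at which two samples carry different alleles.

    Each argument maps a (contig, position) site to the called allele.
    Returns None when the samples share no sites.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    mismatches = sum(calls_a[site] != calls_b[site] for site in shared)
    return mismatches / len(shared)


# Toy calls for two hypothetical cultivars.
cultivar_1 = {("c1", 10): "A", ("c1", 20): "G", ("c2", 5): "T"}
cultivar_2 = {("c1", 10): "A", ("c1", 20): "C", ("c2", 5): "T", ("c3", 1): "G"}
print(snp_distance(cultivar_1, cultivar_2))
```

A matrix of such pairwise distances is the usual input for building a tree that can then be compared with a known breeding history.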
Daniel, Eleni; Jones, Robert; Bull, Matthew; Newell-Price, John
2016-12-01
Patients with SDHx mutations need long-term radiological surveillance for the development of paragangliomas and phaeochromocytomas, but no longitudinal data exist. The aim of the study was to assess the performance of rapid-sequence non-contrast magnetic resonance imaging (MRI) in the long-term monitoring of patients with SDHx mutations. Retrospective study between 2005 and 2015 at a University Hospital and regional endocrine genetics referral centre. Clinical and imaging data of 47 patients with SDHx mutations (SDHB (36), SDHC (6) and SDHD (5)) who had surveillance for detection of paragangliomas by rapid-sequence non-contrast MRI (base of skull to pubic symphysis) were collected. Twelve index cases (nine SDHB, one SDHC and two SDHD) and 35 mutation-positive relatives were monitored for a mean of 6.4 years (range 3.1-10.0 years). Mean age at the end of the study: SDHB 46.9 ± 17.6 years; SDHC 42.3 ± 24.4 years; SDHD 54.9 ± 10.6 years. On excluding imaging at initial diagnosis of index cases, 42 patients underwent 116 rapid-sequence MRI scans: 83 scans were negative and 31 scans were positive for sPGL/HNPGL in 13 patients. Most patients had multiple scans (n = number of patients (number of rapid-sequence MRI scans during screening)): n = 9 (2), n = 20 (3), n = 6 (4), n = 1 (6). Nine patients (three index) were diagnosed with new paragangliomas during surveillance and non-operated tumour size was monitored in nine patients. There were two false-positive scans (1.6%). Scans were repeated every 27 ± 9 months. Biannual rapid-sequence non-contrast MRI is effective to monitor patients with SDHx mutations for detection of new tumours and monitoring of known tumours. © 2016 European Society of Endocrinology.
Molecular Diagnosis of Long-QT syndrome at 10 Days of Life by Rapid Whole Genome Sequencing
Priest, James R.; Ceresnak, Scott R.; Dewey, Frederick E.; Malloy-Walton, Lindsey E.; Dunn, Kyla; Grove, Megan E.; Perez, Marco V.; Maeda, Katsuhide; Dubin, Anne M.; Ashley, Euan A.
2014-01-01
Background: The advent of clinical next generation sequencing is rapidly changing the landscape of rare disease medicine. Molecular diagnosis of long QT syndrome (LQTS) can impact clinical management, including risk stratification and selection of pharmacotherapy based on the type of ion channel affected, but results from current gene panel testing require 4 to 16 weeks before return to clinicians. Objective: A term female infant presented with 2:1 atrioventricular block and ventricular arrhythmias consistent with perinatal LQTS, requiring aggressive treatment including epicardial pacemaker and cardioverter-defibrillator implantation and sympathectomy on day of life two. We sought to provide a rapid molecular diagnosis for optimization of treatment strategies. Methods: We performed CLIA-certified rapid whole genome sequencing (WGS) with a speed-optimized bioinformatics platform to achieve molecular diagnosis at 10 days of life. Results: We detected a known pathogenic variant in KCNH2 that was demonstrated to be paternally inherited by follow-up genotyping. The unbiased assessment of the entire catalog of human genes provided by whole genome sequencing revealed a maternally inherited variant of unknown significance in a novel gene. Conclusions: Rapid clinical WGS provides faster and more comprehensive diagnostic information by 10 days of life than standard gene-panel testing. In selected clinical scenarios such as perinatal LQTS, rapid WGS may be able to provide more timely and clinically actionable information than a standard commercial test. PMID:24973560
TOPPE: A framework for rapid prototyping of MR pulse sequences.
Nielsen, Jon-Fredrik; Noll, Douglas C
2018-06-01
To introduce a framework for rapid prototyping of MR pulse sequences. We propose a simple file format, called "TOPPE", for specifying all details of an MR imaging experiment, such as gradient and radiofrequency waveforms and the complete scan loop. In addition, we provide a TOPPE file "interpreter" for GE scanners, which is a binary executable that loads TOPPE files and executes the sequence on the scanner. We also provide MATLAB scripts for reading and writing TOPPE files and previewing the sequence prior to hardware execution. With this setup, the task of the pulse sequence programmer is reduced to creating TOPPE files, eliminating the need for hardware-specific programming. No sequence-specific compilation is necessary; the interpreter only needs to be compiled once (for every scanner software upgrade). We demonstrate TOPPE in three different applications: k-space mapping, non-Cartesian PRESTO whole-brain dynamic imaging, and myelin mapping in the brain using inhomogeneous magnetization transfer. We successfully implemented and executed the three example sequences. By simply changing the various TOPPE sequence files, a single binary executable (interpreter) was used to execute several different sequences. The TOPPE file format is a complete specification of an MR imaging experiment, based on arbitrary sequences of a (typically small) number of unique modules. Along with the GE interpreter, TOPPE comprises a modular and flexible platform for rapid prototyping of new pulse sequences. Magn Reson Med 79:3128-3134, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
Isalan, M; Klug, A; Choo, Y
2001-07-01
DNA-binding domains with predetermined sequence specificity are engineered by selection of zinc finger modules using phage display, allowing the construction of customized transcription factors. Despite remarkable progress in this field, the available protein-engineering methods are deficient in many respects, thus hampering the applicability of the technique. Here we present a rapid and convenient method that can be used to design zinc finger proteins against a variety of DNA-binding sites. This is based on a pair of pre-made zinc finger phage-display libraries, which are used in parallel to select two DNA-binding domains each of which recognizes given 5 base pair sequences, and whose products are recombined to produce a single protein that recognizes a composite (9 base pair) site of predefined sequence. Engineering using this system can be completed in less than two weeks and yields proteins that bind sequence-specifically to DNA with Kd values in the nanomolar range. To illustrate the technique, we have selected seven different proteins to bind various regions of the human immunodeficiency virus 1 (HIV-1) promoter.
Microbe-ID: an open source toolbox for microbial genotyping and species identification.
Tabima, Javier F; Everhart, Sydney E; Larsen, Meredith M; Weisberg, Alexandra J; Kamvar, Zhian N; Tancos, Matthew A; Smart, Christine D; Chang, Jeff H; Grünwald, Niklaus J
2016-01-01
Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allow for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (microbe-id.org) and provided a working implementation for the genus Phytophthora (phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on github and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.
Manigart, Olivier; Boeras, Debrah I; Karita, Etienne; Hawkins, Paulina A; Vwalika, Cheswa; Makombe, Nathan; Mulenga, Joseph; Derdeyn, Cynthia A; Allen, Susan; Hunter, Eric
2012-12-01
A critical step in HIV-1 transmission studies is the rapid and accurate identification of epidemiologically linked transmission pairs. To date, this has been accomplished by comparison of polymerase chain reaction (PCR)-amplified nucleotide sequences from potential transmission pairs, which can be cost-prohibitive for use in resource-limited settings. Here we describe a rapid, cost-effective approach to determine transmission linkage based on the heteroduplex mobility assay (HMA), and validate this approach by comparison to nucleotide sequencing. A total of 102 HIV-1-infected Zambian and Rwandan couples, with known linkage, were analyzed by gp41-HMA. A 400-base pair fragment within the envelope gp41 region of the HIV proviral genome was PCR amplified and HMA was applied to both partners' amplicons separately (autologous) and as a mixture (heterologous). If the diversity between gp41 sequences was low (<5%), a homoduplex was observed upon gel electrophoresis and the transmission was characterized as having occurred between partners (linked). If a new heteroduplex formed, within the heterologous migration, the transmission was determined to be unlinked. Initial blind validation of gp-41 HMA demonstrated 90% concordance between HMA and sequencing with 100% concordance in the case of linked transmissions. Following validation, 25 newly infected partners in Kigali and 12 in Lusaka were evaluated prospectively using both HMA and nucleotide sequences. Concordant results were obtained in all but one case (97.3%). The gp41-HMA technique is a reliable and feasible tool to detect linked transmissions in the field. All identified unlinked results should be confirmed by sequence analyses.
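The sequence-comparison side of the validation above can be illustrated with a short sketch. It computes the percentage of differing positions between two pre-aligned gp41 fragments and applies the 5% diversity figure quoted in the abstract as a cut-off. The function names, the gap handling, and the use of a simple pairwise distance are assumptions made for illustration; the abstract does not state how the sequencing-based linkage calls were made.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    mismatches = sum(1 for a, b in pairs if a != b)
    return 100.0 * mismatches / len(pairs)


def classify_pair(seq_a: str, seq_b: str, threshold: float = 5.0) -> str:
    """Hypothetical helper: 'linked' when diversity is below the threshold."""
    if percent_difference(seq_a, seq_b) < threshold:
        return "linked"
    return "unlinked"
```

A real analysis would first align the two amplicons; this sketch assumes that step has already been done.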
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Sauvage, Thomas; Plouviez, Sophie; Schmidt, William E; Fredericq, Suzanne
2018-03-05
The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets. We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content.
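The sorting step described above can be sketched in a few lines. This is not the TREE2FASTA Perl script itself: it is a minimal Python illustration that assumes the tree-leaf annotations have already been reduced to a plain mapping from leaf name to group label (for example a colour), and that each FASTA header starts with the leaf name. The names `parse_fasta`, `split_by_group` and `leaf_to_group` are hypothetical.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_group(fasta_text, leaf_to_group):
    """Return {group label: FASTA text}, one entry per group of sequences."""
    groups = defaultdict(list)
    for header, seq in parse_fasta(fasta_text):
        name = header.split()[0]  # leaf name assumed to be the first token
        group = leaf_to_group.get(name, "unassigned")
        groups[group].append(f">{header}\n{seq}")
    return {g: "\n".join(records) + "\n" for g, records in groups.items()}
```

Each value in the returned dictionary can be written to its own file. Nested or overlapping designs would need a mapping from leaf name to several labels, which this sketch does not attempt.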
Ferchichi, M; Valcheva, R; Prévost, H; Onno, B; Dousset, X
2008-06-01
Species-specific primers targeting the 16S-23S ribosomal DNA (rDNA) intergenic spacer region (ISR) were designed to rapidly discriminate between Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti species recently isolated from French sourdough. The 16S-23S ISRs were amplified using primers 16S/p2 and 23S/p7, which anneal to positions 1388-1406 of the 16S rRNA gene and to positions 207-189 of the 23S rRNA gene respectively, Escherichia coli numbering (GenBank accession number V00331). Clone libraries of the resulting amplicons were constructed using a pCR2.1 TA cloning kit and sequenced. Species-specific primers were designed based on the sequences obtained and were used to amplify the 16S-23S ISR in the Lactobacillus species considered. For all of them, two PCR amplicons, designated as small ISR (S-ISR) and large ISR (L-ISR), were obtained. The L-ISR is composed of the corresponding S-ISR, interrupted by a sequence containing tRNA(Ile) and tRNA(Ala) genes. Based on these sequences, species-specific primers were designed and proved to identify accurately the species considered among 30 reference Lactobacillus species tested. Designed species-specific primers enable a rapid and accurate identification of L. mindensis, L. paralimentarius, L. panis, L. pontis and L. frumenti species among other lactobacilli. The proposed method provides a powerful and convenient means of rapidly identifying some sourdough lactobacilli, which could be of help in large starter culture surveys.
Sayah, Anousheh; Jay, Ann K; Toaff, Jacob S; Makariou, Erini V; Berkowitz, Frank
2016-09-01
Reducing lumbar spine MRI scanning time while retaining diagnostic accuracy can benefit patients and reduce health care costs. This study compares the effectiveness of a rapid lumbar MRI protocol using 3D T2-weighted sampling perfection with application-optimized contrast with different flip-angle evolutions (SPACE) sequences with a standard MRI protocol for evaluation of lumbar spondylosis. Two hundred fifty consecutive unenhanced lumbar MRI examinations performed at 1.5 T were retrospectively reviewed. Full, rapid, and complete versions of each examination were interpreted for spondylotic changes at each lumbar level, including herniations and neural compromise. The full examination consisted of sagittal T1-weighted, T2-weighted turbo spin-echo (TSE), and STIR sequences; and axial T1- and T2-weighted TSE sequences (time, 18 minutes 40 seconds). The rapid examination consisted of sagittal T1- and T2-weighted SPACE sequences, with axial SPACE reformations (time, 8 minutes 46 seconds). The complete examination consisted of the full examination plus the T2-weighted SPACE sequence. Sensitivities and specificities of the full and rapid examinations were calculated using the complete study as the reference standard. The rapid and full studies had sensitivities of 76.0% and 69.3%, with specificities of 97.2% and 97.9%, respectively, for all degenerative processes. Rapid and full sensitivities were 68.7% and 66.3% for disk herniation, 85.2% and 81.5% for canal compromise, 82.9% and 69.1% for lateral recess compromise, and 76.9% and 69.7% for foraminal compromise, respectively. Isotropic SPACE T2-weighted imaging provides high-quality imaging of lumbar spondylosis, with multiplanar reformatting capability. Our SPACE-based rapid protocol had sensitivities and specificities for herniations and neural compromise comparable to those of the protocol without SPACE. 
This protocol fits within a 15-minute slot, potentially reducing costs and discomfort for a large subgroup of patients.
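The sensitivities and specificities reported above come from comparing each protocol's findings with a reference standard. A minimal sketch of that calculation follows, assuming each finding is reduced to a present/absent call per lumbar level; the function name and input layout are illustrative and are not taken from the study.

```python
def sensitivity_specificity(test_calls, reference_calls):
    """Return (sensitivity, specificity) for paired boolean calls."""
    if len(test_calls) != len(reference_calls):
        raise ValueError("both lists must have one call per item")
    tp = fp = tn = fn = 0
    for called, truth in zip(test_calls, reference_calls):
        if truth and called:
            tp += 1  # finding present and detected
        elif truth and not called:
            fn += 1  # finding present but missed
        elif called:
            fp += 1  # finding absent but reported
        else:
            tn += 1  # finding absent and not reported
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference")
    return tp / (tp + fn), tn / (tn + fp)
```

Sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP), so a protocol can gain sensitivity while losing a little specificity, as the rapid protocol does in the figures above.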
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.
Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.; ...
2018-02-16
Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.
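The nucleotide bias discussed above is usually summarized as GC content, the fraction of bases that are G or C. The sketch below shows one simple way to compute it for a read or genome string; the function name and the choice to ignore ambiguous bases such as N are assumptions for illustration and do not come from the study.

```python
def gc_content(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)
```

Applying the same function to successive windows along a read would give a rough picture of how local composition varies with position.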
ABACAS: algorithm-based automatic contiguation of assembled sequences
Assefa, Samuel; Keane, Thomas M.; Otto, Thomas D.; Newbold, Chris; Berriman, Matthew
2009-01-01
Summary: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net Contact: sa4@sanger.ac.uk PMID:19497936
Pryce, Todd M; Palladino, Silvano; Price, Diane M; Gardam, Dianne J; Campbell, Peter B; Christiansen, Keryn J; Murray, Ronan J
2006-04-01
We report a direct polymerase chain reaction/sequence (d-PCRS)-based method for the rapid identification of clinically significant fungi from 5 different types of commercial broth enrichment media inoculated with clinical specimens. Media including BacT/ALERT FA (BioMérieux, Marcy l'Etoile, France) (n = 87), BACTEC Plus Aerobic/F (Becton Dickinson, Microbiology Systems, Sparks, MD) (n = 16), BACTEC Peds Plus/F (Becton Dickinson) (n = 15), BACTEC Lytic/10 Anaerobic/F (Becton Dickinson) (n = 11) bottles, and BBL MGIT (Becton Dickinson) (n = 11) were inoculated with specimens from 138 patients. A universal DNA extraction method was used combining a novel pretreatment step to remove PCR inhibitors with a column-based DNA extraction kit. Target sequences in the noncoding internal transcribed spacer regions of the rRNA gene were amplified by PCR and sequenced using a rapid (24 h) automated capillary electrophoresis system. Using sequence alignment software, fungi were identified by sequence similarity with sequences derived from isolates identified by upper-level reference laboratories or isolates defined as ex-type strains. We identified Candida albicans (n = 14), Candida parapsilosis (n = 8), Candida glabrata (n = 7), Candida krusei (n = 2), Scedosporium prolificans (n = 4), and 1 each of Candida orthopsilosis, Candida dubliniensis, Candida kefyr, Candida tropicalis, Candida guilliermondii, Saccharomyces cerevisiae, Cryptococcus neoformans, Aspergillus fumigatus, Histoplasma capsulatum, and Malassezia pachydermatis by d-PCRS analysis. All d-PCRS identifications from positive broths were in agreement with the final species identification of the isolates grown from subculture. Earlier identification of fungi using d-PCRS may facilitate prompt and more appropriate antifungal therapy.
McTaggart, Lisa; Richardson, Susan E.; Seah, Christine; Hoang, Linda; Fothergill, Annette; Zhang, Sean X.
2011-01-01
Rapid identification of Cryptococcus neoformans var. grubii, Cryptococcus neoformans var. neoformans, and Cryptococcus gattii is imperative for facilitation of prompt treatment of cryptococcosis and for understanding the epidemiology of the disease. Our purpose was to evaluate a test algorithm incorporating commercial rapid biochemical tests, differential media, and DNA sequence analysis that will allow us to differentiate these taxa rapidly and accurately. We assessed 147 type, reference, and clinical isolates, including 6 other Cryptococcus spp. (10 isolates) and 14 other yeast species (24 isolates), using a 4-hour urea broth test (Remel), a 24-hour urea broth test (Becton Dickinson), a 4-hour caffeic acid disk test (Hardy Diagnostics and Remel), 40- to 44-hour growth assessment on l-canavanine glycine bromothymol blue (CGB) agar, and intergenic spacer (IGS) sequence analysis. All 123 Cryptococcus isolates hydrolyzed urea, along with 7 isolates of Rhodotorula and Trichosporon. Eighty-five of 86 C. neoformans (99%) and 26 of 27 C. gattii (96%) isolates had positive caffeic acid results, unlike the other cryptococci (0/10) and yeast species (0/24). Together, these two tests positively identified virtually all C. neoformans/C. gattii isolates (98%) within 4 h. CGB agar or IGS sequencing further differentiated these isolates within 48 h. On CGB, 25 of 27 (93%) C. gattii strains induced a blue color change, in contrast to 0 of 86 C. neoformans isolates. Neighbor-joining cluster analysis of IGS sequences differentiated C. neoformans var. grubii, C. neoformans var. neoformans, and C. gattii. Based on these results, we describe a rapid identification algorithm for use in a microbiology laboratory to distinguish clinically relevant Cryptococcus spp. PMID:21593254
McTaggart, Lisa; Richardson, Susan E; Seah, Christine; Hoang, Linda; Fothergill, Annette; Zhang, Sean X
2011-07-01
Rapid identification of Cryptococcus neoformans var. grubii, Cryptococcus neoformans var. neoformans, and Cryptococcus gattii is imperative for facilitation of prompt treatment of cryptococcosis and for understanding the epidemiology of the disease. Our purpose was to evaluate a test algorithm incorporating commercial rapid biochemical tests, differential media, and DNA sequence analysis that will allow us to differentiate these taxa rapidly and accurately. We assessed 147 type, reference, and clinical isolates, including 6 other Cryptococcus spp. (10 isolates) and 14 other yeast species (24 isolates), using a 4-hour urea broth test (Remel), a 24-hour urea broth test (Becton Dickinson), a 4-hour caffeic acid disk test (Hardy Diagnostics and Remel), 40- to 44-hour growth assessment on l-canavanine glycine bromothymol blue (CGB) agar, and intergenic spacer (IGS) sequence analysis. All 123 Cryptococcus isolates hydrolyzed urea, along with 7 isolates of Rhodotorula and Trichosporon. Eighty-five of 86 C. neoformans (99%) and 26 of 27 C. gattii (96%) isolates had positive caffeic acid results, unlike the other cryptococci (0/10) and yeast species (0/24). Together, these two tests positively identified virtually all C. neoformans/C. gattii isolates (98%) within 4 h. CGB agar or IGS sequencing further differentiated these isolates within 48 h. On CGB, 25 of 27 (93%) C. gattii strains induced a blue color change, in contrast to 0 of 86 C. neoformans isolates. Neighbor-joining cluster analysis of IGS sequences differentiated C. neoformans var. grubii, C. neoformans var. neoformans, and C. gattii. Based on these results, we describe a rapid identification algorithm for use in a microbiology laboratory to distinguish clinically relevant Cryptococcus spp.
Microbe-ID: an open source toolbox for microbial genotyping and species identification
Tabima, Javier F.; Everhart, Sydney E.; Larsen, Meredith M.; Weisberg, Alexandra J.; Kamvar, Zhian N.; Tancos, Matthew A.; Smart, Christine D.; Chang, Jeff H.
2016-01-01
Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allow for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (microbe-id.org) and provided a working implementation for the genus Phytophthora (phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on github and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID. PMID:27602267
USDA-ARS's Scientific Manuscript database
Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker d...
Accurate multiplex polony sequencing of an evolved bacterial genome.
Shendure, Jay; Porreca, Gregory J; Reppas, Nikos B; Lin, Xiaoxia; McCutcheon, John P; Rosenbaum, Abraham M; Wang, Michael D; Zhang, Kun; Mitra, Robi D; Church, George M
2005-09-09
We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.
Shin, Jeong Hong; Jung, Soobin; Ramakrishna, Suresh; Kim, Hyongbum Henry; Lee, Junwon
2018-07-07
Genome editing technology using programmable nucleases has rapidly evolved in recent years. The primary mechanism to achieve precise integration of a transgene is mainly based on homology-directed repair (HDR). However, an HDR-based genome-editing approach is less efficient than non-homologous end-joining (NHEJ). Recently, a microhomology-mediated end-joining (MMEJ)-based transgene integration approach was developed, showing feasibility both in vitro and in vivo. We expanded this method to achieve targeted sequence substitution (TSS) of mutated sequences with normal sequences using double-guide RNAs (gRNAs), and a donor template flanking the microhomologies and target sequence of the gRNAs in vitro and in vivo. Our method could realize more efficient sequence substitution than the HDR-based method in vitro using a reporter cell line, and led to the survival of a hereditary tyrosinemia mouse model in vivo. The proposed MMEJ-based TSS approach could provide a novel therapeutic strategy, in addition to HDR, to achieve gene correction from a mutated sequence to a normal sequence. Copyright © 2018 Elsevier Inc. All rights reserved.
Current management of penetrating torso trauma: nontherapeutic is not good enough anymore.
Ball, Chad G
2014-04-01
A highly organized approach to the evaluation and treatment of penetrating torso injuries based on regional anatomy provides rapid diagnostic and therapeutic consistency. It also minimizes delays in diagnosis, missed injuries and nontherapeutic laparotomies. This review discusses an optimal sequence of structured rapid assessments that allow the clinician to rapidly proceed to gold standard therapies with a minimal risk of associated morbidity.
Syromyatnikov, Mikhail Y; Golub, Victor B; Kokina, Anastasia V; Victoria A Soboleva; Popov, Vasily N
2017-01-01
The genus Eurygaster Laporte, 1833 includes ten species, five of which inhabit the European part of Russia. The harmful species of the genus is E. integriceps. Eurygaster species identification based on the morphological traits is very difficult, while that of the species at the egg or larval stages is extremely difficult or impossible. Eurygaster integriceps, E. maura, and E. testudinaria differ only slightly between each other morphologically, E. maura and E. testudinaria being almost indiscernible. DNA barcoding based on COI sequences has shown that E. integriceps differs significantly from these closely related species, which enables its rapid and accurate identification. Based on COI nucleotide sequences, three species of Sunn pests, E. maura, E. testudinaria, and E. dilaticollis, could not be differentiated from each other through DNA barcoding. The difference in the DNA sequences between the COI gene of E. integriceps and COI genes of E. maura and E. testudinaria was more than 4%. In the present study DNA barcoding of two Eurygaster species was performed for the first time on E. integriceps, the most dangerous pest in the genus, and E. dilaticollis that only inhabits natural ecosystems. The PCR-RFLP method was developed in this work for the rapid identification of E. integriceps.
Syromyatnikov, Mikhail Y.; Golub, Victor B.; Kokina, Anastasia V.; Victoria A. Soboleva; Popov, Vasily N.
2017-01-01
The genus Eurygaster Laporte, 1833 includes ten species, five of which inhabit the European part of Russia. The harmful species of the genus is E. integriceps. Eurygaster species identification based on the morphological traits is very difficult, while that of the species at the egg or larval stages is extremely difficult or impossible. Eurygaster integriceps, E. maura, and E. testudinaria differ only slightly between each other morphologically, E. maura and E. testudinaria being almost indiscernible. DNA barcoding based on COI sequences has shown that E. integriceps differs significantly from these closely related species, which enables its rapid and accurate identification. Based on COI nucleotide sequences, three species of Sunn pests, E. maura, E. testudinaria, and E. dilaticollis, could not be differentiated from each other through DNA barcoding. The difference in the DNA sequences between the COI gene of E. integriceps and COI genes of E. maura and E. testudinaria was more than 4%. In the present study DNA barcoding of two Eurygaster species was performed for the first time on E. integriceps, the most dangerous pest in the genus, and E. dilaticollis that only inhabits natural ecosystems. The PCR-RFLP method was developed in this work for the rapid identification of E. integriceps. PMID:29118620
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
Frankham, Greta J.; McEwing, Ross; The, Dang Tat; Hogg, Carolyn J.; Lo, Nathan; Johnson, Rebecca N.
2018-01-01
Rhinoceros (rhinos) have suffered a dramatic increase in poaching over the past decade due to the growing demand for rhino horn products in Asia. One way to reverse this trend is to enhance enforcement and intelligence gathering tools used for species identification of horns, in particular making them fast, inexpensive and accurate. Traditionally, species identification tests are based on DNA sequence data, which, depending on laboratory resources, can be either time or cost prohibitive. This study presents a rapid rhino species identification test, utilizing species-specific primers within the cytochrome b gene multiplexed in a single reaction, with a presumptive species identification based on the length of the resultant amplicon. This multiplex PCR assay can provide a presumptive species identification result in less than 24 hours. Sequence-based definitive testing can be conducted if/when required (e.g. court purposes). This work also presents an actual casework scenario in which the presumptive test was successfully utilised, in concert with sequence-based definitive testing. The test was carried out on seized suspected rhino horns tested at the Institute of Ecology and Biological Resources, the CITES mandated laboratory in Vietnam, a country that is known to be a major source of demand for rhino horns. This test represents a basis on which future ‘rapid species identification tests’ can be trialed. PMID:29902212
Computer constructed imagery of distant plasma interaction boundaries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grenstadt, E.W.; Schurr, H.D.; Tsugawa, R.K.
1982-01-01
Computer constructed sketches of plasma boundaries arising from the interaction between the solar wind and the magnetosphere can serve as both didactic and research tools. In particular, the structure of the earth's bow shock can be represented as a nonuniform surface according to the instantaneous orientation of the IMF, and temporal changes in structural distribution can be modeled as a sequence of sketches based on observed sequences of spacecraft-based measurements. Viewed rapidly, such a sequence of sketches can be the basis for representation of plasma processes by computer animation.
Methods and compositions for efficient nucleic acid sequencing
Drmanac, Radoje
2006-07-04
Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Methods and compositions for efficient nucleic acid sequencing
Drmanac, Radoje
2002-01-01
Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Koo, Bonhan; Lee, Tae Yoon; Lee, Jeong Hoon; Shin, Yong; Lim, Seok-Byung
2017-01-01
Although KRAS mutational status testing is becoming a companion diagnostic tool for managing patients with colorectal cancer (CRC), there are still several difficulties when analyzing KRAS mutations using the existing assays, particularly with regard to low sensitivity, time-consuming procedures, and the need for large instruments. We developed a rapid, sensitive, and specific mutation detection assay based on the bio-photonic sensor termed ISAD (isothermal solid-phase amplification/detection), and used it to analyze KRAS gene mutations in human clinical samples. To validate the ISAD-KRAS assay for use in clinical diagnostics, we examined hotspot KRAS mutations (codon 12 and codon 13) in 70 CRC specimens using PCR and direct sequencing methods. In a serial dilution study, ISAD-KRAS could detect mutations in a sample containing only 1% of the mutant allele in a mixture of wild-type DNA, whereas both PCR and direct sequencing methods could detect mutations in a sample containing approximately 30% of mutant cells. The results of the ISAD-KRAS assay from 70 clinical samples matched those from PCR and direct sequencing, except in 5 cases, wherein ISAD-KRAS could detect mutations that were not detected by PCR and direct sequencing. We also found that the sensitivity and specificity of ISAD-KRAS were 100% within 30 min. The ISAD-KRAS assay provides a rapid, highly sensitive, and label-free method for KRAS mutation testing, and can serve as a robust, near-patient testing approach for the rapid detection of patients most likely to respond to anti-EGFR drugs. PMID:29137388
The future scalability of pH-based genome sequencers: A theoretical perspective
NASA Astrophysics Data System (ADS)
Go, Jonghyun; Alam, Muhammad A.
2013-10-01
Sequencing of the human genome is an essential prerequisite for personalized medicine and early prognosis of various genetic diseases. The state-of-the-art, high-throughput genome sequencing technologies provide improved sequencing; however, their reliance on relatively expensive optical detection schemes has prevented widespread adoption of the technology in routine care. In contrast, the recently announced pH-based electronic genome sequencers achieve fast sequencing at low cost because of their compatibility with current microelectronics technology. While the progress in technology development has been rapid, the physics of the sequencing chips and the potential for future scaling (and therefore, cost reduction) remain unexplored. In this article, we develop a theoretical framework and a scaling theory to explain the principle of operation of the pH-based sequencing chips and use the framework to explore various perceived scaling limits of the technology related to signal to noise ratio, well-to-well crosstalk, and sequencing accuracy. We also address several limitations inherent to the key steps of pH-based genome sequencers, which are widely shared by many other sequencing platforms in the market but have not been properly explained so far.
Current management of penetrating torso trauma: nontherapeutic is not good enough anymore
Ball, Chad G.
2014-01-01
A highly organized approach to the evaluation and treatment of penetrating torso injuries based on regional anatomy provides rapid diagnostic and therapeutic consistency. It also minimizes delays in diagnosis, missed injuries and nontherapeutic laparotomies. This review discusses an optimal sequence of structured rapid assessments that allow the clinician to rapidly proceed to gold standard therapies with a minimal risk of associated morbidity. PMID:24666458
The sequence of sequencers: The history of sequencing DNA
Heather, James M.; Chain, Benjamin
2016-01-01
Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401
Sulaiman, Irshad M; Torres, Patricia; Simpson, Steven; Kerdahi, Khalil; Ortega, Ynes
2013-04-01
We have described the development of a 2-step nested PCR protocol based on the characterization of the 70-kDa heat shock protein (HSP70) gene for rapid detection of the human-pathogenic Cyclospora cayetanensis parasite. We tested and validated these newly designed primer sets by PCR amplification followed by nucleotide sequencing of PCR-amplified HSP70 fragments belonging to 16 human C. cayetanensis isolates from 3 different endemic regions that include Nepal, Mexico, and Peru. No genetic polymorphism was observed among the isolates at the characterized regions of the HSP70 locus. This newly developed HSP70 gene-based nested PCR protocol provides another useful genetic marker for the rapid detection of C. cayetanensis in the future.
Yoshida, Catherine E; Kruczkiewicz, Peter; Laing, Chad R; Lingohr, Erika J; Gannon, Victor P J; Nash, John H E; Taboada, Eduardo N
2016-01-01
For nearly 100 years serotyping has been the gold standard for the identification of Salmonella serovars. Despite the increasing adoption of DNA-based subtyping approaches, serotype information remains a cornerstone in food safety and public health activities aimed at reducing the burden of salmonellosis. At the same time, recent advances in whole-genome sequencing (WGS) promise to revolutionize our ability to perform advanced pathogen characterization in support of improved source attribution and outbreak analysis. We present the Salmonella In Silico Typing Resource (SISTR), a bioinformatics platform for rapidly performing simultaneous in silico analyses for several leading subtyping methods on draft Salmonella genome assemblies. In addition to performing serovar prediction by genoserotyping, this resource integrates sequence-based typing analyses for: Multi-Locus Sequence Typing (MLST), ribosomal MLST (rMLST), and core genome MLST (cgMLST). We show how phylogenetic context from cgMLST analysis can supplement the genoserotyping analysis and increase the accuracy of in silico serovar prediction to over 94.6% on a dataset comprised of 4,188 finished genomes and WGS draft assemblies. In addition to allowing analysis of user-uploaded whole-genome assemblies, the SISTR platform incorporates a database comprising over 4,000 publicly available genomes, allowing users to place their isolates in a broader phylogenetic and epidemiological context. The resource incorporates several metadata driven visualizations to examine the phylogenetic, geospatial and temporal distribution of genome-sequenced isolates. As sequencing of Salmonella isolates at public health laboratories around the world becomes increasingly common, rapid in silico analysis of minimally processed draft genome assemblies provides a powerful approach for molecular epidemiology in support of public health investigations. 
Moreover, this type of integrated analysis using multiple sequence-based methods of sub-typing allows for continuity with historical serotyping data as we transition towards the increasing adoption of genomic analyses in epidemiology. The SISTR platform is freely available on the web at https://lfz.corefacility.ca/sistr-app/.
Kong, Fanrong; Tong, Zhongsheng; Chen, Xiaoyou; Sorrell, Tania; Wang, Bin; Wu, Qixuan; Ellis, David; Chen, Sharon
2008-01-01
DNA sequencing analyses have demonstrated relatively limited polymorphisms within the fungal internal transcribed spacer (ITS) regions among Trichophyton spp. We sequenced the ITS region (ITS1, 5.8S, and ITS2) for 42 dermatophytes belonging to seven species (Trichophyton rubrum, T. mentagrophytes, T. soudanense, T. tonsurans, Epidermophyton floccosum, Microsporum canis, and M. gypseum) and developed a novel padlock probe and rolling-circle amplification (RCA)-based method for identification of single nucleotide polymorphisms (SNPs) that could be exploited to differentiate between Trichophyton spp. Sequencing results demonstrated intraspecies genetic variation for T. tonsurans, T. mentagrophytes, and T. soudanense but not T. rubrum. Signature sets of SNPs between T. rubrum and T. soudanense (4-bp difference) and T. violaceum and T. soudanense (3-bp difference) were identified. The RCA assay correctly identified five Trichophyton species. Although the use of two “group-specific” probes targeting both the ITS1 and the ITS2 regions was required to identify T. soudanense, the other species were identified by single ITS1- or ITS2-targeted species-specific probes. There was good agreement between ITS sequencing and the RCA assay. Despite limited genetic variation between Trichophyton spp., the sensitive, specific RCA-based SNP detection assay showed potential as a simple, reproducible method for the rapid (2-h) identification of Trichophyton spp. PMID:18234865
Laboratory Diagnosis and Susceptibility Testing for Mycobacterium tuberculosis.
Procop, Gary W
2016-12-01
The laboratory, which utilizes some of the most sophisticated and rapidly changing technologies, plays a critical role in the diagnosis of tuberculosis. Some of these tools are being employed in resource-challenged countries for the rapid detection and characterization of Mycobacterium tuberculosis. Foremost, the laboratory defines appropriate specimen criteria for optimal test performance. The direct detection of mycobacteria in the clinical specimen, predominantly done by acid-fast staining, may eventually be replaced by rapid-cycle PCR. The widespread use of the Xpert MTB/RIF (Cepheid) assay, which detects both M. tuberculosis and key genetic determinants of rifampin resistance, is important for the early detection of multidrug-resistant strains. Culture, using both broth and solid media, remains the standard for establishing the laboratory-based diagnosis of tuberculosis. Cultured isolates are identified far less commonly by traditional biochemical profiling and more commonly by molecular methods, such as DNA probes and broad-range PCR with DNA sequencing. Non-nucleic acid-based methods of identification, such as high-performance liquid chromatography and, more recently, matrix-assisted laser desorption/ionization-time of flight mass spectrometry, may also be used for identification. Cultured isolates of M. tuberculosis should be submitted for susceptibility testing according to standard guidelines. The use of broth-based susceptibility testing is recommended to significantly decrease the time to result. Cultured isolates may also be submitted for strain typing for epidemiologic purposes. The use of massive parallel sequencing, also known as next-generation sequencing, promises to continue this molecular revolution in mycobacteriology, as whole-genome sequencing provides identification, susceptibility, and typing information simultaneously.
V, Pavana Jyothi; S, Akila; Selvan, Malini K; Naidu, Hariprasad; Raghunathan, Shwethaa; Kota, Sathish; Sundaram, R C Raja; Rana, Samir Kumar; Raj, G Dhinakar; Srinivasan, V A; Mohana Subramanian, B
2016-12-01
Canine parvovirus (CPV) is a non-enveloped single stranded DNA virus with an icosahedral capsid. Mini-sequencing based CPV typing was developed earlier to detect and differentiate all the CPV types and FPV in a single reaction. This technique was further evaluated in the present study by performing the mini-sequencing directly from fecal samples, which avoided the tedious virus isolation steps of a cell culture system. Fecal swab samples were collected from 84 dogs with enteritis symptoms suggestive of parvoviral infection from different locations across India. Seventy-six of these samples were positive by PCR; the subsequent mini-sequencing reaction typed 74 of them as type 2a virus, and 2 samples as type 2b. Additionally, 25 of the positive samples were typed by cycle sequencing of PCR products. Direct CPV typing from fecal samples using mini-sequencing showed 100% correlation with CPV typing by cycle sequencing. Moreover, CPV typing was achieved by mini-sequencing even with faintly positive PCR amplicons, which was not possible by cycle sequencing. Therefore, the mini-sequencing technique is recommended for regular epidemiological follow-up of CPV types, since the technique is a rapid, highly sensitive and high-capacity method for CPV typing. Copyright © 2016. Published by Elsevier B.V.
Pant, Saumya; Weiner, Russell; Marton, Matthew J.
2014-01-01
Over the past decade, next-generation sequencing (NGS) technology has experienced meteoric growth in the aspects of platform, technology, and supporting bioinformatics development allowing its widespread and rapid uptake in research settings. More recently, NGS-based genomic data have been exploited to better understand disease development and patient characteristics that influence response to a given therapeutic intervention. Cancer, as a disease characterized by and driven by the tumor genetic landscape, is particularly amenable to NGS-based diagnostic (Dx) approaches. NGS-based technologies are particularly well suited to studying cancer disease development, progression and emergence of resistance, all key factors in the development of next-generation cancer Dxs. Yet, to achieve the promise of NGS-based patient treatment, drug developers will need to overcome a number of operational, technical, regulatory, and strategic challenges. Here, we provide a succinct overview of the state of the clinical NGS field in terms of the available clinically targeted platforms and sequencing technologies. We discuss the various operational and practical aspects of clinical NGS testing that will facilitate or limit the uptake of such assays in routine clinical care. We examine the current strategies for analytical validation and Food and Drug Administration (FDA)-approval of NGS-based assays and ongoing efforts to standardize clinical NGS and build quality control standards for the same. The rapidly evolving companion diagnostic (CDx) landscape for NGS-based assays will be reviewed, highlighting the key areas of concern and suggesting strategies to mitigate risk. The review will conclude with a series of strategic questions that face drug developers and a discussion of the likely future course of NGS-based CDx development efforts. PMID:24860780
Googling DNA sequences on the World Wide Web.
Hajibabaei, Mehrdad; Singer, Gregory A C
2009-11-10
New web-based technologies provide an excellent opportunity for sharing and accessing information and using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm based on dividing a sequence library (DNA barcodes) and query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
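The word-based, alignment-independent idea in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract delegates the search itself to a conventional tool such as Google Desktop Search, whereas the sketch swaps in a small in-memory inverted index. The word length `k`, the use of overlapping words, the scoring by number of shared words, and all function and sequence names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def words(seq, k):
    """Split a sequence into its set of overlapping words of length k (assumed scheme)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(library, k):
    """Map each word to the names of the library sequences that contain it."""
    index = defaultdict(set)
    for name, seq in library.items():
        for w in words(seq, k):
            index[w].add(name)
    return index

def search(index, query, k):
    """Rank library sequences by how many distinct query words they share."""
    hits = Counter()
    for w in words(query, k):
        for name in index.get(w, ()):
            hits[name] += 1
    return hits.most_common()

# Hypothetical toy library; real barcode libraries hold far longer sequences.
library = {"sp_A": "ACGTACGTTAGC", "sp_B": "TTTTGGGGCCCC"}
index = build_index(library, k=4)
result = search(index, "ACGTACGA", k=4)
print(result)
```

Because matching is done on whole words rather than on an alignment, the lookup reduces to the kind of text search that general-purpose indexing tools already perform.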
The sequence of sequencers: The history of sequencing DNA.
Heather, James M; Chain, Benjamin
2016-01-01
Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Jenkins, David
2018-01-10
David Jenkins on "Ion Torrent semiconductor sequencing allows rapid, low-cost sequencing of the human exome" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jenkins, David
David Jenkins on "Ion Torrent semiconductor sequencing allows rapid, low-cost sequencing of the human exome" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
The rapid evolution of molecular genetic diagnostics in neuromuscular diseases.
Volk, Alexander E; Kubisch, Christian
2017-10-01
The development of massively parallel sequencing (MPS) has revolutionized molecular genetic diagnostics in monogenic disorders. The present review gives a brief overview of different MPS-based approaches used in clinical diagnostics of neuromuscular disorders (NMDs) and highlights their advantages and limitations. MPS-based approaches like gene panel sequencing, (whole) exome sequencing, (whole) genome sequencing, and RNA sequencing have been used to identify the genetic cause in NMDs. Although gene panel sequencing has evolved as a standard test for heterogeneous diseases, it is still debated, mainly because of financial issues and unsolved problems of variant interpretation, whether genome sequencing (and to a lesser extent also exome sequencing) of single patients can already be regarded as routine diagnostics. However, it has been shown that the inclusion of parents and additional family members often leads to a substantial increase in the diagnostic yield in exome-wide/genome-wide MPS approaches. In addition, MPS-based RNA sequencing is only now entering the research and diagnostic scene. Next-generation sequencing increasingly enables the detection of the genetic cause in highly heterogeneous diseases like NMDs in an efficient and affordable way. Gene panel sequencing and family-based exome sequencing have proven to be potent and cost-efficient diagnostic tools. Although clinical validation and interpretation of genome sequencing is still challenging, diagnostic RNA sequencing represents a promising tool to bypass some hurdles of diagnostics using genomic DNA.
Frequency tagging to track the neural processing of contrast in fast, continuous sound sequences.
Nozaradan, Sylvie; Mouraux, André; Cousineau, Marion
2017-07-01
The human auditory system presents a remarkable ability to detect rapid changes in fast, continuous acoustic sequences, as best illustrated in speech and music. However, the neural processing of rapid auditory contrast remains largely unclear, probably due to the lack of methods to objectively dissociate the response components specifically related to the contrast from the other components in response to the sequence of fast continuous sounds. To overcome this issue, we tested a novel use of the frequency-tagging approach allowing contrast-specific neural responses to be tracked based on their expected frequencies. The EEG was recorded while participants listened to 40-s sequences of sounds presented at 8Hz. A tone or interaural time contrast was embedded every fifth sound (AAAAB), such that a response observed in the EEG at exactly 8 Hz/5 (1.6 Hz) or harmonics should be the signature of contrast processing by neural populations. Contrast-related responses were successfully identified, even in the case of very fine contrasts. Moreover, analysis of the time course of the responses revealed a stable amplitude over repetitions of the AAAAB patterns in the sequence, except for the response to perceptually salient contrasts that showed a buildup and decay across repetitions of the sounds. Overall, this new combination of frequency-tagging with an oddball design provides a valuable complement to the classic, transient, evoked potentials approach, especially in the context of rapid auditory information. Specifically, we provide objective evidence on the neural processing of contrast embedded in fast, continuous sound sequences. NEW & NOTEWORTHY Recent theories suggest that the basis of neurodevelopmental auditory disorders such as dyslexia might be an impaired processing of fast auditory changes, highlighting how the encoding of rapid acoustic information is critical for auditory communication. 
Here, we present a novel electrophysiological approach to capture in humans neural markers of contrasts in fast continuous tone sequences. Contrast-specific responses were successfully identified, even for very fine contrasts, providing direct insight on the encoding of rapid auditory information. Copyright © 2017 the American Physiological Society.
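The expected response frequencies in the frequency-tagging design above follow from simple arithmetic: a contrast embedded every fifth sound in an 8 Hz stream repeats at 8 Hz/5 = 1.6 Hz, with harmonics at integer multiples. The short Python sketch below works this out with exact fractions; the function name and the number of harmonics shown are illustrative assumptions, not part of the study.

```python
from fractions import Fraction

def tagged_frequencies(base_rate_hz, period, n_harmonics):
    """Frequencies (Hz) at which a pattern repeating every `period` sounds is expected."""
    fundamental = Fraction(base_rate_hz) / period
    return [fundamental * k for k in range(1, n_harmonics + 1)]

# Values from the abstract: sounds at 8 Hz, contrast every fifth sound (AAAAB).
freqs = tagged_frequencies(8, 5, 5)
print([float(f) for f in freqs])
```

In this arithmetic the fifth harmonic equals the 8 Hz presentation rate itself, so a peak at that frequency would not by itself separate contrast processing from the response to the sound stream.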
Dieckmann, Ralf; Hammerl, Jens Andre; Hahmann, Hartmut; Wicke, Amal; Kleta, Sylvia; Dabrowski, Piotr Wojciech; Nitsche, Andreas; Stämmler, Maren; Al Dahouk, Sascha; Lasch, Peter
2016-06-23
Microbiological monitoring of consumer products and the efficiency of early warning systems and outbreak investigations depend on the rapid identification and strain characterisation of pathogens posing risks to the health and safety of consumers. This study evaluates the potential of three rapid analytical techniques for identification and subtyping of bacterial isolates obtained from a liquid hand soap product, which has been recalled and reported through the EU RAPEX system due to its severe bacterial contamination. Ten isolates recovered from two bottles of the product were identified as Klebsiella oxytoca and subtyped using matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI TOF MS), near-infrared Fourier transform (NIR FT) Raman spectroscopy and Fourier transform infrared (FTIR) spectroscopy. Comparison of the classification results obtained by these phenotype-based techniques with outcomes of the DNA-based methods pulsed-field gel electrophoresis (PFGE), multi-locus sequence typing (MLST) and single nucleotide polymorphism (SNP) analysis of whole-genome sequencing (WGS) data revealed a high level of concordance. In conclusion, a set of analytical techniques might be useful for rapid, reliable and cost-effective microbial typing to ensure safe consumer products and allow source tracking.
Phage display selection of peptides that target calcium-binding proteins.
Vetter, Stefan W
2013-01-01
Phage display allows rapid identification of peptide sequences with binding affinity towards target proteins, for example, calcium-binding proteins (CBPs). Phage technology allows screening of 10^9 or more independent peptide sequences and can identify CBP-binding peptides within 2 weeks. Adjusting the screening conditions allows selection of CBP-binding peptides that are either calcium-dependent or calcium-independent. The peptide sequences obtained can be used to identify CBP target proteins based on sequence homology or to quickly obtain peptide-based CBP inhibitors to modulate CBP-target interactions. The protocol described here uses a commercially available phage display library, in which random 12-mer peptides are displayed on filamentous M13 phages. The library was screened against the calcium-binding protein S100B.
Numerical classification of coding sequences
NASA Technical Reports Server (NTRS)
Collins, D. W.; Liu, C. C.; Jukes, T. H.
1992-01-01
DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
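The two numerical descriptors proposed above, a base count such as A76C158G121T74 and a codon table running from (AAA) to (TTT), can be computed in a few lines. The following Python sketch is one possible reading of the abstract; the function names and the toy reading frame are hypothetical, and the alphabetical A, C, G, T ordering is inferred from the examples given in the abstract.

```python
from collections import Counter
from itertools import product

def base_count_label(seq):
    """Abbreviate a reading frame by its base count, e.g. 'A6C1G3T2'."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts[base]}" for base in "ACGT")

def codon_table(seq):
    """Count all 64 codons in frame, ordered AAA, AAC, AAG, ... TTT."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    return {"".join(c): counts["".join(c)] for c in product("ACGT", repeat=3)}

frame = "ATGAAAGGCTAA"  # hypothetical toy reading frame
label = base_count_label(frame)
table = codon_table(frame)
print(label)
print({codon: n for codon, n in table.items() if n})
```

Two database entries with different labels cannot be identical sequences, so comparing these compact descriptors first is a cheap way to screen candidates before any full sequence comparison.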
2012-01-01
Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. Conclusions We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587
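The sequencing figures reported in this abstract can be cross-checked with simple arithmetic. The Python sketch below assumes that "Gb" means 10^9 bases and that reads were spread evenly across the 20 plants; both are assumptions for illustration, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
total_bases = 17e9      # approximately 17 Gb, assuming Gb = 10**9 bases
raw_reads = 185e6       # 185 million raw reads
plants = 20
lanes_used, lanes_per_run = 2, 16

mean_read_length = total_bases / raw_reads   # implied average bases per read
run_fraction = lanes_used / lanes_per_run    # share of one sequencing run
reads_per_plant = raw_reads / plants         # assumes an even split across plants

print(round(mean_read_length, 1), run_fraction, reads_per_plant)
```

Two lanes of a 16-lane run correspond to the "small portion (1/8)" of one run cited in the conclusions, and the implied mean read length is roughly 92 bases.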
Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G
2012-07-17
In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR and DArT have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on an F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene.
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species.
An Accurate Scalable Template-based Alignment Algorithm
Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.
2013-01-01
The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional constraints, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant dimensions include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach
Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.
2007-01-01
We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
Schmidt, Olga; Hausmann, Axel; Cancian de Araujo, Bruno; Sutrisno, Hari; Peggie, Djunijanti; Schmidt, Stefan
2017-01-01
Here we present a general collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, and a comparison with alternative preserving and vouchering methods. About 98% of the sequenced specimens processed using the present collecting and preparation protocol yielded sequences with more than 500 base pairs. The study is based on the first outcomes of the Indonesian Biodiversity Discovery and Information System (IndoBioSys). IndoBioSys is a German-Indonesian research project that is conducted by the Museum für Naturkunde in Berlin and the Zoologische Staatssammlung München, in close cooperation with the Research Center for Biology - Indonesian Institute of Sciences (RCB-LIPI, Bogor).
NIPTmer: rapid k-mer-based software package for detection of fetal aneuploidies.
Sauk, Martin; Žilina, Olga; Kurg, Ants; Ustav, Eva-Liina; Peters, Maire; Paluoja, Priit; Roost, Anne Mari; Teder, Hindrek; Palta, Priit; Brison, Nathalie; Vermeesch, Joris R; Krjutškov, Kaarel; Salumets, Andres; Kaplinski, Lauris
2018-04-04
Non-invasive prenatal testing (NIPT) is a recent and rapidly evolving method for detecting genetic lesions, such as aneuploidies, of a fetus. However, there is a need for faster and cheaper laboratory and analysis methods to make NIPT more widely accessible. We have developed a novel software package for detection of fetal aneuploidies from next-generation low-coverage whole genome sequencing data. Our tool, NIPTmer, is based on counting pre-defined per-chromosome sets of unique k-mers from raw sequencing data, and applying a linear regression model to the counts. Additionally, the filtering process used for k-mer list creation allows one to take into account the genetic variance in a specific sample, thus reducing the source of uncertainty. The processing time of one sample is less than 10 CPU-minutes on a high-end workstation. NIPTmer was validated on a cohort of 583 NIPT samples and it correctly predicted 37 non-mosaic fetal aneuploidies. NIPTmer has the potential to reduce significantly the time and complexity of NIPT post-sequencing analysis compared to mapping-based methods. For non-commercial users the software package is freely available at http://bioinfo.ut.ee/NIPTMer/ .
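The counting step described in this abstract can be illustrated with a minimal Python sketch. It is not NIPTmer's implementation: the function name `count_chromosome_kmers`, the toy k-mer sets and the reads are all hypothetical, and the regression step applied to the counts is omitted because the abstract does not specify it.

```python
from collections import Counter


def count_chromosome_kmers(reads, kmer_sets, k):
    """Count how many k-mers in the reads belong to each chromosome's set.

    kmer_sets maps a chromosome name to a set of k-mers assumed to be
    unique to that chromosome (hypothetical toy data in this sketch).
    """
    # Invert the per-chromosome sets into one lookup table: k-mer -> chromosome.
    owner = {kmer: chrom for chrom, kmers in kmer_sets.items() for kmer in kmers}
    counts = Counter({chrom: 0 for chrom in kmer_sets})
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            chrom = owner.get(read[i:i + k])
            if chrom is not None:
                counts[chrom] += 1
    return dict(counts)


# Toy example with 4-mers; real k-mer lists would be far larger.
kmer_sets = {"chr21": {"ACGT", "TTGA"}, "chr18": {"GGCC"}}
reads = ["ACGTTGA", "GGCCACGT"]
counts = count_chromosome_kmers(reads, kmer_sets, k=4)
print(counts)
```

Because the reads are scanned directly, no alignment to a reference genome is needed, which is the property the abstract contrasts with mapping-based methods.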
Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.
2014-01-01
Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294
Attentional awakening: gradual modulation of temporal attention in rapid serial visual presentation.
Ariga, Atsunori; Yokosawa, Kazuhiko
2008-03-01
Orienting attention to a point in time facilitates processing of an item within rapidly changing surroundings. We used a one-target RSVP task to look for differences in accuracy in reporting a target related to when the target temporally appeared in the sequence. The results show that observers correctly report a target early in the sequence less frequently than later in the sequence. Previous RSVP studies predicted equivalently accurate performances for one target wherever it appeared in the sequence. We named this new phenomenon attentional awakening, which reflects a gradual modulation of temporal attention in a rapid sequence.
2009-01-01
Background ESTs or variable sequence reads can be available in prokaryotic studies well before a complete genome is known. Use cases include (i) transcriptome studies or (ii) single cell sequencing of bacteria. Without suitable software their further analysis and mapping would have to await finalization of the corresponding genome. Results The tool JANE rapidly maps ESTs or variable sequence reads in prokaryotic sequencing and transcriptome efforts to related template genomes. It provides an easy-to-use graphics interface for information retrieval and a toolkit for EST or nucleotide sequence function prediction. Furthermore, we developed for rapid mapping an enhanced sequence alignment algorithm which reassembles and evaluates high scoring pairs provided from the BLAST algorithm. Rapid assembly on and replacement of the template genome by sequence reads or mapped ESTs is achieved. This is illustrated (i) by data from Staphylococci as well as from a Blattabacteria sequencing effort, and (ii) by mapping single cell sequencing reads from poribacteria to the sister phylum representative Rhodopirellula baltica SH1. The algorithm has been implemented in a web-server accessible at http://jane.bioapps.biozentrum.uni-wuerzburg.de. Conclusion Rapid prokaryotic EST mapping or mapping of sequence reads is achieved by applying JANE even without knowing the cognate genome sequence. PMID:19943962
Law, Jodi Woan-Fei; Ab Mutalib, Nurul-Syakima; Chan, Kok-Gan; Lee, Learn-Han
2015-01-01
The incidence of foodborne diseases has increased over the years and resulted in a major public health problem globally. Foodborne pathogens can be found in various foods and it is important to detect foodborne pathogens to provide a safe food supply and to prevent foodborne diseases. The conventional methods used to detect foodborne pathogens are time-consuming and laborious. Hence, a variety of methods have been developed for rapid detection of foodborne pathogens, as it is required in many food analyses. Rapid detection methods can be categorized into nucleic acid-based, biosensor-based and immunological-based methods. This review emphasizes the principles and application of recent rapid methods for the detection of foodborne bacterial pathogens. Detection methods included are simple polymerase chain reaction (PCR), multiplex PCR, real-time PCR, nucleic acid sequence-based amplification (NASBA), loop-mediated isothermal amplification (LAMP) and oligonucleotide DNA microarray, which are classified as nucleic acid-based methods; optical, electrochemical and mass-based biosensors, which are classified as biosensor-based methods; and enzyme-linked immunosorbent assay (ELISA) and lateral flow immunoassay, which are classified as immunological-based methods. In general, rapid detection methods are time-efficient, sensitive, specific and labor-saving. The development of rapid detection methods is vital in the prevention and treatment of foodborne diseases. PMID:25628612
Reads2Type: a web application for rapid microbial taxonomy identification.
Saputra, Dhany; Rasmussen, Simon; Larsen, Mette V; Haddad, Nizar; Sperotto, Maria Maddalena; Aarestrup, Frank M; Lund, Ole; Sicheritz-Pontén, Thomas
2015-11-25
Identification of bacteria may be based on sequencing and molecular analysis of a specific locus such as 16S rRNA, or a set of loci such as in multilocus sequence typing. In the near future, healthcare institutions and routine diagnostic microbiology laboratories may need to sequence the entire genome of microbial isolates. Therefore we have developed Reads2Type, a web-based tool for taxonomy identification based on whole bacterial genome sequence data. Raw sequencing data provided by the user are mapped against a set of marker probes that are derived from currently available complete bacterial genomes. Using a dataset of 1003 whole genome sequenced bacteria from various sequencing platforms, Reads2Type was able to identify the species with 99.5 % accuracy and on a time scale of minutes. In comparison with other tools, Reads2Type offers the advantage of not needing to transfer sequencing files, as the entire computational analysis is done on the computer of whoever uses the web application. This also prevents data privacy issues from arising. The Reads2Type tool is available at http://www.cbs.dtu.dk/~dhany/reads2type.html.
A Rapid Method for Engineering Recombinant Polioviruses or Other Enteroviruses.
Bessaud, Maël; Pelletier, Isabelle; Blondel, Bruno; Delpeyroux, Francis
2016-01-01
The cloning of large enterovirus RNA sequences is labor-intensive because of the frequent instability in bacteria of plasmidic vectors containing the corresponding cDNAs. In order to circumvent this issue we have developed a PCR-based method that allows the generation of highly modified or chimeric full-length enterovirus genomes. This method relies on fusion PCR, which enables the concatenation of several overlapping cDNA amplicons produced separately. A T7 promoter sequence added upstream of the fusion PCR products allows their transcription into infectious genomic RNAs directly in transfected cells constitutively expressing the phage T7 RNA polymerase. This method permits the rapid recovery of modified viruses that can be subsequently amplified on adequate cell lines.
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.
Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue
2018-05-02
Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed are transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely, clustering and classification, we compared SSAW against the state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana
2015-01-01
The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices, which consisted of two heritable disease-specific assays, library preparation reagents and an NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.
Bakker, Theo C M; Giger, Thomas; Frommen, Joachim G; Largiadèr, Carlo R
2017-08-01
There is a need for rapid and reliable molecular sexing of three-spined sticklebacks, Gasterosteus aculeatus, the supermodel species for evolutionary biology. A DNA region at the 5' end of the sex-linked microsatellite Gac4202 was sequenced for the X chromosome of six females and the Y chromosome of five males from three populations. The Y chromosome contained two large insertions, which did not recombine with the phenotype of sex in a cross of 322 individuals. Genetic variation (SNPs and indels) within the insertions was smaller than on flanking DNA sequences. Three molecular PCR-based sex tests were developed, in which the first, the second or both insertions were covered. In five European populations (from DE, CH, NL, GB) of three-spined sticklebacks, tests with both insertions combined showed two clearly separated bands on agarose minigels in males and one band in females. The tests with the separate insertions gave similar results. Thus, the new molecular sexing method gave rapid and reliable results for sexing three-spined sticklebacks and is an improvement and/or alternative to existing methods.
Functional interrogation of non-coding DNA through CRISPR genome editing
Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.
2017-01-01
Methodologies to interrogate non-coding regions have lagged behind those for coding regions, even though non-coding regions comprise the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828
Optical Processing Techniques For Pseudorandom Sequence Prediction
NASA Astrophysics Data System (ADS)
Gustafson, Steven C.
1983-11-01
Pseudorandom sequences are series of apparently random numbers generated, for example, by linear or nonlinear feedback shift registers. An important application of these sequences is in spread spectrum communication systems, in which, for example, the transmitted carrier phase is digitally modulated rapidly and pseudorandomly and in which the information to be transmitted is incorporated as a slow modulation in the pseudorandom sequence. In this case the transmitted information can be extracted only by a receiver that uses for demodulation the same pseudorandom sequence used by the transmitter, and thus this type of communication system has a very high immunity to third-party interference. However, if a third party can predict in real time the probable future course of the transmitted pseudorandom sequence given past samples of this sequence, then interference immunity can be significantly reduced. In this application effective pseudorandom sequence prediction techniques should be (1) applicable in real time to rapid (e.g., megahertz) sequence generation rates, (2) applicable to both linear and nonlinear pseudorandom sequence generation processes, and (3) applicable to error-prone past sequence samples of limited number and continuity. Certain optical processing techniques that may meet these requirements are discussed in this paper. In particular, techniques based on incoherent optical processors that perform general linear transforms or (more specifically) matrix-vector multiplications are considered. Computer simulation examples are presented which indicate that significant prediction accuracy can be obtained using these transforms for simple pseudorandom sequences. However, the useful prediction of more complex pseudorandom sequences will probably require the application of more sophisticated optical processing techniques.
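As a small illustration of the linear feedback shift registers mentioned in this abstract, the Python sketch below generates the output of a 4-bit register and checks that each bit is a fixed XOR of earlier bits. The register size, tap positions and function name `lfsr_bits` are choices made for this example only; they are not taken from the paper, and the sketch does not model the optical processors it discusses.

```python
def lfsr_bits(state, taps, n):
    """Return n output bits of a Fibonacci linear feedback shift register.

    state: list of 0/1 bits (must not be all zero for a useful sequence).
    taps:  indices into the state whose bits are XORed to form the feedback.
    """
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[-1])           # the last bit is the output
        feedback = 0
        for t in taps:
            feedback ^= state[t]        # XOR of the tapped bits
        state = [feedback] + state[:-1]  # shift right, insert feedback at front
    return out


# 4-bit register with taps on the last two bits; this choice gives the
# maximal period of 2**4 - 1 = 15 for any non-zero start state.
bits = lfsr_bits([1, 0, 0, 0], taps=[3, 2], n=30)
print(bits[:15])

# For this register every output bit is the XOR of the bits four and three
# steps earlier, so past samples determine all future ones.
predictable = all(bits[i + 4] == bits[i] ^ bits[i + 1] for i in range(len(bits) - 4))
print(predictable)
```

The final check shows why a purely linear generator is vulnerable to prediction from past samples; nonlinear generators and error-prone samples, both named in the abstract, are harder cases that this sketch does not address.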
Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E
2016-11-01
Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Zwolak, Michael
2013-03-01
A rapid and low-cost method to sequence DNA would revolutionize personalized medicine, where genetic information is used to diagnose, treat, and prevent diseases. There is a longstanding interest in nanopores as a platform for rapid interrogation of single DNA molecules. I will discuss a sequencing protocol based on the measurement of transverse electronic currents during the translocation of single-stranded DNA through nanopores. Using molecular dynamics simulations coupled to quantum mechanical calculations of the tunneling current, I will show that the DNA nucleotides are predicted to have distinguishable electronic signatures in experimentally realizable systems. Several recent experiments support our theoretical predictions. In addition to their possible impact in medicine and biology, the above methods offer ideal test beds to study open scientific issues in the relatively unexplored area at the interface between solids, liquids, and biomolecules at the nanometer length scale. http://mike.zwolak.org
Alu repeat discovery and characterization within human genomes
Hormozdiari, Fereydoun; Alkan, Can; Ventura, Mario; Hajirasouliha, Iman; Malig, Maika; Hach, Faraz; Yorukoglu, Deniz; Dao, Phuong; Bakhshi, Marzieh; Sahinalp, S. Cenk; Eichler, Evan E.
2011-01-01
Human genomes are now being rapidly sequenced, but not all forms of genetic variation are routinely characterized. In this study, we focus on Alu retrotransposition events and seek to characterize differences in the pattern of mobile insertion between individuals based on the analysis of eight human genomes sequenced using next-generation sequencing. Applying a rapid read-pair analysis algorithm, we discover 4342 Alu insertions not found in the human reference genome and show that 98% of a selected subset (63/64) experimentally validate. Of these new insertions, 89% correspond to AluY elements, suggesting that they arose by retrotransposition. Eighty percent of the Alu insertions have not been previously reported and more novel events were detected in Africans when compared with non-African samples (76% vs. 69%). Using these data, we develop an experimental and computational screen to identify ancestry informative Alu retrotransposition events among different human populations. PMID:21131385
Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.
Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M
2018-05-01
Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.
Liu, Yang; Wang, Xiao-Yue; Wei, Xue-Min; Gao, Zi-Tong; Han, Jian-Ping
2018-05-22
Species adulteration in herbal products (HPs) exposes consumers to health risks. Chemical and morphological methods have their own deficiencies when dealing with the detection of species containing the same active compounds in HPs. In this study, we developed a rapid identification method using the recombinase polymerase amplification (RPA) assay to detect two species, Ginkgo biloba and Sophora japonica (as adulteration), in Ginkgo biloba HPs. Among 36 Ginkgo biloba HP samples, 34 were found to have Ginkgo biloba sequences, and 9 were found to have Sophora japonica sequences. During the authentication process, the RPA-LFS assay showed a higher specificity, sensitivity and efficiency than PCR-based methods. We initially applied the RPA-LFS technique to detect plant species in HPs, demonstrating that this assay can be developed into an efficient tool for the rapid on-site authentication of plant species in Ginkgo biloba HPs.
Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ)
Mascher, Martin; Muehlbauer, Gary J; Rokhsar, Daniel S; Chapman, Jarrod; Schmutz, Jeremy; Barry, Kerrie; Muñoz-Amatriaín, María; Close, Timothy J; Wise, Roger P; Schulman, Alan H; Himmelbach, Axel; Mayer, Klaus FX; Scholz, Uwe; Poland, Jesse A; Stein, Nils; Waugh, Robbie
2013-01-01
Next-generation whole-genome shotgun assemblies of complex genomes are highly useful, but fail to link nearby sequence contigs with each other or provide a linear order of contigs along individual chromosomes. Here, we introduce a strategy based on sequencing progeny of a segregating population that allows de novo production of a genetically anchored linear assembly of the gene space of an organism. We demonstrate the power of the approach by reconstructing the chromosomal organization of the gene space of barley, a large, complex and highly repetitive 5.1 Gb genome. We evaluate the robustness of the new assembly by comparison to a recently released physical and genetic framework of the barley genome, and to various genetically ordered sequence-based genotypic datasets. The method is independent of the need for any prior sequence resources, and will enable rapid and cost-efficient establishment of powerful genomic information for many species. PMID:23998490
Chiba, Satoshi
1999-04-01
An endemic land snail genus Mandarina of the oceanic Bonin (Ogasawara) Islands shows exceptionally rapid evolution not only of morphological and ecological traits, but of DNA sequence. A phylogenetic relationship based on mitochondrial DNA (mtDNA) sequences suggests that morphological differences equivalent to the differences between families were produced between Mandarina and its ancestor during the Pleistocene. The inferred phylogeny shows that species with similar morphologies and life habitats appeared repeatedly and independently in different lineages and islands at different times. Sequential adaptive radiations occurred in different islands of the Bonin Islands and species occupying arboreal, semiarboreal, and terrestrial habitats arose independently in each island. Because of a close relationship between shell morphology and life habitat, independent evolution of the same life habitat in different islands created species possessing the same shell morphology in different islands and lineages. This rapid evolution produced some incongruences between phylogenetic relationship and species taxonomy. Levels of sequence divergence of mtDNA among the species of Mandarina are extremely high. The maximum levels of sequence divergence at 16S and 12S ribosomal RNA sequences within Mandarina are 18.7% and 17.7%, respectively, and this suggests that evolution of mtDNA of Mandarina is extremely rapid, more than 20 times faster than the standard rate in other animals. The present examination reveals that evolution of morphological and ecological traits occurs at extremely high rates in the time of adaptive radiation, especially in fragmented environments. © 1999 The Society for the Study of Evolution.
Pyrosequencing for Microbial Identification and Characterization
Cummings, Patrick J.; Ahmed, Ray; Durocher, Jeffrey A.; Jessen, Adam; Vardi, Tamar; Obom, Kristina M.
2013-01-01
Pyrosequencing is a versatile technique that facilitates microbial genome sequencing that can be used to identify bacterial species, discriminate bacterial strains and detect genetic mutations that confer resistance to anti-microbial agents. The advantages of pyrosequencing for microbiology applications include rapid and reliable high-throughput screening and accurate identification of microbes and microbial genome mutations. Pyrosequencing involves sequencing of DNA by synthesizing the complementary strand a single base at a time, while determining the specific nucleotide being incorporated during the synthesis reaction. The reaction occurs on immobilized single stranded template DNA where the four deoxyribonucleotides (dNTP) are added sequentially and the unincorporated dNTPs are enzymatically degraded before addition of the next dNTP to the synthesis reaction. Detection of the specific base incorporated into the template is monitored by generation of chemiluminescent signals. The order of dNTPs that produce the chemiluminescent signals determines the DNA sequence of the template. The real-time sequencing capability of pyrosequencing technology enables rapid microbial identification in a single assay. In addition, the pyrosequencing instrument can analyze the full genetic diversity of anti-microbial drug resistance, including typing of SNPs, point mutations, insertions, and deletions, as well as quantification of multiple gene copies that may occur in some anti-microbial resistance patterns. PMID:23995536
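The statement that the order of signal-producing dNTPs determines the sequence can be made concrete with a short Python sketch. It assumes an idealized readout in which each signal is roughly proportional to the number of identical bases incorporated in that dispensation; the function name `decode_pyrogram`, the dispensation order and the signal values are hypothetical, and real instruments need noise handling that is not shown here.

```python
def decode_pyrogram(dispensation_order, signals):
    """Reconstruct the synthesized strand from per-dispensation signals.

    dispensation_order: string of bases in the order the dNTPs were added.
    signals: one light intensity per dispensation, assumed (idealized) to be
             close to the number of bases incorporated in that step.
    """
    synthesized = []
    for base, signal in zip(dispensation_order, signals):
        # No signal means the base was not incorporated; a signal near 2
        # means two identical bases in a row were incorporated, and so on.
        synthesized.append(base * round(signal))
    return "".join(synthesized)


# Hypothetical run: dNTPs dispensed cyclically as T, C, A, G.
order = "TCAGTCAG"
signals = [1.0, 0.0, 2.1, 0.9, 0.0, 1.1, 0.0, 0.0]
synthesized = decode_pyrogram(order, signals)

# The template is the reverse complement of the strand that was synthesized.
template = synthesized[::-1].translate(str.maketrans("ACGT", "TGCA"))
print(synthesized, template)
```

Rounding each signal to the nearest integer is the simplest possible base-calling rule and is used only to keep the example short.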
Pyrosequencing for microbial identification and characterization.
Cummings, Patrick J; Ahmed, Ray; Durocher, Jeffrey A; Jessen, Adam; Vardi, Tamar; Obom, Kristina M
2013-08-22
Pyrosequencing is a versatile technique that facilitates microbial genome sequencing that can be used to identify bacterial species, discriminate bacterial strains and detect genetic mutations that confer resistance to anti-microbial agents. The advantages of pyrosequencing for microbiology applications include rapid and reliable high-throughput screening and accurate identification of microbes and microbial genome mutations. Pyrosequencing involves sequencing of DNA by synthesizing the complementary strand a single base at a time, while determining the specific nucleotide being incorporated during the synthesis reaction. The reaction occurs on immobilized single-stranded template DNA where the four deoxyribonucleotides (dNTP) are added sequentially and the unincorporated dNTPs are enzymatically degraded before addition of the next dNTP to the synthesis reaction. Detection of the specific base incorporated into the template is monitored by generation of chemiluminescent signals. The order of dNTPs that produce the chemiluminescent signals determines the DNA sequence of the template. The real-time sequencing capability of pyrosequencing technology enables rapid microbial identification in a single assay. In addition, the pyrosequencing instrument can analyze the full genetic diversity of anti-microbial drug resistance, including typing of SNPs, point mutations, insertions, and deletions, as well as quantification of multiple gene copies that may occur in some anti-microbial resistance patterns.
Gámez-Díaz, Laura; Sigmund, Elena C; Reiser, Veronika; Vach, Werner; Jung, Sophie; Grimbacher, Bodo
2018-01-01
The diagnosis of lipopolysaccharide-responsive beige-like-anchor-protein (LRBA) deficiency currently relies on gene sequencing approaches that do not support a timely diagnosis and clinical management. We developed a rapid and sensitive test for clinical implementation based on the detection of LRBA protein by flow cytometry in peripheral blood cells after stimulation. LRBA protein was assessed in a prospective cohort of 54 healthy donors and 57 patients suspected of LRBA deficiency. Receiver operating characteristics analysis suggested an LRBA:MFI ratio cutoff point of 2.6 to identify LRBA-deficient patients by FACS with 94% sensitivity and 80% specificity and to discriminate them from patients with a similar clinical picture but other disease-causing mutations. This easy flow cytometry-based assay allows a fast screening of patients with suspicion of LRBA deficiency reducing therefore the number of patients requiring LRBA sequencing and accelerating the treatment implementation. Detection of biallelic mutations in LRBA is however required for a definitive diagnosis.
Nanopore Sequencing as a Rapidly Deployable Ebola Outbreak Tool.
Hoenen, Thomas; Groseth, Allison; Rosenke, Kyle; Fischer, Robert J; Hoenen, Andreas; Judson, Seth D; Martellaro, Cynthia; Falzarano, Darryl; Marzi, Andrea; Squires, R Burke; Wollenberg, Kurt R; de Wit, Emmie; Prescott, Joseph; Safronetz, David; van Doremalen, Neeltje; Bushmaker, Trenton; Feldmann, Friederike; McNally, Kristin; Bolay, Fatorma K; Fields, Barry; Sealy, Tara; Rayfield, Mark; Nichol, Stuart T; Zoon, Kathryn C; Massaquoi, Moses; Munster, Vincent J; Feldmann, Heinz
2016-02-01
Rapid sequencing of RNA/DNA from pathogen samples obtained during disease outbreaks provides critical scientific and public health information. However, challenges exist for exporting samples to laboratories or establishing conventional sequencers in remote outbreak regions. We successfully used a novel, pocket-sized nanopore sequencer at a field diagnostic laboratory in Liberia during the current Ebola virus outbreak.
Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C
2012-01-01
The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).
A rapid, one step molecular identification of Trichoderma citrinoviride and Trichoderma reesei.
Saroj, Dina B; Dengeti, Shrinivas N; Aher, Supriya; Gupta, Anil K
2015-06-01
Trichoderma species are widely used as production hosts for industrial enzymes. Identification of Trichoderma species requires a complex molecular biology based identification involving amplification and sequencing of multiple genes. Industrial laboratories are required to run identification tests repeatedly in cell banking procedures and also to prove absence of production host in the product. Such demands can be fulfilled by a brief method which enables confirmation of strain identity. This communication describes a one-step identification method for two common Trichoderma species, T. citrinoviride and T. reesei, based on identification of a polymorphic region in the nucleotide sequence of translation elongation factor 1 alpha. A unique forward primer and a common reverse primer resulted in 153 and 139 bp amplicons for T. citrinoviride and T. reesei, respectively. Simplification was further introduced by using mycelium as template for PCR amplification. The method described in this communication allows rapid, one-step identification of two Trichoderma species.
Fasihi, Yasser; Fooladi, Saba; Mohammadi, Mohammad Ali; Emaneini, Mohammad; Kalantar-Neyestanaki, Davood
2017-09-06
Molecular typing is an important tool for control and prevention of infection. A suitable molecular typing method for epidemiological investigation must be easy to perform, highly reproducible, inexpensive, rapid and easy to interpret. In this study, two molecular typing methods, the conventional PCR-sequencing method and high resolution melting (HRM) analysis, were used for staphylococcal protein A (spa) typing of 30 Methicillin-resistant Staphylococcus aureus (MRSA) isolates recovered from clinical samples. Based on the PCR-sequencing results, 16 different spa types were identified among the 30 MRSA isolates. Of these 16 spa types, 14 were separated by the HRM method. Two spa types, t4718 and t2894, were not separated from each other. According to our results, spa typing based on HRM analysis is very rapid, easy to perform and cost-effective, but this method must be standardized for different regions, spa types, and real-time machinery.
Hausmann, Axel; Cancian de Araujo, Bruno; Sutrisno, Hari; Peggie, Djunijanti; Schmidt, Stefan
2017-01-01
Here we present a general collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, and a comparison with alternative preserving and vouchering methods. About 98% of the sequenced specimens processed using the present collecting and preparation protocol yielded sequences with more than 500 base pairs. The study is based on the first outcomes of the Indonesian Biodiversity Discovery and Information System (IndoBioSys). IndoBioSys is a German-Indonesian research project that is conducted by the Museum für Naturkunde in Berlin and the Zoologische Staatssammlung München, in close cooperation with the Research Center for Biology – Indonesian Institute of Sciences (RCB-LIPI, Bogor). PMID:29134041
A powerful graphical pulse sequence programming tool for magnetic resonance imaging.
Jie, Shen; Ying, Liu; Jianqi, Li; Gengying, Li
2005-12-01
A powerful graphical pulse sequence programming tool has been designed for creating magnetic resonance imaging (MRI) applications. It allows rapid development of pulse sequences in graphical mode (allowing for the visualization of sequences), and consists of three modules: a graphical sequence editor, a parameter management module and a sequence compiler. Its key features are ease of use, flexibility and hardware independence. When graphic elements are combined with certain text expressions, graphical pulse sequence programming is as flexible as a text-based programming tool. In addition, a hardware-independent design is implemented by using a two-step compilation strategy. To demonstrate the flexibility and the capability of this graphical sequence programming tool, a multi-slice fast spin echo experiment is performed on our home-made 0.3 T permanent magnet MRI system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, Jack A.; Quinn, Robert A.; Debelius, Justine
Rapid advances in DNA sequencing, metabolomics, proteomics and computation dramatically increase accessibility of microbiome studies and identify links between the microbiome and disease. Microbial time-series and multiple molecular perspectives enable Microbiome-Wide Association Studies (MWAS), analogous to Genome-Wide Association Studies (GWAS). Rapid research advances point towards actionable results, although approved clinical tests based on MWAS are still in the future. Appreciating the complexity of interactions between diet, chemistry, health and the microbiome, and determining the frequency of observations needed to capture and integrate this dynamic interface, is paramount for addressing the need for personalized and precision microbiome-based diagnostics and therapies.
Tabata, Ryo; Kamiya, Takehiro; Shigenobu, Shuji; Yamaguchi, Katsushi; Yamada, Masashi; Hasebe, Mitsuyasu; Fujiwara, Toru; Sawa, Shinichiro
2013-01-01
Next-generation sequencing (NGS) technologies enable the rapid production of an enormous quantity of sequence data. These powerful new technologies allow the identification of mutations by whole-genome sequencing. However, most reported NGS-based mapping methods, which are based on bulked segregant analysis, are costly and laborious. To address these limitations, we designed a versatile NGS-based mapping method that consists of a combination of low- to medium-coverage multiplex SOLiD (Sequencing by Oligonucleotide Ligation and Detection) and classical genetic rough mapping. Using only low to medium coverage reduces the SOLiD sequencing costs and, since just 10 to 20 mutant F2 plants are required for rough mapping, the operation is simple enough to handle in a laboratory with limited space and funding. As a proof of principle, we successfully applied this method to identify the CTR1, which is involved in boron-mediated root development, from among a population of high boron requiring Arabidopsis thaliana mutants. Our work demonstrates that this NGS-based mapping method is a moderately priced and versatile method that can readily be applied to other model organisms. PMID:23104114
Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F
2007-03-01
In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
rpoB-Based Identification of Nonpigmented and Late-Pigmenting Rapidly Growing Mycobacteria
Adékambi, Toïdi; Colson, Philippe; Drancourt, Michel
2003-01-01
Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor intensive work and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software which analyzes and graphically represents variability stretches of 60 bp along the nucleotide sequence, our analysis focused on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence yielded a high genetic heterogeneity within the clinical isolates. 
We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM. PMID:14662964
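The identification rule reported in this abstract (a clinical isolate is assigned to a species when its partial rpoB sequence diverges by less than 2% from a reference) can be illustrated with a short sketch. It assumes the query and reference sequences are already aligned and of equal length, and it counts only substitutions; the function names and the toy sequences are hypothetical and are not the authors' in-house software.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of mismatched positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return 100.0 * mismatches / len(seq_a)


def identify(query, references, max_divergence=2.0):
    """Return (species, divergence) for the closest reference sequence.

    The species is None when even the closest reference is not below the
    threshold (2% by default, the cutoff quoted in the abstract).
    """
    best_name, best_div = min(
        ((name, percent_divergence(query, ref)) for name, ref in references.items()),
        key=lambda item: item[1],
    )
    if best_div < max_divergence:
        return best_name, best_div
    return None, best_div


if __name__ == "__main__":
    # Toy 100-base "references"; real use would align the 723-bp rpoB region.
    references = {"species_A": "A" * 100, "species_B": "A" * 90 + "C" * 10}
    print(identify("A" * 99 + "G", references))      # ('species_A', 1.0)
    print(identify("A" * 95 + "G" * 5, references))  # (None, 5.0)
```

A production workflow would also need a real alignment step and a policy for gaps and ambiguous bases, which this sketch deliberately leaves out.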
April, Michael D; Arana, Allyson; Pallin, Daniel J; Schauer, Steven G; Fantegrossi, Andrea; Fernandez, Jessie; Maddry, Joseph K; Summers, Shane M; Antonacci, Mark A; Brown, Calvin A
2018-05-07
Although both succinylcholine and rocuronium are used to facilitate emergency department (ED) rapid sequence intubation, the difference in intubation success rate between them is unknown. We compare first-pass intubation success between ED rapid sequence intubation facilitated by succinylcholine versus rocuronium. We analyzed prospectively collected data from the National Emergency Airway Registry, a multicenter registry collecting data on all intubations performed in 22 EDs. We included intubations of patients older than 14 years who received succinylcholine or rocuronium during 2016. We compared the first-pass intubation success between patients receiving succinylcholine and those receiving rocuronium. We also compared the incidence of adverse events (cardiac arrest, dental trauma, direct airway injury, dysrhythmias, epistaxis, esophageal intubation, hypotension, hypoxia, iatrogenic bleeding, laryngoscope failure, laryngospasm, lip laceration, main-stem bronchus intubation, malignant hyperthermia, medication error, pharyngeal laceration, pneumothorax, endotracheal tube cuff failure, and vomiting). We conducted subgroup analyses stratified by paralytic weight-based dose. There were 2,275 rapid sequence intubations facilitated by succinylcholine and 1,800 by rocuronium. Patients receiving succinylcholine were younger and more likely to undergo intubation with video laryngoscopy and by more experienced providers. First-pass intubation success rate was 87.0% with succinylcholine versus 87.5% with rocuronium (adjusted odds ratio 0.9; 95% confidence interval 0.6 to 1.3). The incidence of any adverse event was also comparable between these agents: 14.7% for succinylcholine versus 14.8% for rocuronium (adjusted odds ratio 1.1; 95% confidence interval 0.9 to 1.3). We observed similar results when they were stratified by paralytic weight-based dose. 
In this large observational series, we did not detect an association between paralytic choice and first-pass rapid sequence intubation success or peri-intubation adverse events. Copyright © 2018 American College of Emergency Physicians. Published by Elsevier Inc. All rights reserved.
Mutation detection using automated fluorescence-based sequencing.
Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju
2008-04-01
The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Lane, Todd
2018-05-18
Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
DOE Office of Scientific and Technical Information (OSTI.GOV)
FitzGerald, Michael
2012-06-01
Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lane, Todd
2012-06-01
Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
FitzGerald, Michael
2018-01-11
Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
Rapid Threat Organism Recognition Pipeline
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Kelly P.; Solberg, Owen D.; Schoeniger, Joseph S.
2013-05-07
The RAPTOR computational pipeline identifies microbial nucleic acid sequences present in sequence data from clinical samples. It takes as input raw short-read genomic sequence data (in particular, the type generated by the Illumina sequencing platforms) and outputs taxonomic evaluation of detected microbes in various human-readable formats. This software was designed to assist in the diagnosis or characterization of infectious disease, by detecting pathogen sequences in nucleic acid sequence data from clinical samples. It has also been applied in the detection of algal pathogens, when algal biofuel ponds became unproductive. RAPTOR first trims and filters genomic sequence reads based on quality and related considerations, then performs a quick alignment to the human (or other host) genome to filter out host sequences, then performs a deeper search against microbial genomes. Alignment to a protein sequence database is optional. Alignment results are summarized and placed in a taxonomic framework using the Lowest Common Ancestor algorithm.
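The final step mentioned in this record, placing alignment results in a taxonomy with a Lowest Common Ancestor (LCA) rule, can be sketched in a few lines. This is a generic illustration of the LCA idea and not RAPTOR's own code: the parent table is a deliberately simplified toy taxonomy that skips ranks, and the function names are hypothetical.

```python
# Simplified toy taxonomy (child -> parent); intermediate ranks are omitted.
TOY_PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Escherichia coli": "Escherichia",
    "Escherichia albertii": "Escherichia",
    "Salmonella": "Proteobacteria",
    "Salmonella enterica": "Salmonella",
}


def lineage(taxon, parent):
    """Return the path from the root down to the given taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parent[taxon]
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Deepest taxon shared by the lineages of all taxa a read aligned to."""
    lineages = [lineage(taxon, parent) for taxon in taxa]
    lca = None
    # Walk the lineages level by level from the root until they disagree.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


if __name__ == "__main__":
    print(lowest_common_ancestor(["Escherichia coli", "Escherichia albertii"], TOY_PARENT))
    # Escherichia
    print(lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"], TOY_PARENT))
    # Proteobacteria
```

The effect is that a read matching several closely related genomes is reported at the most specific rank they all share, rather than being assigned arbitrarily to one of them.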
Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E
2008-01-15
To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.
2008-01-01
To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
Nanopore Sequencing as a Rapidly Deployable Ebola Outbreak Tool
Groseth, Allison; Rosenke, Kyle; Fischer, Robert J.; Hoenen, Andreas; Judson, Seth D.; Martellaro, Cynthia; Falzarano, Darryl; Marzi, Andrea; Squires, R. Burke; Wollenberg, Kurt R.; de Wit, Emmie; Prescott, Joseph; Safronetz, David; van Doremalen, Neeltje; Bushmaker, Trenton; Feldmann, Friederike; McNally, Kristin; Bolay, Fatorma K.; Fields, Barry; Sealy, Tara; Rayfield, Mark; Nichol, Stuart T.; Zoon, Kathryn C.; Massaquoi, Moses; Munster, Vincent J.; Feldmann, Heinz
2016-01-01
Rapid sequencing of RNA/DNA from pathogen samples obtained during disease outbreaks provides critical scientific and public health information. However, challenges exist for exporting samples to laboratories or establishing conventional sequencers in remote outbreak regions. We successfully used a novel, pocket-sized nanopore sequencer at a field diagnostic laboratory in Liberia during the current Ebola virus outbreak. PMID:26812583
LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slezak, T; Borucki, M; Lam, M
Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.
Functional interrogation of non-coding DNA through CRISPR genome editing.
Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H
2017-05-15
Methodologies to interrogate non-coding regions have lagged behind those for coding regions, even though non-coding regions comprise the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.
Chaitankar, Vijender; Karakülah, Gökhan; Ratnapriya, Rinki; Giuste, Felipe O.; Brooks, Matthew J.; Swaroop, Anand
2016-01-01
The advent of high throughput next generation sequencing (NGS) has accelerated the pace of discovery of disease-associated genetic variants and genomewide profiling of expressed sequences and epigenetic marks, thereby permitting systems-based analyses of ocular development and disease. Rapid evolution of NGS and associated methodologies presents significant challenges in acquisition, management, and analysis of large data sets and for extracting biologically or clinically relevant information. Here we illustrate the basic design of commonly used NGS-based methods, specifically whole exome sequencing, transcriptome, and epigenome profiling, and provide recommendations for data analyses. We briefly discuss systems biology approaches for integrating multiple data sets to elucidate gene regulatory or disease networks. While we provide examples from the retina, the NGS guidelines reviewed here are applicable to other tissues/cell types as well. PMID:27297499
Oligo/Polynucleotide-Based Gene Modification: Strategies and Therapeutic Potential
Sargent, R. Geoffrey; Kim, Soya
2011-01-01
Oligonucleotide- and polynucleotide-based gene modification strategies were developed as an alternative to transgene-based and classical gene targeting-based gene therapy approaches for treatment of genetic disorders. Unlike the transgene-based strategies, oligo/polynucleotide gene targeting approaches maintain gene integrity and the relationship between the protein coding and gene-specific regulatory sequences. Oligo/polynucleotide-based gene modification also has several advantages over classical vector-based homologous recombination approaches. These include essentially complete homology to the target sequence and the potential to rapidly engineer patient-specific oligo/polynucleotide gene modification reagents. Several oligo/polynucleotide-based approaches have been shown to successfully mediate sequence-specific modification of genomic DNA in mammalian cells. The strategies involve the use of polynucleotide small DNA fragments, triplex-forming oligonucleotides, and single-stranded oligodeoxynucleotides to mediate homologous exchange. The primary focus of this review will be on the mechanistic aspects of the small fragment homologous replacement, triplex-forming oligonucleotide-mediated, and single-stranded oligodeoxynucleotide-mediated gene modification strategies as they relate to their therapeutic potential. PMID:21417933
Identification of Bacterial Populations in Drinking Water Using 16S rRNA-Based Sequence Analyses
Intracellular RNA is rapidly degraded in stressed cells and is more unstable outside of the cell than DNA. As a result, RNA-based methods have been suggested to study the active microbial fraction in environmental matrices. The aim of this study was to identify bacterial populati...
Rapid Detection & Identification of Bacillus Species using MALDI-TOF/TOF and Biomarker Database
2006-06-01
rRNA sequence analysis. Multilocus enzyme electrophoresis (MEE) and comparative DNA sequence analysis suggest that they may represent a single species...adaptation of the MEE method [63] but with greater discrimination [64]. All of these new PCR-based subtyping methods are certainly superior and more...Demirev, P.A., Lin, J.S., Pineda, F.J., and Fenselau, C. (2001). Bioinformatics and mass spectrometry for microorganism identification: proteome-wide
Site-Specific Pyrolysis Induced Cleavage at Aspartic Acid Residue in Peptides and Proteins
Zhang, Shaofeng; Basile, Franco
2011-01-01
A simple and site-specific non-enzymatic method based on pyrolysis has been developed to cleave peptides and proteins. Pyrolytic cleavage was found to be specific and rapid as it induced a cleavage at the C-terminal side of aspartic acid in the temperature range of 220–250 °C in 10 seconds. Electrospray Ionization (ESI) mass spectrometry (MS) and tandem-MS (MS/MS) were used to characterize and identify pyrolysis cleavage products, confirming that sequence information is conserved after the pyrolysis process in both peptides and protein tested. This suggests that pyrolysis-induced cleavage at aspartyl residues can be used as a rapid protein digestion procedure for the generation of sequence specific protein biomarkers. PMID:17388620
Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M.; Brown, Eric W.; Timme, Ruth
2016-01-01
The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. PMID:27008877
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
Recent developments in detection and enumeration of waterborne bacteria: a retrospective minireview.
Deshmukh, Rehan A; Joshi, Kopal; Bhand, Sunil; Roy, Utpal
2016-12-01
Waterborne diseases have emerged as global health problems and their rapid and sensitive detection in environmental water samples is of great importance. Bacterial identification and enumeration in water samples is significant as it helps to maintain safe drinking water for public consumption. Culture-based methods are laborious, time-consuming, and yield false-positive results, whereas viable but nonculturable (VBNCs) microorganisms cannot be recovered. Hence, numerous methods have been developed for rapid detection and quantification of waterborne pathogenic bacteria in water. These rapid methods can be classified into nucleic acid-based, immunology-based, and biosensor-based detection methods. This review summarizes the principle and current state of rapid methods for the monitoring and detection of waterborne bacterial pathogens. Rapid methods outlined are polymerase chain reaction (PCR), digital droplet PCR, real-time PCR, multiplex PCR, DNA microarray, Next-generation sequencing (pyrosequencing, Illumina technology and genomics), and fluorescence in situ hybridization that are categorized as nucleic acid-based methods. Enzyme-linked immunosorbent assay (ELISA) and immunofluorescence are classified into immunology-based methods. Optical, electrochemical, and mass-based biosensors are grouped into biosensor-based methods. Overall, these methods are sensitive, specific, time-effective, and important in prevention and diagnosis of waterborne bacterial diseases. © 2016 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
Zhang, Wei; Zhang, Xiaolong; Qiang, Yan; Tian, Qi; Tang, Xiaoxian
2017-01-01
The fast and accurate segmentation of lung nodule image sequences is the basis of subsequent processing and diagnostic analyses. However, previous research investigating nodule segmentation algorithms cannot entirely segment cavitary nodules, and the segmentation of juxta-vascular nodules is inaccurate and inefficient. To solve these problems, we propose a new method for the segmentation of lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise (DBSCAN). First, our method uses three-dimensional computed tomography image features of the average intensity projection combined with multi-scale dot enhancement for preprocessing. Hexagonal clustering and morphological optimized sequential linear iterative clustering (HMSLIC) for sequence image oversegmentation is then proposed to obtain superpixel blocks. The adaptive weight coefficient is then constructed to calculate the distance required between superpixels to achieve precise lung nodules positioning and to obtain the subsequent clustering starting block. Moreover, by fitting the distance and detecting the change in slope, an accurate clustering threshold is obtained. Thereafter, a fast DBSCAN superpixel sequence clustering algorithm, which is optimized by the strategy of only clustering the lung nodules and adaptive threshold, is then used to obtain lung nodule mask sequences. Finally, the lung nodule image sequences are obtained. The experimental results show that our method rapidly, completely and accurately segments various types of lung nodule image sequences. PMID:28880916
2013-01-01
Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892
Yang, Fang; Chia, Nicholas; White, Bryan A; Schook, Lawrence B
2013-04-23
Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets.
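The idea of approximating shared information with a compressor can be illustrated with a short sketch. The abstract does not give the CBD formula, so the code below uses the widely known normalized compression distance (NCD) as a stand-in, with zlib as the compressor; the function names and the three toy tag datasets are hypothetical and are not taken from the study.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumption: this is a generic stand-in for a compression-based
    distance, not the published CBD definition.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical tag datasets: one read per line.
sample_a = b"\n".join([b"ACGTACGTGGCCTTAA"] * 60)
sample_b = b"\n".join([b"ACGTACGTGGCCTTAA"] * 55 + [b"ACGTACGTGGCCTTAC"] * 5)
sample_c = b"\n".join(
    [b"TTGACCGATAGGCATC", b"GGATCCAATTCGCGTA", b"CATGCATGAAACCCGG"] * 20
)

# Similar communities should score lower than dissimilar ones.
print("a vs b:", ncd(sample_a, sample_b))
print("a vs c:", ncd(sample_a, sample_c))
```

No alignment or tree is built here: the only inputs are the raw tag sets, which is the property the abstract emphasizes. A real analysis would likely need a stronger compressor and much larger datasets than this toy example.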
Lager, Malin; Mernelius, Sara; Löfgren, Sture; Söderman, Jan
2016-01-01
Healthcare-associated infections caused by Escherichia coli and antibiotic resistance due to extended-spectrum beta-lactamase (ESBL) production constitute a threat to patient safety. To identify, track, and control outbreaks and to detect emerging virulent clones, typing tools of sufficient discriminatory power that generate reproducible and unambiguous data are needed. A probe-based real-time PCR method targeting multiple single nucleotide polymorphisms (SNP) was developed. The method was based on the multilocus sequence typing scheme of Institut Pasteur and on adaptation of previously described typing assays. An 8-SNP panel that reached a Simpson's diversity index of 0.95 was established, based on analysis of sporadic E. coli cases (ESBL n = 27 and non-ESBL n = 53). This multi-SNP assay was used to identify the sequence type 131 (ST131) complex according to Achtman's multilocus sequence typing scheme. However, it did not fully discriminate within the complex but provided a diagnostic signature that outperformed a previously described detection assay. Pulsed-field gel electrophoresis typing of isolates from a presumed outbreak (n = 22) identified two outbreaks (ST127 and ST131) and three different non-outbreak-related isolates. Multi-SNP typing generated congruent data except for one non-outbreak-related ST131 isolate. We consider multi-SNP real-time PCR typing an accessible primary generic E. coli typing tool for rapid and uniform type identification.
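For readers unfamiliar with the diversity index quoted above, the following sketch shows how such a value can be computed from a list of assigned types. It assumes the unbiased form commonly used to compare typing schemes, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)); the abstract does not say which variant was used, and the function name and example labels are hypothetical.

```python
from collections import Counter


def simpson_diversity(types):
    """Probability that two isolates drawn without replacement differ in type.

    Assumption: unbiased Simpson form, as often used for typing schemes.
    """
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical SNP-profile labels for six isolates.
profiles = ["P1", "P1", "P2", "P3", "P4", "P4"]
print(simpson_diversity(profiles))
```

A value near 1 means the panel separates almost every pair of isolates; a value of 0 means all isolates share one type.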
McDonagh, Laura; Thornton, Chris; Wallman, James F; Stevens, Jamie R
2009-06-01
In this study we examine the limitations of currently used sequence-based approaches to blowfly (Calliphoridae) identification and evaluate the utility of an immunological approach to discriminate between blowfly species of forensic importance. By investigating antigenic similarity and dissimilarity between the first instar larval stages of four forensically important blowfly species, we have been able to identify immunoreactive proteins of potential use in the development of species-specific immuno-diagnostic tests. Here we outline our protein-based approach to species determination, and describe how it may be adapted to develop rapid diagnostic assays for the 'on-site' identification of blowfly species.
Fior, Simone; Li, Mingai; Oxelman, Bengt; Viola, Roberto; Hodges, Scott A; Ometto, Lino; Varotto, Claudio
2013-04-01
Aquilegia is a well-known model system in the field of evolutionary biology, but obtaining a resolved and well-supported phylogenetic reconstruction for the genus has been hindered by its recent and rapid diversification. Here, we applied 454 next-generation sequencing to PCR amplicons of 21 of the most rapidly evolving regions of the plastome to generate c. 24 kb of sequences from each of 84 individuals from throughout the genus. The resulting phylogeny has well-supported resolution of the main lineages of the genus, although recent diversification such as in the European taxa remains unresolved. By producing a chronogram of the whole Ranunculaceae family based on published data, we inferred calibration points for dating the Aquilegia radiation. The genus originated in the upper Miocene c. 6.9 million yr ago (Ma) in Eastern Asia, and diversification occurred c. 4.8 Ma with the split of two main clades, one colonizing North America, and the other Western Eurasia through the mountains of Central Asia. This was followed by a back-to-Asia migration, originating from the European stock using a North Asian route. These results provide the first backbone phylogeny and spatiotemporal reconstruction of the Aquilegia radiation, and constitute a robust framework to address the adaptative nature of speciation within the group. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
PhAST: pharmacophore alignment search tool.
Hähnke, Volker; Hofmann, Bettina; Grgat, Tomislav; Proschak, Ewgenij; Steinhilber, Dieter; Schneider, Gisbert
2009-04-15
We present a ligand-based virtual screening technique (PhAST) for rapid hit and lead structure searching in large compound databases. Molecules are represented as strings encoding the distribution of pharmacophoric features on the molecular graph. In contrast to other text-based methods using SMILES strings, we introduce a new form of text representation that describes the pharmacophore of molecules. This string representation opens the opportunity for revealing functional similarity between molecules by sequence alignment techniques in analogy to homology searching in protein or nucleic acid sequence databases. We favorably compared PhAST with other current ligand-based virtual screening methods in a retrospective analysis using the BEDROC metric. In a prospective application, PhAST identified two novel inhibitors of 5-lipoxygenase product formation with minimal experimental effort. This outcome demonstrates the applicability of PhAST to drug discovery projects and provides an innovative concept of sequence-based compound screening with substantial scaffold hopping potential. 2008 Wiley Periodicals, Inc.
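The notion of comparing molecules by aligning feature strings can be sketched with a plain global alignment score. This is a generic Needleman-Wunsch scorer with unit costs, not PhAST's scoring scheme; the one-letter feature alphabet (for example A for acceptor, D for donor, H for hydrophobic) and the scoring parameters are assumptions made only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Hypothetical pharmacophore strings for a query and two database molecules.
query = "ADHHA"
for candidate in ("ADHA", "HHDDH"):
    print(candidate, global_alignment_score(query, candidate))
```

Ranking database strings by such a score is the string-level analogue of homology searching; a production tool would use a feature-specific substitution matrix and tuned gap penalties.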
Yu, Li; Li, Yi-Wei; Ryder, Oliver A; Zhang, Ya-Ping
2007-10-24
Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family represents a typical example of rapid evolutionary radiation. Previous analyses with a single mitochondrial (mt) gene or a small number of mt genes either provide weak support or a large unresolved polytomy for ursids. We revisit the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events. This mitochondrial genome-based phylogeny provides strong evidence that the spectacled bear diverged first, while within the genus Ursus, the sloth bear is the sister taxon of all the other five ursines. The latter group is divided into the brown bear/polar bear and the two black bears/sun bear assemblages. These findings resolve the previous conflicts between trees using partial mt genes. The ability of different categories of mt protein coding genes to recover the correct phylogeny is concordant with previous analyses for taxa with deep divergence times. This study provides a robust Ursidae phylogenetic framework for future validation by additional independent evidence, and also has significant implications for assisting in the resolution of other similarly difficult phylogenetic investigations. Identification of base composition bias and utilization of the combined data of whole mitochondrial genome sequences has allowed recovery of a strongly supported phylogeny that is upheld when using multiple alternative outgroups for the Ursidae, a mammalian family that underwent a rapid radiation since the mid- to late Pliocene. It remains to be seen if the reliability of mt genome analysis will hold up in studies of other difficult phylogenetic issues. 
Although the whole mitochondrial DNA sequence based phylogeny is robust, it remains in conflict with phylogenetic relationships suggested by analysis of limited nuclear-encoded data, a situation that will require gathering more nuclear DNA sequence information.
Yu, Li; Li, Yi-Wei; Ryder, Oliver A; Zhang, Ya-Ping
2007-01-01
Background Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family represents a typical example of rapid evolutionary radiation. Previous analyses with a single mitochondrial (mt) gene or a small number of mt genes either provide weak support or a large unresolved polytomy for ursids. We revisit the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events. Results This mitochondrial genome-based phylogeny provides strong evidence that the spectacled bear diverged first, while within the genus Ursus, the sloth bear is the sister taxon of all the other five ursines. The latter group is divided into the brown bear/polar bear and the two black bears/sun bear assemblages. These findings resolve the previous conflicts between trees using partial mt genes. The ability of different categories of mt protein coding genes to recover the correct phylogeny is concordant with previous analyses for taxa with deep divergence times. This study provides a robust Ursidae phylogenetic framework for future validation by additional independent evidence, and also has significant implications for assisting in the resolution of other similarly difficult phylogenetic investigations. Conclusion Identification of base composition bias and utilization of the combined data of whole mitochondrial genome sequences has allowed recovery of a strongly supported phylogeny that is upheld when using multiple alternative outgroups for the Ursidae, a mammalian family that underwent a rapid radiation since the mid- to late Pliocene. 
It remains to be seen if the reliability of mt genome analysis will hold up in studies of other difficult phylogenetic issues. Although the whole mitochondrial DNA sequence based phylogeny is robust, it remains in conflict with phylogenetic relationships suggested by analysis of limited nuclear-encoded data, a situation that will require gathering more nuclear DNA sequence information. PMID:17956639
Drakatos, Panagis; Kosky, Christopher A; Higgins, Sean E; Muza, Rexford T; Williams, Adrian J; Leschziner, Guy D
2013-09-01
Discrimination between narcolepsy, idiopathic hypersomnia, and behavior-induced inadequate sleep syndrome (BIISS) is based on clinical features and on specific nocturnal polysomnography (NPSG) and multiple sleep latency test (MSLT) results. However, previous studies have cast doubt on the specificity and sensitivity of these diagnostic tools. Eleven variables of the NPSG were analyzed in 101 patients who were retrospectively diagnosed with narcolepsy with cataplexy (N+C) (n=24), narcolepsy without cataplexy (N-C) (n=38), idiopathic hypersomnia with long sleep period (IHL) (n=21), and BIISS (n=18). Fifteen out of 24 N+C and 8 out of 38 N-C entered the first rapid eye movement (REM) sleep period (FREMP) from sleep stage 1 (N1) or wake (W), though this sleep-stage sequence did not arise in the other patient groups. FREMP stage sequence was a function of REM sleep latency (REML) for both N+C and N-C groups. FREMP stage sequence was not associated with mean sleep latency (MSL) in N+C but was associated in N-C, which implies heterogeneity within the N-C group. REML also was a useful discriminator. Depending on the cutoff period, REML had a sensitivity and specificity of up to 85.5% and 97.4%, respectively. The FREMP stage sequence may be a useful tool in the diagnosis of narcolepsy, particularly in conjunction with sleep-stage sequence analysis of sleep-onset REM periods (SOREMPs) in the MSLT; it also may provide a helpful intermediate phenotype in the clarification of heterogeneity in the N-C diagnostic group. However, larger prospective studies are necessary to confirm these findings. Copyright © 2013 Elsevier B.V. All rights reserved.
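The dependence of sensitivity and specificity on the chosen cutoff can be made concrete with a small sketch. It assumes that a REM sleep latency at or below the cutoff is read as a positive result; the function name, the latencies, and the diagnoses are hypothetical and are not data from the study.

```python
def sensitivity_specificity(latencies_min, has_narcolepsy, cutoff_min):
    """Return (sensitivity, specificity) for the rule 'latency <= cutoff'."""
    tp = fn = tn = fp = 0
    for latency, positive in zip(latencies_min, has_narcolepsy):
        predicted = latency <= cutoff_min
        if positive and predicted:
            tp += 1
        elif positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical REM latencies (minutes) and diagnoses for six patients.
latencies = [5, 12, 70, 90, 110, 30]
diagnoses = [True, True, True, False, False, False]
for cutoff in (15, 45, 80):
    print(cutoff, sensitivity_specificity(latencies, diagnoses, cutoff))
```

Raising the cutoff captures more true cases but also more false positives, which is why the abstract reports its best figures as "depending on the cutoff period".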
DNA extraction for streamlined metagenomics of diverse environmental samples.
Marotz, Clarisse; Amir, Amnon; Humphrey, Greg; Gaffney, James; Gogul, Grant; Knight, Rob
2017-06-01
A major bottleneck for metagenomic sequencing is rapid and efficient DNA extraction. Here, we compare the extraction efficiencies of three magnetic bead-based platforms (KingFisher, epMotion, and Tecan) to a standardized column-based extraction platform across a variety of sample types, including feces, oral, skin, soil, and water. Replicate sample plates were extracted and prepared for 16S rRNA gene amplicon sequencing in parallel to assess extraction bias and DNA quality. The data demonstrate that any effect of extraction method on sequencing results was small compared with the variability across samples; however, the KingFisher platform produced the largest number of high-quality reads in the shortest amount of time. Based on these results, we have identified an extraction pipeline that dramatically reduces sample processing time without sacrificing bacterial taxonomic or abundance information.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.
Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W
2010-07-02
An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating the reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, enabling researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the experiment is single-read or paired-end. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
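The tag-counting step described above, counting how many reads fall within the genomic range of each annotation, can be sketched in a few lines. This is an illustrative Python sketch, not TASE's Java and SQL Server implementation; the function name, the tuple layouts, and the example coordinates are all hypothetical.

```python
def count_tags(reads, annotations):
    """Count reads whose position lies inside each annotated range.

    reads: iterable of (chromosome, position) tuples.
    annotations: list of (gene, chromosome, start, end), ends inclusive.
    """
    counts = {gene: 0 for gene, _, _, _ in annotations}
    for chrom, pos in reads:
        for gene, a_chrom, start, end in annotations:
            if chrom == a_chrom and start <= pos <= end:
                counts[gene] += 1
    return counts


# Hypothetical annotations and aligned read positions.
annotations = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 300, 400)]
reads = [("chr1", 150), ("chr1", 199), ("chr1", 350), ("chr2", 150), ("chr1", 250)]
print(count_tags(reads, annotations))
```

The nested loop is quadratic and only suitable for small inputs; at the data volumes mentioned in the abstract, an indexed lookup (for example a database index or an interval tree) would be needed.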
Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica
2016-02-18
The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. 
Finally, we provide a general motif-HMM scanner which is easily accessible through the graphical user interface ( http://compbio.math.hr/ ). Our results show that scanning with a carefully parameterized motif-HMM is an effective approach for annotation of protein families with low sequence similarity and conserved motifs. The results of this study expand current knowledge and provide new insights into the evolution of the large GDSL-lipase family in land plants.
Accurate read-based metagenome characterization using a hierarchical suite of unique signatures
Freitas, Tracey Allen K.; Li, Po-E; Scholz, Matthew B.; Chain, Patrick S. G.
2015-01-01
A major challenge in the field of shotgun metagenomics is the accurate identification of organisms present within a microbial community, based on classification of short sequence reads. Though existing microbial community profiling methods have attempted to rapidly classify the millions of reads output from modern sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, errors and biases in sequencing technologies, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here, we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling method with significantly and consistently smaller FDR than any other available method. Our algorithm circumvents false positives using a series of non-redundant signature databases and examines Genomic Origins Through Taxonomic CHAllenge (GOTTCHA). GOTTCHA was tested and validated on 20 synthetic and mock datasets ranging in community composition and complexity, was applied successfully to data generated from spiked environmental and clinical samples, and robustly demonstrates superior performance compared with other available tools. PMID:25765641
PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods
2012-01-01
Background With the decrease of DNA sequencing costs, sequence-based typing methods are rapidly becoming the gold standard for epidemiological surveillance. These methods provide reproducible and comparable results needed for a global scale bacterial population analysis, while retaining their usefulness for local epidemiological surveys. Online databases that collect the generated allelic profiles and associated epidemiological data are available, but this wealth of data remains underused and is frequently poorly annotated since no user-friendly tool exists to analyze and explore it. Results PHYLOViZ is platform-independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data. goeBURST and its Minimum Spanning Tree expansion are used for visualizing the possible evolutionary relationships between isolates. The results can be displayed as an annotated graph overlaying the query results of any other epidemiological data available. Conclusions PHYLOViZ is user-friendly software that allows the combined analysis of multiple data sources for microbial epidemiological and population studies. It is freely available at http://www.phyloviz.net. PMID:22568821
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leung, Elo; Huang, Amy; Cadag, Eithon; ...
2016-01-20
In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.
Representation and alignment of sung queries for music information retrieval
NASA Astrophysics Data System (ADS)
Adams, Norman H.; Wakefield, Gregory H.
2005-09-01
The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a "smooth" pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
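The quadratic alignment cost noted in this abstract can be illustrated with a minimal sketch. The abstract does not name the alignment algorithm, so dynamic time warping (DTW) is used here purely as an assumed stand-in; the function name `dtw_cost` and the MIDI-style pitch values are hypothetical.

```python
def dtw_cost(query, target):
    """Dynamic-time-warping cost between two pitch sequences (illustrative only)."""
    n, m = len(query), len(target)
    inf = float("inf")
    # cost[i][j] holds the best cost of aligning query[:i] with target[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Two nested loops over both sequences: this is the quadratic cost.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # query element repeated
                                 cost[i][j - 1],      # target element repeated
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]


# A query that holds its first note longer still aligns at zero cost.
print(dtw_cost([60, 62, 64], [60, 60, 62, 64]))
```

Because the table has one cell per (query, target) pair, halving the length of both sequences cuts the work roughly fourfold, which is consistent with the abstract's point about keeping the dimension small.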
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX
2011-07-05
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM™, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA™, on the 5' end.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA
2006-08-01
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional TaqMan® probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as Cy3™, as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Novel method for high-throughput colony PCR screening in nanoliter-reactors
Walser, Marcel; Pellaux, Rene; Meyer, Andreas; Bechtold, Matthias; Vanderschuren, Herve; Reinhardt, Richard; Magyar, Joseph; Panke, Sven; Held, Martin
2009-01-01
We introduce a technology for the rapid identification and sequencing of conserved DNA elements employing a novel suspension array based on nanoliter (nl)-reactors made from alginate. The reactors have a volume of 35 nl and serve as reaction compartments during monoseptic growth of microbial library clones, colony lysis, thermocycling and screening for sequence motifs via semi-quantitative fluorescence analyses. nl-Reactors were kept in suspension during all high-throughput steps which allowed performing the protocol in a highly space-effective fashion and at negligible expenses of consumables and reagents. As a first application, 11 high-quality microsatellites for polymorphism studies in cassava were isolated and sequenced out of a library of 20 000 clones in 2 days. The technology is widely scalable and we envision that throughputs for nl-reactor based screenings can be increased up to 100 000 and more samples per day thereby efficiently complementing protocols based on established deep-sequencing technologies. PMID:19282448
Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi
2004-07-07
Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
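The first-step fragment size quoted in this abstract can be checked with simple arithmetic. The sketch below assumes the oligonucleotides tile end to end with each adjacent pair sharing exactly one 20 bp overlap; the function name `assembled_length` is hypothetical and not part of the published method.

```python
def assembled_length(n_oligos, oligo_len=60, overlap=20):
    """Length of a fragment built from overlapping oligos (tiling assumption)."""
    # Each of the n - 1 junctions between neighbours is counted once, not twice.
    return n_oligos * oligo_len - (n_oligos - 1) * overlap


for n in (10, 12):
    print(n, "oligos ->", assembled_length(n), "bp")
```

Under this assumption, ten to twelve 60mers give 420 to 500 bp, in line with the "approximately 500 bp" fragments the abstract reports.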
Browning, J.V.; Miller, K.G.; McLaughlin, P.P.; Edwards, L.E.; Kulpecz, A.A.; Powars, D.S.; Wade, B.S.; Feigenson, M.D.; Wright, J.D.
2009-01-01
The Eyreville core holes provide the first continuously cored record of postimpact sequences from within the deepest part of the central Chesapeake Bay impact crater. We analyzed the upper Eocene to Pliocene postimpact sediments from the Eyreville A and C core holes for lithology (semiquantitative measurements of grain size and composition), sequence stratigraphy, and chronostratigraphy. Age is based primarily on Sr isotope stratigraphy supplemented by biostratigraphy (dinocysts, nannofossils, and planktonic foraminifers); age resolution is approximately ±0.5 Ma for early Miocene sequences and approximately ±1.0 Ma for younger and older sequences. Eocene-lower Miocene sequences are subtle, upper middle to lower upper Miocene sequences are more clearly distinguished, and upper Miocene-Pliocene sequences display a distinct facies pattern within sequences. We recognize two upper Eocene, two Oligocene, nine Miocene, three Pliocene, and one Pleistocene sequence and correlate them with those in New Jersey and Delaware. The upper Eocene through Pleistocene strata at Eyreville record changes from: (1) rapidly deposited, extremely fine-grained Eocene strata that probably represent two sequences deposited in a deep (>200 m) basin; to (2) highly dissected Oligocene (two very thin sequences) to lower Miocene (three thin sequences) with a long hiatus; to (3) a thick, rapidly deposited (43-73 m/Ma), very fine-grained, biosiliceous middle Miocene (16.5-14 Ma) section divided into three sequences (V5-V3) deposited in middle neritic paleoenvironments; to (4) a 4.5-Ma-long hiatus (12.8-8.3 Ma); to (5) sandy, shelly upper Miocene to Pliocene strata (8.3-2.0 Ma) divided into six sequences deposited in shelf and shoreface environments; and, last, to (6) a sandy middle Pleistocene paralic sequence (~400 ka). The Eyreville cores thus record the filling of a deep impact-generated basin where the timing of sequence boundaries is heavily influenced by eustasy. © 2009 The Geological Society of America.
Wang, Yongjie; Kleespies, Regina G; Ramle, Moslim B; Jehle, Johannes A
2008-09-01
The genomic sequence analysis of many large dsDNA viruses is hampered by the lack of sufficient sample material. Here, we report a whole genome amplification of the Oryctes rhinoceros nudivirus (OrNV) isolate Ma07 starting from as little as about 10 ng of purified viral DNA by application of the phi29 DNA polymerase- and exonuclease-resistant random hexamer-based multiple displacement amplification (MDA) method. About 60 µg of high molecular weight DNA with fragment sizes of up to 25 kbp was amplified. A genomic DNA clone library was generated using the product DNA. After 8-fold sequencing coverage, the 127,615 bp of OrNV whole genome was sequenced successfully. The results demonstrate that the MDA-based whole genome amplification enables rapid access to genomic information from exiguous virus samples.
Telomere dynamics in an immortal human cell line.
Murnane, J P; Sabatier, L; Marder, B A; Morgan, W F
1994-01-01
The integration of transfected plasmid DNA at the telomere of chromosome 13 in an immortalized simian virus 40-transformed human cell line provided the first opportunity to study polymorphism in the number of telomeric repeat sequences on the end of a single chromosome. Three subclones of this cell line were selected for analysis: one with a long telomere on chromosome 13, one with a short telomere, and one with such extreme polymorphism that no distinct band was discernible. Further subcloning demonstrated that telomere polymorphism resulted from both gradual changes and rapid changes that sometimes involved many kilobases. The gradual changes were due to the shortening of telomeres at a rate similar to that reported for telomeres of somatic cells without telomerase, eventually resulting in the loss of nearly all of the telomere. However, telomeres were not generally lost completely, as shown by the absence of polymorphism in the subtelomeric plasmid sequences. Instead, telomeres that were less than a few hundred base pairs in length showed a rapid, highly heterogeneous increase in size. Rapid changes in telomere length also occurred on longer telomeres. The frequency of this type of change in telomere length varied among the subclones and correlated with chromosome fusion. Therefore, the rapid changes in telomere length appeared occasionally to result in the complete loss of telomeric repeat sequences. Rapid changes in telomere length have been associated with telomere loss and chromosome instability in yeast and could be responsible for the high rate of chromosome fusion observed in many human tumor cell lines. PMID:7957062
Li, Chuang; Chen, Tao; He, Qiang; Zhu, Yunping; Li, Kenli
2017-03-15
Tandem mass spectrometry-based de novo peptide sequencing is a complex and time-consuming process. The current algorithms for de novo peptide sequencing cannot rapidly and thoroughly process large mass spectrometry datasets. In this paper, we propose MRUniNovo, a novel tool for parallel de novo peptide sequencing. MRUniNovo parallelizes UniNovo based on the Hadoop compute platform. Our experimental results demonstrate that MRUniNovo significantly reduces the computation time of de novo peptide sequencing without sacrificing the correctness and accuracy of the results, and thus can process very large datasets that UniNovo cannot. MRUniNovo is an open source software tool implemented in Java. The source code and the parameter settings are available at http://bioinfo.hupo.org.cn/MRUniNovo/index.php. Contact: s131020002@hnu.edu.cn; taochen1019@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Recent advances in sequence assembly: principles and applications.
Chen, Qingfeng; Lan, Chaowang; Zhao, Liang; Wang, Jianxin; Chen, Baoshan; Chen, Yi-Ping Phoebe
2017-11-01
The application of advanced sequencing technologies and the rapid growth of various sequence data have led to increasing interest in DNA sequence assembly. However, repeats and polymorphism occur frequently in genomes, and each of these has different impacts on assembly. Further, many new applications for sequencing, such as metagenomics regarding multiple species, have emerged in recent years. These not only give rise to higher complexity but also prevent short-read assembly in an efficient way. This article reviews the theoretical foundations that underlie current mapping-based assembly and de novo-based assembly, and highlights the key issues and feasible solutions that need to be considered. It focuses on how individual processes, such as optimal k-mer determination and error correction in assembly, rely on intelligent strategies or high-performance computation. We also survey primary algorithms/software and offer a discussion on the emerging challenges in assembly. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
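As a small illustration of the k-mer step this review mentions, the sketch below counts k-mers in a set of reads, which is the raw input that k-mer selection and error-correction heuristics typically work from. It is not taken from the review; the function name `kmer_counts` and the toy read are assumptions for illustration.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring across a collection of reads."""
    counts = Counter()
    for read in reads:
        # A read of length L contributes L - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


counts = kmer_counts(["ACGTACGT"], k=4)
print(counts.most_common())
```

In practice, k-mers seen only once in deep coverage data are often treated as likely sequencing errors, while the choice of k trades repeat resolution against graph connectivity; both points are general background rather than claims from this abstract.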
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting.
VIPER: a web application for rapid expert review of variant calls.
Wöste, Marius; Dugas, Martin
2018-06-01
With the rapid development in next-generation sequencing, cost and time requirements for genomic sequencing are decreasing, enabling applications in many areas such as cancer research. Many tools have been developed to analyze genomic variation ranging from single nucleotide variants to whole chromosomal aberrations. As sequencing throughput increases, the number of variants called by such tools also grows. The often-employed manual inspection of such calls is thus becoming a time-consuming procedure. We developed the Variant InsPector and Expert Rating tool (VIPER) to speed up this process by integrating the Integrative Genomics Viewer into a web application. Analysts can then quickly iterate through variants, apply filters and make decisions based on the generated images and variant metadata. VIPER was successfully employed in analyses with manual inspection of more than 10 000 calls. VIPER is implemented in Java and JavaScript and is freely available at https://github.com/MarWoes/viper. Contact: marius.woeste@uni-muenster.de. Supplementary data are available at Bioinformatics online.
Burdick, David B; Cavnor, Chris C; Handcock, Jeremy; Killcoyne, Sarah; Lin, Jake; Marzolf, Bruz; Ramsey, Stephen A; Rovira, Hector; Bressler, Ryan; Shmulevich, Ilya; Boyle, John
2010-07-14
High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services.
2010-01-01
Background High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Results Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. Conclusion The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services. PMID:20630057
Primer-independent RNA sequencing with bacteriophage phi6 RNA polymerase and chain terminators.
Makeyev, E V; Bamford, D H
2001-05-01
Here we propose a new general method for directly determining RNA sequence based on the use of the RNA-dependent RNA polymerase from bacteriophage phi6 and the chain terminators (RdRP sequencing). The following properties of the polymerase render it appropriate for this application: (1) the phi6 polymerase can replicate a number of single-stranded RNA templates in vitro. (2) In contrast to the primer-dependent DNA polymerases utilized in the sequencing procedure by Sanger et al. (Proc Natl Acad Sci USA, 1977, 74:5463-5467), it initiates nascent strand synthesis without a primer, starting the polymerization on the very 3'-terminus of the template. (3) The polymerase can incorporate chain-terminating nucleotide analogs into the nascent RNA chain to produce a set of base-specific termination products. Consequently, 3' proximal or even complete sequence of many target RNA molecules can be rapidly deduced without prior sequence information. The new technique proved useful for sequencing several synthetic ssRNA templates. Furthermore, using genomic segments of the bluetongue virus we show that RdRP sequencing can also be applied to naturally occurring dsRNA templates. This suggests possible uses of the method in the RNA virus research and diagnostics.
2013-01-01
Background Genetic linkage maps are important tools in breeding programmes and quantitative trait analyses. Traditional molecular markers used for genotyping are limited in throughput and efficiency. The advent of next-generation sequencing technologies has facilitated progeny genotyping and genetic linkage map construction in the major grains. However, the applicability of the approach remains untested in the fungal system. Findings Shiitake mushroom, Lentinula edodes, is a basidiomycetous fungus that represents one of the most popular cultivated edible mushrooms. Here, we developed a rapid genotyping method based on low-coverage (~0.5 to 1.5-fold) whole-genome resequencing. We used the approach to genotype 20 single-spore isolates derived from L. edodes strain L54 and constructed the first high-density sequence-based genetic linkage map of L. edodes. The accuracy of the proposed genotyping method was verified experimentally with results from mating compatibility tests and PCR-single-strand conformation polymorphism on a few known genes. The linkage map spanned a total genetic distance of 637.1 cM and contained 13 linkage groups. Two hundred sequence-based markers were placed on the map, with an average marker spacing of 3.4 cM. The accuracy of the map was confirmed by comparing with previous maps the locations of known genes such as matA and matB. Conclusions We used the shiitake mushroom as an example to provide a proof-of-principle that low-coverage resequencing could allow rapid genotyping of basidiospore-derived progenies, which could in turn facilitate the construction of high-density genetic linkage maps of basidiomycetous fungi for quantitative trait analyses and improvement of genome assembly. PMID:23915543
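The average marker spacing reported in this abstract can be reproduced with a short calculation. The abstract does not state the formula, so the sketch assumes spacing is total map length divided by the number of marker intervals, where each linkage group with n markers contributes n - 1 intervals; the function name is hypothetical.

```python
def average_marker_spacing(total_cm, n_markers, n_groups):
    """Mean distance between adjacent markers, in cM (interval-count assumption)."""
    # Markers in different linkage groups are not adjacent, so each group
    # removes one interval from the total.
    intervals = n_markers - n_groups
    return total_cm / intervals


spacing = average_marker_spacing(total_cm=637.1, n_markers=200, n_groups=13)
print(round(spacing, 1))
```

With 637.1 cM, 200 markers and 13 linkage groups there are 187 intervals, which rounds to the 3.4 cM given in the abstract. Dividing by all 200 markers instead would give about 3.2 cM, so the interval-based reading appears to be the one that matches.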
Pérez-Osorio, Ailyn C.; Boyle, David S.; Ingham, Zachary K.; Ostash, Alla; Gautom, Romesh K.; Colombel, Craig; Houze, Yolanda
2012-01-01
Tuberculosis (TB) remains a significant global health problem for which rapid diagnosis is critical to both treatment and control. This report describes a multiplex PCR method, the Mycobacterial IDentification and Drug Resistance Screen (MID-DRS) assay, which allows identification of members of the Mycobacterium tuberculosis complex (MTBC) and the simultaneous amplification of targets for sequencing-based drug resistance screening of rifampin-resistant, isoniazid-resistant, and pyrazinamide-resistant TB. Additionally, the same multiplex reaction amplifies a specific 16S rRNA gene target for rapid identification of M. avium complex (MAC) and a region of the heat shock protein 65 gene (hsp65) for further DNA sequencing-based confirmation or identification of other mycobacterial species. Comparison of preliminary results generated with MID-DRS versus culture-based methods for a total of 188 bacterial isolates demonstrated MID-DRS sensitivity and specificity as 100% and 96.8% for MTBC identification; 100% and 98.3% for MAC identification; 97.4% and 98.7% for rifampin-resistant TB identification; 60.6% and 100% for isoniazid-resistant TB identification; and 75.0% and 98.1% for pyrazinamide-resistant TB identification. The performance of the MID-DRS was also tested on acid-fast-bacterium (AFB)-positive clinical specimens, resulting in sensitivity and specificity of 100% and 78.6% for detection of MTBC and 100% and 97.8% for detection of MAC. In conclusion, use of the MID-DRS reduces the time necessary for initial identification and drug resistance screening of TB specimens to as little as 2 days. Since all targets needed for completing the assay are included in a single PCR amplification step, assay costs, preparation time, and risks due to user errors are also reduced. PMID:22162548
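For readers comparing the sensitivity and specificity figures in this abstract, the standard definitions can be sketched as follows. The counts used in the example are hypothetical placeholders, not data from the study, which reports only percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of reference-positive isolates that the assay also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reference-negative isolates that the assay also calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not from the MID-DRS study).
print(f"sensitivity = {sensitivity(true_pos=45, false_neg=5):.1%}")
print(f"specificity = {specificity(true_neg=90, false_pos=10):.1%}")
```

Here the culture-based result plays the role of the reference standard, so a low sensitivity with 100% specificity, as reported for isoniazid resistance, means the assay misses some resistant isolates but raises no false alarms.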
Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth
2016-08-01
The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Using Genome Sequence to Enable the Design of Medicines and Chemical Probes.
Angelbello, Alicia J; Chen, Jonathan L; Childs-Disney, Jessica L; Zhang, Peiyuan; Wang, Zi-Fu; Disney, Matthew D
2018-02-28
Rapid progress in genome sequencing technology has put us firmly into a postgenomic era. A key challenge in biomedical research is harnessing genome sequence to fulfill the promise of personalized medicine. This Review describes how genome sequencing has enabled the identification of disease-causing biomolecules and how these data have been converted into chemical probes of function, preclinical lead modalities, and ultimately U.S. Food and Drug Administration (FDA)-approved drugs. In particular, we focus on the use of oligonucleotide-based modalities to target disease-causing RNAs; small molecules that target DNA, RNA, or protein; the rational repurposing of known therapeutic modalities; and the advantages of pharmacogenetics. Lastly, we discuss the remaining challenges and opportunities in the direct utilization of genome sequence to enable design of medicines.
2011-01-01
Background The advent of genomics-based technologies has revolutionized many fields of biological enquiry. However, chromosome walking or flanking sequence cloning is still a necessary and important procedure for determining gene structure. Such methods are used to identify T-DNA insertion sites and so are especially relevant for organisms where large T-DNA insertion libraries have been created, such as rice and Arabidopsis. The currently available methods for flanking sequence cloning, including the popular TAIL-PCR technique, are relatively laborious and slow. Results Here, we report a simple and effective fusion primer and nested integrated PCR method (FPNI-PCR) for the identification and cloning of unknown genomic regions flanking known sequences. In brief, a set of universal primers was designed that consisted of various 15-16 base arbitrary degenerate oligonucleotides. These arbitrary degenerate primers were fused to the 3' end of an adaptor oligonucleotide which provided a known sequence without degenerate nucleotides, thereby forming the fusion primers (FPs). These fusion primers are employed in the first step of an integrated nested PCR strategy which defines the overall FPNI-PCR protocol. In order to demonstrate the efficacy of this novel strategy, we have successfully used it to isolate multiple genomic sequences, namely 21 orthologs of genes in various species of Rosaceae, 4 MYB genes of Rosa rugosa, 3 promoters of transcription factors of Petunia hybrida, 4 flanking sequences of T-DNA insertion sites in transgenic tobacco lines, and 6 specific genes from the sequenced genomes of rice and Arabidopsis. Conclusions The successful amplification of target products through FPNI-PCR verified that this novel strategy is an effective, low cost and simple procedure. Furthermore, FPNI-PCR represents a more sensitive, rapid and accurate technique than the established TAIL-PCR and hiTAIL-PCR procedures. PMID:22093809
Reiz, Bela; Li, Liang
2010-09-01
Controlled hydrolysis of proteins to generate peptide ladders combined with mass spectrometric analysis of the resultant peptides can be used for protein sequencing. In this paper, two methods of improving the microwave-assisted protein hydrolysis process are described to enable rapid sequencing of proteins containing disulfide bonds and increase sequence coverage, respectively. It was demonstrated that proteins containing disulfide bonds could be sequenced by MS analysis by first performing hydrolysis for less than 2 min, followed by 1 h of reduction to release the peptides originally linked by disulfide bonds. It was shown that a strong base could be used as a catalyst for microwave-assisted protein hydrolysis, producing complementary sequence information to that generated by microwave-assisted acid hydrolysis. However, using either acid or base hydrolysis, amide bond breakages in small regions of the polypeptide chains of the model proteins (e.g., cytochrome c and lysozyme) were not detected. Dynamic light scattering measurement of the proteins solubilized in an acid or base indicated that protein-protein interaction or aggregation was not the cause of the failure to hydrolyze certain amide bonds. It was speculated that there were some unknown local structures that might play a role in preventing an acid or base from reacting with the peptide bonds therein. 2010 American Society for Mass Spectrometry. Published by Elsevier Inc. All rights reserved.
Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E
1985-01-01
The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
Research progress of plant population genomics based on high-throughput sequencing.
Wang, Yun-sheng
2016-08-01
Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific and genome-wide effects using genotyping of genome-wide polymorphic sites. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly, and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effects, demographic history and molecular mechanisms of complex traits of the relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as the existing problems of plant population genomics in order to provide references for related studies.
CBrowse: a SAM/BAM-based contig browser for transcriptome assembly visualization and analysis.
Li, Pei; Ji, Guoli; Dong, Min; Schmidt, Emily; Lenox, Douglas; Chen, Liangliang; Liu, Qi; Liu, Lin; Zhang, Jie; Liang, Chun
2012-09-15
To address the impending need for exploring the rapidly increasing transcriptomics data generated for non-model organisms, we developed CBrowse, an AJAX-based web browser for visualizing and analyzing transcriptome assemblies and contigs. Designed in a standard three-tier architecture with a data pre-processing pipeline, CBrowse is essentially a Rich Internet Application that offers many seamlessly integrated web interfaces and allows users to navigate, sort, filter, search and visualize data smoothly. The pre-processing pipeline takes the contig sequence file in FASTA format and its relevant SAM/BAM file as the input; detects putative polymorphisms, simple sequence repeats and sequencing errors in contigs; and generates image, JSON and database-compatible CSV text files that are directly utilized by different web interfaces. CBrowse is a generic visualization and analysis tool that facilitates close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors in transcriptome sequencing projects. CBrowse is distributed under the GNU General Public License, available at http://bioinfolab.muohio.edu/CBrowse/. Contact: liangc@muohio.edu or liangc.mu@gmail.com; glji@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
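One step of such a pre-processing pipeline, finding simple sequence repeats in FASTA contigs, might look roughly like the sketch below. It is not CBrowse's implementation: the regular-expression approach, the thresholds and the function names are all assumptions chosen for illustration.

```python
import re


def parse_fasta(text):
    """Parse FASTA text into {record_id: uppercase sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for tandem repeats in `seq`.

    Thresholds are arbitrary illustration values; a unit such as 'AA'
    will also match homopolymer runs.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Hypothetical contig for demonstration.
contigs = parse_fasta(">contig1 example\nTTGACACACACACAGGT\n")
for name, seq in contigs.items():
    print(name, find_ssrs(seq))
```

The resulting tuples could then be serialized to JSON or CSV for a front end, in the spirit of the file types the abstract lists.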
[Review of Second Generation Sequencing and Its Application in Forensic Genetics].
Zhang, S H; Bian, Y N; Zhao, Q; Li, C T
2016-08-01
The rapid development of second generation sequencing (SGS) within the past few years has increased data throughput and read length while substantially reducing sequencing cost. This has led to new breakthroughs in biology and ushered forensic genetics into a new era. Based on the history of sequencing applications in forensic genetics, this paper reviews the importance of sequencing technologies for genetic marker detection. The application status and potential of SGS in forensic genetics are discussed based on the already explored SGS platforms of Roche, Illumina and Life Technologies. With these platforms, DNA markers (SNP, STR), RNA markers (mRNA, microRNA) and whole mtDNA can be sequenced. However, the development and validation of application kits, the maturation of analysis software, connection to existing databases and the possible ethical issues arising from big data will be the key factors that determine whether this technology can substitute for or supplement PCR-CE, the mature technology, and be widely used for casework. Copyright© by the Editorial Department of Journal of Forensic Medicine.
Novel Primer Sets for Next Generation Sequencing-Based Analyses of Water Quality
Lee, Elvina; Khurana, Maninder S.; Whiteley, Andrew S.; Monis, Paul T.; Bath, Andrew; Gordon, Cameron; Ryan, Una M.; Paparini, Andrea
2017-01-01
Next generation sequencing (NGS) has rapidly become an invaluable tool for the detection, identification and relative quantification of environmental microorganisms. Here, we demonstrate two new 16S rDNA primer sets, which are compatible with NGS approaches and are primarily for use in water quality studies. Compared to 16S rRNA gene based universal primers, in silico and experimental analyses demonstrated that the new primers showed increased specificity for the Cyanobacteria and Proteobacteria phyla, allowing increased sensitivity for the detection, identification and relative quantification of toxic bloom-forming microalgae, microbial water quality bioindicators and common pathogens. Significantly, Cyanobacterial and Proteobacterial sequences accounted for ca. 95% of all sequences obtained within NGS runs (when compared to ca. 50% with standard universal NGS primers), providing higher sensitivity and greater phylogenetic resolution of key water quality microbial groups. The increased selectivity of the new primers allows the parallel sequencing of more samples through reduced sequence retrieval levels required to detect target groups, potentially reducing NGS costs by 50% but still guaranteeing optimal coverage and species discrimination. PMID:28118368
NASA Astrophysics Data System (ADS)
Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih
2017-04-01
Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.
NASA Astrophysics Data System (ADS)
Long, Ying; Wood, Troy D.
2015-01-01
Most enzymatic microreactors for protein digestion are based on trypsin, but proteins with hydrophobic segments may be difficult to digest because of the paucity of Arg and Lys residues. Microreactors based on pepsin, which is less specific than trypsin, can overcome this challenge. Here, an integrated immobilized pepsin microreactor (IPMR)/nanoelectrospray emitter is examined for its potential for peptide mapping. For myoglobin, equivalent sequence coverage is obtained in a thousandth the time of solution digestion with better sequence coverage. While sequence coverage of cytochrome c is lesser than solution in this short duration, more highly-charged peptic peptides are produced and a number of peaks are unidentified at low-resolution, suggesting that high-resolution mass spectrometry is needed to take full advantage of integrated IPMR/nanoelectrospray devices.
Pulmonary function in microgravity: Spacelab 4 and beyond
NASA Technical Reports Server (NTRS)
Guy, H. J.; Prisk, G. K.; West, J. B.
1988-01-01
This paper refers principally to the composition gradient of gases within the lung in various conditions of gravity, as revealed by exhaled breath. A rapid gas analyzer-based system has been developed for tests in Spacelab 4. The test sequence and expected results are presented.
Rapid polymerase chain reaction-based screening assay for bacterial biothreat agents.
Yang, Samuel; Rothman, Richard E; Hardick, Justin; Kuroki, Marcos; Hardick, Andrew; Doshi, Vishal; Ramachandran, Padmini; Gaydos, Charlotte A
2008-04-01
To design and evaluate a rapid polymerase chain reaction (PCR)-based assay for detecting Eubacteria and performing early screening for selected Class A biothreat bacterial pathogens. The authors designed a two-step PCR-based algorithm consisting of an initial broad-based universal detection step, followed by specific pathogen identification targeted for identification of the Class A bacterial biothreat agents. A region in the bacterial 16S rRNA gene containing a highly variable sequence flanked by clusters of conserved sequences was chosen as the target for the PCR assay design. A previously described highly conserved region located within the 16S rRNA amplicon was selected as the universal probe (UniProbe, Integrated DNA Technology, Coralville, IA). Pathogen-specific TaqMan probes were designed for Bacillus anthracis, Yersinia pestis, and Francisella tularensis. Performance of the assay was assessed using genomic DNA extracted from the aforementioned biothreat-related organisms (inactivated or surrogate) and other common bacteria. The UniProbe detected the presence of all tested Eubacteria (31/31) with high analytical sensitivity. The biothreat-specific probes accurately identified organisms down to the closely related species and genus level, but were unable to discriminate between very close surrogates, such as Yersinia philomiragia and Bacillus cereus. A simple, two-step PCR-based assay proved capable of both universal bacterial detection and identification of select Class A bacterial biothreat and biothreat-related pathogens. Although this assay requires confirmatory testing for definitive species identification, the method has great potential for use in ED-based settings for rapid diagnosis in cases of suspected Category A bacterial biothreat agents.
NASA Astrophysics Data System (ADS)
Seto, Donald
The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by 'cheaper and faster' means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with 'state-of-the-art' informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.
O'Brien, Heath E; Gong, Yunchen; Fung, Pauline; Wang, Pauline W; Guttman, David S
2011-01-01
Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L
2012-01-01
Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach made it possible to generate a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Managing the genomic revolution in cancer diagnostics.
Nguyen, Doreen; Gocke, Christopher D
2017-08-01
Molecular tumor profiling is now a routine part of patient care, revealing targetable genomic alterations and molecularly distinct tumor subtypes with therapeutic and prognostic implications. The widespread adoption of next-generation sequencing technologies has greatly facilitated clinical implementation of genomic data and opened the door for high-throughput multigene-targeted sequencing. Herein, we discuss the variability of cancer genetic profiling currently offered by clinical laboratories, the challenges of applying rapidly evolving medical knowledge to individual patients, and the need for more standardized population-based molecular profiling.
ParticleCall: A particle filter for base calling in next-generation sequencing systems
2012-01-01
Background Next-generation sequencing systems are capable of rapid and cost-effective DNA sequencing, thus enabling routine sequencing tasks and taking us one step closer to personalized medicine. Accuracy and lengths of their reads, however, are yet to surpass those provided by the conventional Sanger sequencing method. This motivates the search for computationally efficient algorithms capable of reliable and accurate detection of the order of nucleotides in short DNA fragments from the acquired data. Results In this paper, we consider Illumina’s sequencing-by-synthesis platform which relies on reversible terminator chemistry and describe the acquired signal by reformulating its mathematical model as a Hidden Markov Model. Relying on this model and sequential Monte Carlo methods, we develop a parameter estimation and base calling scheme called ParticleCall. ParticleCall is tested on a data set obtained by sequencing phiX174 bacteriophage using Illumina’s Genome Analyzer II. The results show that the developed base calling scheme is significantly more computationally efficient than the best performing unsupervised method currently available, while achieving the same accuracy. Conclusions The proposed ParticleCall provides more accurate calls than the Illumina’s base calling algorithm, Bustard. At the same time, ParticleCall is significantly more computationally efficient than other recent schemes with similar performance, rendering it more feasible for high-throughput sequencing data analysis. Improvement of base calling accuracy will have immediate beneficial effects on the performance of downstream applications such as SNP and genotype calling. ParticleCall is freely available at https://sourceforge.net/projects/particlecall. PMID:22776067
[Prediction of ETA oligopeptides antagonists from Glycine max based on in silico proteolysis].
Qiao, Lian-Sheng; Jiang, Lu-di; Luo, Gang-Gang; Lu, Fang; Chen, Yan-Kun; Wang, Ling-Zhi; Li, Gong-Yu; Zhang, Yan-Ling
2017-02-01
Oligopeptides are one of the key pharmaceutically effective constituents of traditional Chinese medicine (TCM). Systematic study of the composition and efficacy of TCM oligopeptides is essential for the analysis of the material basis and mechanism of TCM. In this study, the potential anti-hypertensive oligopeptides from Glycine max and their endothelin receptor A (ETA) antagonistic activity were discovered and predicted based on in silico technologies. The main protein sequences of G. max were collected and oligopeptides were obtained using in silico gastrointestinal tract proteolysis. Then, the pharmacophore of ETA antagonistic peptides was constructed and included one hydrophobic feature, one ionizable negative feature, one ring aromatic feature and five excluded volumes. Meanwhile, a three-dimensional structure of ETA was developed by homology modeling methods for further docking studies. According to docking analysis and consensus score, the key amino acid GLN165 was identified for ETA antagonistic activity, and 27 oligopeptides from G. max were predicted as potential ETA antagonists by pharmacophore and docking studies. In silico proteolysis could be used to analyze the protein sequences from TCM. By combining in silico proteolysis and molecular simulation, the biological activities of oligopeptides could be predicted rapidly based on the known TCM protein sequences. It might provide a methodological basis for rapidly and efficiently implementing the mechanism analysis of TCM oligopeptides. Copyright© by the Chinese Pharmaceutical Association.
Rapid Sequencing of Complete env Genes from Primary HIV-1 Samples.
Laird Smith, Melissa; Murrell, Ben; Eren, Kemal; Ignacio, Caroline; Landais, Elise; Weaver, Steven; Phung, Pham; Ludka, Colleen; Hepler, Lance; Caballero, Gemma; Pollner, Tristan; Guo, Yan; Richman, Douglas; Poignard, Pascal; Paxinos, Ellen E; Kosakovsky Pond, Sergei L; Smith, Davey M
2016-07-01
The ability to study rapidly evolving viral populations has been constrained by the read length of next-generation sequencing approaches and the sampling depth of single-genome amplification methods. Here, we develop and characterize a method using Pacific Biosciences' Single Molecule, Real-Time (SMRT®) sequencing technology to sequence multiple, intact full-length human immunodeficiency virus-1 env genes amplified from viral RNA populations circulating in blood, and provide computational tools for analyzing and visualizing these data.
Costa, Pedro; Botelho, Ana; Couto, Isabel; Viveiros, Miguel; Inácio, João
2014-01-01
Nucleic acid testing (NAT) designates any molecular approach used for the detection, identification, and characterization of pathogenic microorganisms, enabling the rapid, specific, and sensitive diagnosis of infectious diseases, such as tuberculosis. These assays have been widely used since the 1990s in human clinical laboratories and, subsequently, also in veterinary diagnostics. Most NAT strategies are based on the polymerase chain reaction (PCR) and its several enhancements and variations. From conventional PCR, real-time PCR and its combinations, and isothermal DNA amplification to the nanotechnologies, here we review how NAT assays have been applied to decipher if and which member of the Mycobacterium tuberculosis complex is present in a clinical sample. Recent advances in DNA sequencing have also brought new challenges and have made it possible to generate large amounts of sequence data rapidly and at low cost. This revolution in high-throughput sequencing (HTS) technologies makes whole genome sequencing (WGS) and metagenomics the trendiest NAT strategies today. The ranking of NAT techniques in the field of clinical diagnostics is rising, and we provide a SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis with our view of the use of molecular diagnostics for detecting tuberculosis in veterinary laboratories, notwithstanding that the gold standard is still the classical culture of the agent. The complementary use of both classical and molecular diagnostic approaches is recommended to speed up diagnosis, enabling a fast decision by competent authorities and rapid tackling of the disease. PMID:25988157
Rapid-Sequence Serial Sexual Homicides.
Schlesinger, Louis B; Ramirez, Stephanie; Tusa, Brittany; Jarvis, John P; Erdberg, Philip
2017-03-01
Serial sexual murderers have been described as committing homicides in a methodical manner, taking substantial time between offenses to elude the authorities. The results of our study of the temporal patterns (i.e., the length of time between homicides) of a nonrandom national sample of 44 serial sexual murderers and their 201 victims indicate that this representation may not always be accurate. Although 25 offenders (56.8%) killed with longer than a 14-day period between homicides, a sizeable subgroup was identified: 19 offenders (43.2%) who committed homicides in rapid-sequence fashion, with fewer than 14 days between all or some of the murders. Six offenders (13.6%) killed all their victims in one rapid-sequence, spree-like episode, with homicides just days apart or sometimes two murders in the same day. Thirteen offenders (29.5%) killed in one or two rapid-sequence clusters (i.e., more than one murder within a 14-day period, as well as additional homicides with greater than 14 days between each). The purpose of our study was to describe this subgroup of rapid-sequence offenders who have not been identified until now. These findings argue for accelerated forensic assessments of dangerousness and public safety when a sexual murder is detected. Psychiatric disorders with rapidly occurring symptom patterns, or even atypical mania or mood dysregulation, may serve as exemplars for understanding this extraordinary group of offenders. © 2017 American Academy of Psychiatry and the Law.
Mistri, S K; Sultana, M; Kamal, S M M; Alam, M M; Irin, F; Nessa, J; Ahsan, C R; Yasmin, M
2016-05-01
For effective control of tuberculosis, rapid detection of multidrug resistant tuberculosis (MDR-TB) is necessary. Therefore, we developed a modified nested multiplex allele-specific polymerase chain reaction (MAS-PCR) method that enables rapid MDR-TB detection directly from sputum samples. The efficacy of this method was evaluated using 79 sputum samples collected from suspected tuberculosis patients. The performance of the nested MAS-PCR method was compared with other MDR-TB detection methods such as drug susceptibility testing (DST) and DNA sequencing. As rifampicin (RIF) resistance corresponds to MDR-TB in more than 90% of cases, only the presence of RIF-associated mutations in the rpoB gene was determined by DNA sequencing and nested MAS-PCR to detect MDR-TB. The concordance between nested MAS-PCR and DNA sequencing results was found to be 96·3%. When compared with DST, the sensitivity and specificity of nested MAS-PCR for RIF-resistance detection were determined to be 92·9 and 100% respectively. For developing and high-TB-burden countries, molecular-based tests have been recommended by the World Health Organization for rapid detection of MDR-TB. The results of this study indicate that the nested MAS-PCR assay might be a practical and relatively cost-effective molecular method for rapid detection of MDR-TB from suspected sputum samples in developing countries with resource-poor settings. © 2016 The Society for Applied Microbiology.
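Sensitivity, specificity and concordance of the kind reported above are computed from a 2x2 comparison against a reference test. The sketch below shows the standard formulas; the counts are hypothetical and are not taken from the study, whose per-cell numbers are not given in the abstract.

```python
def sensitivity(tp, fn):
    """Fraction of reference-positive samples the assay calls positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of reference-negative samples the assay calls negative."""
    return tn / (tn + fp)


def concordance(tp, tn, fp, fn):
    """Fraction of all samples on which the assay and the reference agree."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only (not the study's data):
# reference = drug susceptibility testing, index test = a PCR assay.
tp, fn, tn, fp = 45, 5, 30, 0

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
print(f"concordance = {concordance(tp, tn, fp, fn):.1%}")
```

A specificity of 100% simply means the assay produced no false positives among the reference-negative samples; it says nothing about missed resistant cases, which sensitivity captures.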
[Rapid prenatal genetic diagnosis of a fetus with a high risk for Morquio A syndrome].
Guo, Yi-bin; Ai, Yang; Zhao, Yan; Tang, Jia; Jiang, Wei-ying; Du, Min-lian; Ma, Hua-mei; Zhong, Yan-fang
2012-04-01
To provide rapid and accurate prenatal genetic diagnosis for a fetus with high risk of Morquio A syndrome. Based on ascertained etiology of the proband and genotypes of the parents, particular mutations of the GALNS gene were screened at 10th gestational week with amplification refractory mutation system (ARMS), denaturing high performance liquid chromatography (DHPLC), and direct DNA sequencing. DHPLC screening has identified abnormal double peaks in the PCR products of exons 1 and 10, whilst only a single peak was detected in normal controls. Amplification of ARMS specific primers derived a specific product for the fetus's gene, whilst no similar product was detected in normal controls. Sequencing of PCR products confirmed that exons 1 and 10 of the GALNS gene from the fetus contained a heterozygous paternal c.106-111 del (p.L36-L37 del) deletion and a heterozygous maternal c.1097 T>C (p.L366P) missense mutation, which resulted in a compound heterozygote status. The fetus was diagnosed with Morquio A syndrome and a genotype similar to the proband. Termination of the pregnancy was recommended. Combined ARMS, DHPLC and DNA sequencing are effective for rapid and accurate prenatal diagnosis for fetus with a high risk for Morquio A syndrome. Such methods are particularly suitable for early diagnosis when pathogenesis is clear. Furthermore, combined ARMS and DHPLC are suitable for rapid processing of large numbers of samples for the identification of new mutations.
Seng, E K; Fang, Q; Lam, T J; Sin, Y M
2004-06-15
A rapid, sensitive and highly specific detection method for Aquareovirus based on reverse-transcription polymerase chain reaction (RT-PCR) was developed. Based on multiple sequence alignment of the cloned sequences of the local isolates Threadfin reovirus (TFV) and Guppy reovirus (GPV) with Grass carp reovirus (GCRV), a pair of degenerate primers was carefully selected and synthesized. Using this primer combination, only one specific product, approximately 450 bp in length, was obtained when RT-PCR was carried out using the genomic double-stranded RNA (dsRNA) of TFV, GPV and GCRV. Similar results were also obtained when Chum salmon reovirus (CSRV) and Striped bass reovirus (SBRV) dsRNA were used as templates. No products were observed when nucleic acids other than the dsRNA of the aquareoviruses described above were used as RT-PCR templates. This technique could detect not only TFV but also GPV and GCRV in cultured cells infected with low-titer virus. Furthermore, this method has also been shown to be able to diagnose GPV-infected guppies (Poecilia reticulata) that exhibit clinical symptoms as well as GPV-carrier guppies. Collectively, these results show that the RT-PCR amplification method using the specific degenerate primers described here is very useful for rapid and accurate detection of a variety of aquareovirus strains isolated from different host species and origins.
A high-throughput assay for the comprehensive profiling of DNA ligase fidelity
Lohman, Gregory J. S.; Bauer, Robert J.; Nichols, Nicole M.; Mazzola, Laurie; Bybee, Joanna; Rivizzigno, Danielle; Cantin, Elizabeth; Evans, Thomas C.
2016-01-01
DNA ligases have broad application in molecular biology, from traditional cloning methods to modern synthetic biology and molecular diagnostics protocols. Ligation-based detection of polynucleotide sequences can be achieved by the ligation of probe oligonucleotides when annealed to a complementary target sequence. In order to achieve a high sensitivity and low background, the ligase must efficiently join correctly base-paired substrates, while discriminating against the ligation of substrates containing even one mismatched base pair. In the current study, we report the use of capillary electrophoresis to rapidly generate mismatch fidelity profiles that interrogate all 256 possible base-pair combinations at a ligation junction in a single experiment. Rapid screening of ligase fidelity in a 96-well plate format has allowed the study of ligase fidelity in unprecedented depth. As an example of this new method, herein we report the ligation fidelity of Thermus thermophilus DNA ligase at a range of temperatures, buffer pH and monovalent cation strength. This screen allows the selection of reaction conditions that maximize fidelity without sacrificing activity, while generating a profile of specific mismatches that ligate detectably under each set of conditions. PMID:26365241
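The count of 256 junction contexts can be reproduced with a short enumeration. The sketch below assumes (this layout is not stated in the abstract) that the junction consists of one template:probe base pair on each side of the nick, so that 16 × 16 = 256 combinations arise; the names `base_pairs`, `junctions` and `WATSON_CRICK` are illustrative only.

```python
from itertools import product

BASES = "ACGT"

# One base pair is a (template base, probe base) tuple: 4 x 4 = 16 options.
base_pairs = list(product(BASES, repeat=2))

# Assumed junction layout: one base pair on each side of the nick.
junctions = list(product(base_pairs, repeat=2))

# Correctly paired contexts under Watson-Crick rules.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}
fully_matched = [j for j in junctions if all(bp in WATSON_CRICK for bp in j)]

print(len(base_pairs), len(junctions), len(fully_matched))
```

Under this assumption, every junction outside `fully_matched` contains at least one mismatch, which is the class of substrate a high-fidelity ligase should discriminate against.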
A high-throughput assay for the comprehensive profiling of DNA ligase fidelity.
Lohman, Gregory J S; Bauer, Robert J; Nichols, Nicole M; Mazzola, Laurie; Bybee, Joanna; Rivizzigno, Danielle; Cantin, Elizabeth; Evans, Thomas C
2016-01-29
DNA ligases have broad application in molecular biology, from traditional cloning methods to modern synthetic biology and molecular diagnostics protocols. Ligation-based detection of polynucleotide sequences can be achieved by the ligation of probe oligonucleotides when annealed to a complementary target sequence. In order to achieve a high sensitivity and low background, the ligase must efficiently join correctly base-paired substrates, while discriminating against the ligation of substrates containing even one mismatched base pair. In the current study, we report the use of capillary electrophoresis to rapidly generate mismatch fidelity profiles that interrogate all 256 possible base-pair combinations at a ligation junction in a single experiment. Rapid screening of ligase fidelity in a 96-well plate format has allowed the study of ligase fidelity in unprecedented depth. As an example of this new method, herein we report the ligation fidelity of Thermus thermophilus DNA ligase at a range of temperatures, buffer pH and monovalent cation strength. This screen allows the selection of reaction conditions that maximize fidelity without sacrificing activity, while generating a profile of specific mismatches that ligate detectably under each set of conditions. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
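How a barcode set is used to assign reads to samples can be illustrated with a minimal sketch. It is not the particle swarm construction from the paper: it assumes barcodes of equal length placed at the start of each read, and it uses Hamming distance as the separation measure, which the abstract does not specify. The function names and the `max_mismatches` parameter are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_sample(read, barcode_to_sample, max_mismatches=1):
    """Assign a read by its leading barcode; None if no or ambiguous match."""
    hits = []
    for barcode, sample in barcode_to_sample.items():
        prefix = read[:len(barcode)]
        if len(prefix) == len(barcode) and hamming(prefix, barcode) <= max_mismatches:
            hits.append(sample)
    return hits[0] if len(hits) == 1 else None
```

A set whose minimum pairwise distance is d tolerates up to (d - 1) // 2 substitution errors per barcode without ambiguity, which is one reason larger, well-separated sets are worth searching for.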
Rapid phylogenetic dissection of prokaryotic community structure in tidal flat using pyrosequencing.
Kim, Bong-Soo; Kim, Byung Kwon; Lee, Jae-Hak; Kim, Myungjin; Lim, Young Woon; Chun, Jongsik
2008-08-01
Dissection of prokaryotic community structure is a prerequisite for understanding the ecological roles of prokaryotes. Various methods are available for this purpose, among which amplification and sequencing of 16S rRNA genes has gained popularity. However, conventional methods based on the Sanger sequencing technique require a cloning process prior to sequencing, and are expensive and labor-intensive. We investigated prokaryotic community structure in tidal flat sediments, Korea, using pyrosequencing and a subsequent automated bioinformatic pipeline for the rapid and accurate taxonomic assignment of each amplicon. The combination of pyrosequencing and bioinformatic analysis showed that bacterial and archaeal communities were more diverse than previously reported in clone library studies. Pyrosequencing analysis revealed 21 bacterial divisions and 37 candidate divisions. Proteobacteria was the most abundant division in the bacterial community, of which Gamma- and Delta-Proteobacteria were the most abundant. Similarly, 4 archaeal divisions were found in tidal flat sediments. Euryarchaeota was the most abundant division in the archaeal sequences, which were further divided into 8 classes and 11 unclassified euryarchaeota groups. The system developed here provides a simple, in-depth and automated way of dissecting a prokaryotic community structure without extensive pretreatment such as cloning.
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.
Soares, Inês; Goios, Ana; Amorim, António
2012-01-01
The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts with the computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in the Python language and is freely available on the web.
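The L-word profile and distance steps described above can be sketched in a few lines of Python. This is not the authors' implementation: it replaces the generalized suffix tree with a plain sliding-window count (simpler, but without the tree's sharing across sequences), and it assumes relative rather than absolute word frequencies. All function names are illustrative.

```python
from collections import Counter
from math import sqrt


def lword_profile(seq, L):
    """Relative frequency of every overlapping word of length L in seq."""
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than L
        return {}
    return {word: n / total for word, n in counts.items()}


def euclidean_distance(p, q):
    """Standard Euclidean distance between two sparse frequency profiles."""
    words = sorted(set(p) | set(q))  # fixed order keeps the result symmetric
    return sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in words))


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise profile distances."""
    profiles = [lword_profile(s, L) for s in seqs]
    return [[euclidean_distance(a, b) for b in profiles] for a in profiles]
```

The resulting matrix could then be passed to any neighbor joining or multidimensional scaling routine; the choice of L remains the key parameter, as the abstract notes.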
2013-01-01
Background: Rapid and reliable identification of quarantine pests is essential for plant inspection services to prevent introduction of invasive species. For insects, this may be a serious problem when dealing with morphologically similar cryptic species complexes and early developmental stages that lack distinctive characters useful for taxonomic identification. DNA based barcoding could solve many of these problems. The standard barcode fragment, an approx. 650 base pairs long sequence of the 5′ end of the mitochondrial cytochrome oxidase I (COI), enables differentiation of a very wide range of arthropods. However, problems remain in some taxa, such as Tephritidae, where recent genetic differentiation among some of the described species hinders accurate molecular discrimination. Results: In order to explore the full species discrimination potential of COI, we sequenced the barcoding region of the COI gene of a range of economically important Tephritid species and complemented these data with all GenBank and BOLD entries for the systematic group available as of January 2012. We explored the limits of species delimitation of this barcode fragment among 193 putative Tephritid species and established operational taxonomic units (OTUs), between which discrimination is reliably possible. Furthermore, to enable future development of rapid diagnostic assays based on this sequence information, we characterized all single nucleotide polymorphisms (SNPs) and established “near-minimal” sets of SNPs that differentiate among all included OTUs with at least three and four SNPs, respectively. Conclusions: We found that although several species cannot be differentiated based on the genetic diversity observed in COI and hence form composite OTUs, 85% of all OTUs correspond to described species. 
Because our SNP panels are developed based on all currently available sequence information and rely on a minimal pairwise difference of three SNPs, they are highly reliable and hence represent an important resource for developing taxon-specific diagnostic assays. For selected cases, possible explanations that may cause composite OTUs are discussed. PMID:23718854
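The criterion of a minimal pairwise difference of three SNPs can be checked mechanically for any candidate panel. The sketch below is a hedged illustration, not the authors' selection procedure: it assumes each OTU is represented by a string of alleles at the candidate SNP positions, and the function names and toy data in the usage note are hypothetical.

```python
from itertools import combinations


def pairwise_snp_differences(genotypes, panel):
    """Count differing panel positions for every pair of OTUs.

    genotypes: dict mapping OTU name -> string of alleles (equal lengths)
    panel: iterable of 0-based positions included in the SNP panel
    """
    panel = list(panel)
    return {
        (a, b): sum(genotypes[a][p] != genotypes[b][p] for p in panel)
        for a, b in combinations(sorted(genotypes), 2)
    }


def panel_is_sufficient(genotypes, panel, min_diff=3):
    """True if every OTU pair differs at min_diff or more panel positions."""
    diffs = pairwise_snp_differences(genotypes, panel)
    return all(d >= min_diff for d in diffs.values())
```

For example, with `{"OTU1": "AAAAA", "OTU2": "CCCAA", "OTU3": "ACCCC"}` the full five-position panel satisfies a threshold of three, whereas the panel `[0, 1, 2]` does not, because OTU1 and OTU3 differ at only two of those positions.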
spa typing for epidemiological surveillance of Staphylococcus aureus.
Hallin, Marie; Friedrich, Alexander W; Struelens, Marc J
2009-01-01
The spa typing method is based on sequencing of the polymorphic X region of the protein A gene (spa), present in all strains of Staphylococcus aureus. The X region is constituted of a variable number of 24-bp repeats flanked by well-conserved regions. This single-locus sequence-based typing method combines a number of technical advantages, such as rapidity, reproducibility, and portability. Moreover, due to its repeat structure, the spa locus simultaneously indexes micro- and macrovariations, enabling the use of spa typing in both local and global epidemiological studies. These studies are facilitated by the establishment of standardized spa type nomenclature and Internet shared databases.
Solieri, Lisa; Giudici, Paolo
2010-01-01
Control over malolactic fermentation (MLF) is a difficult goal in winemaking and needs rapid methods to monitor Oenococcus oeni malolactic starters (MLS) in a stressful environment such as wine. In this study, we describe a novel quantitative PCR (QPCR) assay enabling the detection of an O. oeni strain during MLF without culturing. O. oeni strain LB221 was used as a model to develop a strain-specific sequence-characterized amplified region (SCAR) marker derived from a discriminatory OPA20-based randomly amplified polymorphic DNA (RAPD) band. The 5′ and 3′ flanking regions and the copy number of the SCAR marker were characterized using inverse PCR and Southern blotting, respectively. Primer pairs targeting the SCAR sequence enabled strain-specific detection without cross amplification of other O. oeni strains or wine species of lactic acid bacteria (LAB), acetic acid bacteria (AAB), and yeasts. The SCAR-QPCR assay was linear over a range of cell concentrations (7 log units) and detected as few as 2.2 × 10² CFU per ml of red wine with good quantification effectiveness, as shown by the correlation of QPCR and plate counting results. Therefore, the cultivation-independent monitoring of a single O. oeni strain in wine based on a SCAR marker represents a rapid and effective strain-specific approach. This strategy can be adopted to develop easy and rapid detection techniques for monitoring the implantation of inoculated O. oeni MLS on the indigenous LAB population, reducing the risk of unsuccessful MLF. PMID:20935116
Loconsole, Giuliana; Onelge, Nuket; Yokomi, Raymond K; Kubaa, Raied Abou; Savino, Vito; Saponari, Maria
2013-01-01
The RNA genomes of pathogenic and non-pathogenic variants of citrus Hop stunt viroid (HSVd) differ by five to six nucleotides located within the variable (V) domain referred to as the "cachexia expression motif". Sensitive hosts such as mandarin and its hybrids are seriously affected by cachexia disease. Current methods to differentiate HSVd variants rely on lengthy greenhouse biological indexing on Parson's Special mandarin and/or direct nucleotide sequence analysis of amplicons from RT-PCR of HSVd-infected plants. Two independent high throughput assays to segregate HSVd variants by real-time RT-PCR and High-Resolution Melting Temperature (HRM) analysis were developed: one based on EVAGreen dye; the other based on TaqMan probes. Primers for both assays targeted three differentiating nucleotides in the V domain which separated HSVd variants into three clusters by distinct melting temperatures with a confidence level higher than 98%. The accuracy of the HRM assays was validated by nucleotide sequencing of representative samples within each HRM cluster and by testing 45 HSVd-infected field trees from California, Italy, Spain, Syria and Turkey. To our knowledge, this is the first report of a rapid and sensitive approach to detect and differentiate HSVd variants associated with different biological behaviors. Although HSVd is found in several crops including citrus, cachexia variants are restricted to some citrus-growing areas, particularly the Mediterranean Region. Rapid diagnosis for cachexia and non-cachexia variants is, thus, important for the management of HSVd in citrus and reduces the need for bioindexing and sequencing analysis. Copyright © 2013 Elsevier Ltd. All rights reserved.
Rapid Sequencing of Complete env Genes from Primary HIV-1 Samples
Eren, Kemal; Ignacio, Caroline; Landais, Elise; Weaver, Steven; Phung, Pham; Ludka, Colleen; Hepler, Lance; Caballero, Gemma; Pollner, Tristan; Guo, Yan; Richman, Douglas; Poignard, Pascal; Paxinos, Ellen E.; Kosakovsky Pond, Sergei L.
2016-01-01
The ability to study rapidly evolving viral populations has been constrained by the read length of next-generation sequencing approaches and the sampling depth of single-genome amplification methods. Here, we develop and characterize a method using Pacific Biosciences’ Single Molecule, Real-Time (SMRT®) sequencing technology to sequence multiple, intact full-length human immunodeficiency virus-1 env genes amplified from viral RNA populations circulating in blood, and provide computational tools for analyzing and visualizing these data. PMID:29492273
Kulkarni, Ketan Sakharam; Dave, Nandini; Saran, Shriyam; Garasia, Madhu; Parelkar, Sandesh
2018-04-01
During positive pressure ventilation, gastric inflation and subsequent pulmonary aspiration can occur. Rapid sequence induction (RSI) technique is an age-old formula to prevent this. We adopted a novel approach of RSI for patients with high risk of aspiration and evaluated it further in patients undergoing laparoscopic surgeries. We believe that, in patients with risk of gastric insufflation and pulmonary aspiration, transnasal humidified rapid-insufflation ventilatory exchange can be useful in facilitating pre- and apnoeic oxygenation till tracheal isolation is achieved.
Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András
2003-01-01
A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device, the transcript representatives were separated by size and fractions were automatically collected every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure were validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.
Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard
2009-05-01
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.
Goonesekere, Nalin Cw
2009-01-01
The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the structural classification of proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
High-throughput sequencing in veterinary infection biology and diagnostics.
Belák, S; Karlsson, O E; Leijon, M; Granberg, F
2013-12-01
Sequencing methods have improved rapidly since the first versions of the Sanger techniques, facilitating the development of very powerful tools for detecting and identifying various pathogens, such as viruses, bacteria and other microbes. The ongoing development of high-throughput sequencing (HTS; also known as next-generation sequencing) technologies has resulted in a dramatic reduction in DNA sequencing costs, making the technology more accessible to the average laboratory. In this White Paper of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine (Uppsala, Sweden), several approaches and examples of HTS are summarised, and their diagnostic applicability is briefly discussed. Selected future aspects of HTS are outlined, including the need for bioinformatic resources, with a focus on improving the diagnosis and control of infectious diseases in veterinary medicine.
Sulaiman, Irshad M.; Tang, Kevin; Osborne, John; Sammons, Scott; Wohlhueter, Robert M.
2007-01-01
We developed a set of seven resequencing GeneChips, based on the complete genome sequences of 24 strains of smallpox virus (variola virus), for rapid characterization of this human-pathogenic virus. Each GeneChip was designed to analyze a divergent segment of approximately 30,000 bases of the smallpox virus genome. This study includes the hybridization results of 14 smallpox virus strains. Of the 14 smallpox virus strains hybridized, only 7 had sequence information included in the design of the smallpox virus resequencing GeneChips; similar information for the remaining strains was not tiled as a reference in these GeneChips. By use of variola virus-specific primers and long-range PCR, 22 overlapping amplicons were amplified to cover nearly the complete genome and hybridized with the smallpox virus resequencing GeneChip set. These GeneChips were successful in generating nucleotide sequences for all 14 of the smallpox virus strains hybridized. Analysis of the data indicated that the GeneChip resequencing by hybridization was fast and reproducible and that the smallpox virus resequencing GeneChips could differentiate the 14 smallpox virus strains characterized. This study also suggests that high-density resequencing GeneChips have potential biodefense applications and may be used as an alternate tool for rapid identification of smallpox virus in the future. PMID:17182757
Using Next Generation Sequencing for Multiplexed Trait-Linked Markers in Wheat
Bernardo, Amy; Wang, Shan; St. Amand, Paul; Bai, Guihua
2015-01-01
With the advent of next generation sequencing (NGS) technologies, single nucleotide polymorphisms (SNPs) have become the major type of marker for genotyping in many crops. However, the availability of SNP markers for important traits of bread wheat (Triticum aestivum L.) that can be effectively used in marker-assisted selection (MAS) is still limited and SNP assays for MAS are usually uniplex. A shift from uniplex to multiplex assays will allow the simultaneous analysis of multiple markers and increase MAS efficiency. We designed 33 locus-specific markers from SNP or indel-based marker sequences that linked to 20 different quantitative trait loci (QTL) or genes of agronomic importance in wheat and analyzed the amplicon sequences using an Ion Torrent Proton Sequencer and a custom allele detection pipeline to determine the genotypes of 24 selected germplasm accessions. Among the 33 markers, 27 were successfully multiplexed and 23 had 100% SNP call rates. Results from analysis of "kompetitive allele-specific PCR" (KASP) and sequence tagged site (STS) markers developed from the same loci fully verified the genotype calls of 23 markers. The NGS-based multiplexed assay developed in this study is suitable for rapid and high-throughput screening of SNPs and some indel-based markers in wheat. PMID:26625271
CLAST: CUDA implemented large-scale alignment search tool.
Yano, Masahiro; Mori, Hiroshi; Akiyama, Yutaka; Yamada, Takuji; Kurokawa, Ken
2014-12-11
Metagenomics is a powerful methodology to study microbial communities, but it is highly dependent on nucleotide sequence similarity searching against sequence databases. Metagenomic analyses with next-generation sequencing technologies produce enormous numbers of reads from microbial communities, and many reads are derived from microbes whose genomes have not yet been sequenced, limiting the usefulness of existing sequence similarity search tools. Therefore, there is a clear need for a sequence similarity search tool that can rapidly detect weak similarity in large datasets. We developed a tool, which we named CLAST (CUDA implemented large-scale alignment search tool), that enables analyses of millions of reads and thousands of reference genome sequences, and runs on NVIDIA Fermi architecture graphics processing units. CLAST has four main advantages over existing alignment tools. First, CLAST was capable of identifying sequence similarities ~80.8 times faster than BLAST and 9.6 times faster than BLAT. Second, CLAST executes global alignment as the default (local alignment is also an option), enabling CLAST to assign reads to taxonomic and functional groups based on evolutionarily distant nucleotide sequences with high accuracy. Third, CLAST does not need a preprocessed sequence database like Burrows-Wheeler Transform-based tools, and this enables CLAST to incorporate large, frequently updated sequence databases. Fourth, CLAST requires <2 GB of main memory, making it possible to run CLAST on a standard desktop computer or server node. CLAST achieved very high speed (similar to the Burrows-Wheeler Transform-based Bowtie 2 for long reads) and sensitivity (equal to BLAST, BLAT, and FR-HIT) without the need for extensive database preprocessing or a specialized computing platform. 
Our results demonstrate that CLAST has the potential to be one of the most powerful and realistic approaches to analyze the massive amount of sequence data from next-generation sequencing technologies.
Why Blue stragglers formed via collisions may not be rapid rotators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leonard, P.J.T.; Clement, M.J.
1993-03-01
We propose that the blue stragglers formed via collisions may not be rapid rotators due to magnetic braking during a Hayashi phase as they approach the main sequence. It is conceivable that just the envelopes of the blue stragglers are spun down, while their cores remain rapidly rotating. This would greatly extend the main-sequence lifetimes of the blue stragglers produced by collisions.
Why Blue stragglers formed via collisions may not be rapid rotators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leonard, P.J.T.; Clement, M.J.
1993-01-01
We propose that the blue stragglers formed via collisions may not be rapid rotators due to magnetic braking during a Hayashi phase as they approach the main sequence. It is conceivable that just the envelopes of the blue stragglers are spun down, while their cores remain rapidly rotating. This would greatly extend the main-sequence lifetimes of the blue stragglers produced by collisions.
NASA Astrophysics Data System (ADS)
Eyles, Nicholas; Mullins, Henry T.; Hine, Albert C.
1991-09-01
This paper presents the first detailed data regarding the newly discovered deep infill of Okanagan Lake. Okanagan Lake (50°00'N, 119°30'W) is 120 km long, ~3-5 km wide and occupies a glacially overdeepened bedrock basin in the southern interior of British Columbia. This basin, and other elongate lakes of the region (e.g. Shuswap, Kootenay, Kalamalka, Canim and Mahood lakes), mark the site of westward flowing ice streams within successive Cordilleran ice sheets. An air gun seismic survey of Okanagan Lake shows that the bedrock floor is nearly 650 m below sea-level, more than 2000 m below the rim of the surrounding plateau. The maximum thickness of Pleistocene sediment in Okanagan Lake basin approaches 800 m. Forty-six seismic reflection traverses and an axial profile show a relatively simple stratigraphy composed of three seismic sequences argued to be no older than the last glacial cycle (< 30 ka). A discontinuous basal unit (sequence I) characterized by large-scale diffractions, and up to 460 m thick, infills the narrow, V-shaped bedrock floor of the basin and is interpreted as a boulder gravel deposited by subglacial meltwaters. Overlying seismic sequence II is composed of two sub-sequences. Sub-sequence IIa is a chaotic to massive facies up to 736 m thick. Lakeshore exposures close to where this unit reaches lake level show deformed and chaotically-bedded glaciolacustrine silts containing gravel lenses and large ice-rafted boulders. The surface topography of this sub-sequence is irregular and in general mimics the form of the underlying bedrock as a result of compaction. This sequence passes laterally into stratified facies (sub-sequence IIb) at the northern end of the basin. Seismic sequence II appears to record rapid ice-proximal dumping of glaciolacustrine silt as the Okanagan glacier backwasted upvalley in a deep lake. A thin (60 m max.) 
laminated seismic sequence (III) drapes the hummocky surface of sequence II and represents postglacial sedimentation from fan-deltas. The extreme thickness of sequences I and II in Okanagan Lake reflects the focussing of large volumes of meltwater and sediment into the basin during deglaciation; pre-existing sediments that pre-date the last glacial cycle appear to have been completely eroded. Glaciological conditions during sedimentation may have been similar to marine-based outlet glaciers calving in deep water in fiord basins. In contrast to marine settings where icebergs are free to disperse, large volumes of dead ice were trapped within the basin; structural evidence for sedimentation around dead ice blocks has been previously used to argue that the Cordilleran Ice Sheet downwasted in situ. We emphasize, in contrast, the trapping of dead ice left behind by rapidly calving lake-based outlet glaciers.
NASA Astrophysics Data System (ADS)
Sherwood, R.; Mutz, D.; Estlin, T.; Chien, S.; Backes, P.; Norris, J.; Tran, D.; Cooper, B.; Rabideau, G.; Mishkin, A.; Maxwell, S.
2001-07-01
This article discusses a proof-of-concept prototype for ground-based automatic generation of validated rover command sequences from high-level science and engineering activities. This prototype is based on ASPEN, the Automated Scheduling and Planning Environment. This artificial intelligence (AI)-based planning and scheduling system will automatically generate a command sequence that will execute within resource constraints and satisfy flight rules. An automated planning and scheduling system encodes rover design knowledge and uses search and reasoning techniques to automatically generate low-level command sequences while respecting rover operability constraints, science and engineering preferences, environmental predictions, and also adhering to hard temporal constraints. This prototype planning system has been field-tested using the Rocky 7 rover at JPL and will be field-tested on more complex rovers to prove its effectiveness before transferring the technology to flight operations for an upcoming NASA mission. Enabling goal-driven commanding of planetary rovers greatly reduces the requirements for highly skilled rover engineering personnel. This in turn greatly reduces mission operations costs. In addition, goal-driven commanding permits a faster response to changes in rover state (e.g., faults) or science discoveries by removing the time-consuming manual sequence validation process, allowing rapid "what-if" analyses, and thus reducing overall cycle times.
Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities
Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan
2016-01-01
Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To know the composition of each sample, we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of operational taxonomic units (OTUs) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaete-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802
The genomic landscape of rapid, repeated evolutionary rescue from toxic pollution in wild fish
USDA-ARS's Scientific Manuscript database
Here we describe evolutionary rescue from intense pollution via multiple modes of selection in killifish populations from 4 urban estuaries of the US eastern seaboard. Comparative transcriptomics and analysis of 384 whole genome sequences show that the functioning of a receptor-based signaling pathw...
Teaching Biology for a Sustainable Future
ERIC Educational Resources Information Center
Musante, Susan
2011-01-01
Students at Calvin College in Grand Rapids, Michigan, can now take an innovative biology course in which an integrated, interdisciplinary, problem-based approach is used--one that the scientific community itself is promoting. The first course in a four-semester sequence, Biology 123--The Living World: Concepts and Connections--explores real-world…
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets
2010-01-01
Background An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, enables users to interpret INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating such reads, is therefore of significant value. Findings We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
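The tag-counting idea in the abstract above (counting how many sequenced reads fall within the genomic range of each annotation) can be illustrated with a minimal Python sketch. This is not TASE itself, which the abstract describes as a Java application with a SQL Server backend; the function name `count_tags`, the tuple layouts, and the example coordinates are all assumptions made for illustration.

```python
from collections import Counter


def count_tags(reads, annotations):
    """Count reads whose start position lies inside each annotated range.

    reads:       iterable of (chromosome, position) tuples (hypothetical layout)
    annotations: iterable of (gene, chromosome, start, end) tuples, with
                 start and end inclusive (hypothetical layout)
    """
    counts = Counter()
    for chrom, pos in reads:
        for gene, a_chrom, start, end in annotations:
            # A read contributes one tag to every annotation that contains it.
            if chrom == a_chrom and start <= pos <= end:
                counts[gene] += 1
    return counts


# Illustrative data only; none of these values come from the paper.
example_annotations = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 201, 300)]
example_reads = [("chr1", 150), ("chr1", 250), ("chr2", 150), ("chr1", 199)]
tag_counts = count_tags(example_reads, example_annotations)
```

The nested loop is quadratic and only suitable for toy inputs; a real tool would index the annotations (for example with sorted intervals or a database query, as the abstract suggests TASE does).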
Development of an ELA-DRA gene typing method based on pyrosequencing technology.
Díaz, S; Echeverría, M G; It, V; Posik, D M; Rogberg-Muñoz, A; Pena, N L; Peral-García, P; Vega-Pla, J L; Giovambattista, G
2008-11-01
The polymorphism of equine lymphocyte antigen (ELA) class II DRA gene had been detected by polymerase chain reaction-single-strand conformational polymorphism (PCR-SSCP) and reference strand-mediated conformation analysis. These methodologies allowed the identification of 11 ELA-DRA exon 2 sequences, three of which are widely distributed among domestic horse breeds. Herein, we describe the development of a pyrosequencing-based method applicable to ELA-DRA typing, by screening samples from eight different horse breeds previously typed by PCR-SSCP. This sequence-based method would be useful in high-throughput genotyping of major histocompatibility complex genes in horses and other animal species, making this system interesting as a rapid screening method for animal genotyping of immune-related genes.
Characterization of genetic variability of Venezuelan equine encephalitis viruses
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...
2016-04-07
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Rapid Detection of Powassan Virus in a Patient With Encephalitis by Metagenomic Sequencing.
Piantadosi, Anne; Kanjilal, Sanjat; Ganesh, Vijay; Khanna, Arjun; Hyle, Emily P; Rosand, Jonathan; Bold, Tyler; Metsky, Hayden C; Lemieux, Jacob; Leone, Michael J; Freimark, Lisa; Matranga, Christian B; Adams, Gordon; McGrath, Graham; Zamirpour, Siavash; Telford, Sam; Rosenberg, Eric; Cho, Tracey; Frosch, Matthew P; Goldberg, Marcia B; Mukerji, Shibani S; Sabeti, Pardis C
2018-02-10
We describe a patient with severe and progressive encephalitis of unknown etiology. We performed rapid metagenomic sequencing from cerebrospinal fluid and identified Powassan virus, an emerging tick-borne flavivirus that has been increasingly detected in the United States.
Pankhurst, Louise J; del Ojo Elias, Carlos; Votintseva, Antonina A; Walker, Timothy M; Cole, Kevin; Davies, Jim; Fermont, Jilles M; Gascoyne-Binzi, Deborah M; Kohl, Thomas A; Kong, Clare; Lemaitre, Nadine; Niemann, Stefan; Paul, John; Rogers, Thomas R; Roycroft, Emma; Smith, E Grace; Supply, Philip; Tang, Patrick; Wilcox, Mark H; Wordsworth, Sarah; Wyllie, David; Xu, Li; Crook, Derrick W
2016-01-01
Summary Background Slow and cumbersome laboratory diagnostics for Mycobacterium tuberculosis complex (MTBC) risk delayed treatment and poor patient outcomes. Whole-genome sequencing (WGS) could potentially provide a rapid and comprehensive diagnostic solution. In this prospective study, we compare real-time WGS with routine MTBC diagnostic workflows. Methods We compared sequencing mycobacteria from all newly positive liquid cultures with routine laboratory diagnostic workflows across eight laboratories in Europe and North America for diagnostic accuracy, processing times, and cost between Sept 6, 2013, and April 14, 2014. We sequenced specimens once using local Illumina MiSeq platforms and processed data centrally using a semi-automated bioinformatics pipeline. We identified species or complex using gene presence or absence, predicted drug susceptibilities from resistance-conferring mutations identified from reference-mapped MTBC genomes, and calculated genetic distance to previously sequenced UK MTBC isolates to detect outbreaks. WGS data processing and analysis was done by staff masked to routine reference laboratory and clinical results. We also did a microcosting analysis to assess the financial viability of WGS-based diagnostics. Findings Compared with routine results, WGS predicted species with 93% (95% CI 90–96; 322 of 345 specimens; 356 mycobacteria specimens submitted) accuracy and drug susceptibility also with 93% (91–95; 628 of 672 specimens; 168 MTBC specimens identified) accuracy, with one sequencing attempt. WGS linked 15 (16% [95% CI 10–26]) of 91 UK patients to an outbreak. WGS diagnosed a case of multidrug-resistant tuberculosis before routine diagnosis was completed and discovered a new multidrug-resistant tuberculosis cluster. 
Full WGS diagnostics could be generated in a median of 9 days (IQR 6–10), a median of 21 days (IQR 14–32) faster than final reference laboratory reports were produced (median of 31 days [IQR 21–44]), at a cost of £481 per culture-positive specimen, whereas routine diagnosis costs £518, equating to a WGS-based diagnosis cost that is 7% cheaper annually than are present diagnostic workflows. Interpretation We have shown that WGS has a scalable, rapid turnaround, and is a financially feasible method for full MTBC diagnostics. Continued improvements to mycobacterial processing, bioinformatics, and analysis will improve the accuracy, speed, and scope of WGS-based diagnosis. Funding National Institute for Health Research, Department of Health, Wellcome Trust, British Columbia Centre for Disease Control Foundation for Population and Public Health, Department of Clinical Microbiology, Trinity College Dublin. PMID:26669893
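The headline percentages in this abstract can be re-derived from the counts and costs it reports. The short Python check below is only an arithmetic illustration; the helper name `percent` is an assumption, and all inputs are the figures quoted in the abstract.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Figures quoted in the abstract.
species_accuracy = percent(322, 345)         # species prediction: 322 of 345
susceptibility_accuracy = percent(628, 672)  # drug susceptibility: 628 of 672
cost_saving = percent(518 - 481, 518)        # GBP 481 (WGS) vs GBP 518 (routine)

print(species_accuracy, susceptibility_accuracy, cost_saving)  # 93.3 93.5 7.1
```

These values are consistent with the reported 93% accuracies and the 7% cost reduction.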
Drakatos, Panagis; Patel, Kishankumar; Thakrar, Chiraag; Williams, Adrian J; Kent, Brian D; Leschziner, Guy D
2016-04-01
Current treatment recommendations for narcolepsy suggest that modafinil should be used as a first-line treatment ahead of conventional stimulants or sodium oxybate. In this study, performed in a tertiary sleep disorders centre, treatment responses were examined following these recommendations, as was the ability of sleep-stage sequencing of sleep-onset rapid eye movement periods in the multiple sleep latency test to predict treatment response. Over a 3.5-year period, 255 patients were retrospectively identified in the authors' database as patients diagnosed with narcolepsy, type 1 (with cataplexy) or type 2 (without) using clinical and polysomnographic criteria. Eligible patients were examined in detail, sleep study data were abstracted and sleep-stage sequencing of sleep-onset rapid eye movement periods was analysed. Response to treatment was graded utilizing an internally developed scale. Seventy-five patients were included (39% males). Forty (53%) were diagnosed with type 1 narcolepsy with a mean follow-up of 2.37 ± 1.35 years. Ninety-seven percent of the patients were initially started on modafinil, and overall 59% reported complete response on the last follow-up. Twenty-nine patients (39%) had the sequence of sleep stage 1 or wake to rapid eye movement in all of their sleep-onset rapid eye movement periods, with most of these diagnosed as narcolepsy type 1 (72%). The presence of this specific sleep-stage sequence in all sleep-onset rapid eye movement periods was associated with worse treatment response (P = 0.0023). Sleep-stage sequence analysis of sleep-onset rapid eye movement periods in the multiple sleep latency test may aid the prediction of treatment response in narcoleptics and provide a useful prognostic tool in clinical practice, above and beyond their classification as narcolepsy type 1 or 2. © 2015 European Sleep Research Society.
Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou
2014-07-01
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
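The second stage described above uses the Kimura two-parameter (K2P) distance with a nearest-neighbour rule. A minimal Python sketch of that distance follows, using the standard formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. This is not the VIP Barcoding implementation; the function names and the toy reference set are assumptions for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    transitions = sum(
        1 for a, b in pairs
        if a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)
    )
    transversions = sum(1 for a, b in pairs if a != b) - transitions
    p, q = transitions / n, transversions / n
    # math.log raises ValueError when the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def nearest_reference(query, references):
    """Return the name of the reference with the smallest K2P distance."""
    return min(references, key=lambda name: k2p_distance(query, references[name]))


# Toy reference set, for illustration only.
toy_references = {"species_1": "ACGTACGTAC", "species_2": "ACGTTTGTAC"}
best_match = nearest_reference("ACGTACGTAT", toy_references)
```

In the two-stage scheme the abstract describes, a function like `nearest_reference` would be applied only to the candidates that survive the alignment-free composition vector screen.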
NASA Astrophysics Data System (ADS)
Lau, Han Yih; Wu, Haoqi; Wee, Eugene J. H.; Trau, Matt; Wang, Yuling; Botella, Jose R.
2017-01-01
Developing quick and sensitive molecular diagnostics for plant pathogen detection is challenging. Herein, a nanoparticle based electrochemical biosensor was developed for rapid and sensitive detection of plant pathogen DNA on disposable screen-printed carbon electrodes. This 60 min assay relied on the rapid isothermal amplification of target pathogen DNA sequences by recombinase polymerase amplification (RPA) followed by gold nanoparticle-based electrochemical assessment with differential pulse voltammetry (DPV). Our method was 10,000 times more sensitive than conventional polymerase chain reaction (PCR)/gel electrophoresis and could readily identify P. syringae infected plant samples even before the disease symptoms were visible. On the basis of the speed, sensitivity, simplicity and portability of the approach, we believe the method has potential as a rapid disease management solution for applications in agriculture diagnostics.
Lau, Han Yih; Wu, Haoqi; Wee, Eugene J H; Trau, Matt; Wang, Yuling; Botella, Jose R
2017-01-17
Developing quick and sensitive molecular diagnostics for plant pathogen detection is challenging. Herein, a nanoparticle based electrochemical biosensor was developed for rapid and sensitive detection of plant pathogen DNA on disposable screen-printed carbon electrodes. This 60 min assay relied on the rapid isothermal amplification of target pathogen DNA sequences by recombinase polymerase amplification (RPA) followed by gold nanoparticle-based electrochemical assessment with differential pulse voltammetry (DPV). Our method was 10,000 times more sensitive than conventional polymerase chain reaction (PCR)/gel electrophoresis and could readily identify P. syringae infected plant samples even before the disease symptoms were visible. On the basis of the speed, sensitivity, simplicity and portability of the approach, we believe the method has potential as a rapid disease management solution for applications in agriculture diagnostics.
Pulseq: A rapid and hardware-independent pulse sequence prototyping framework.
Layton, Kelvin J; Kroboth, Stefan; Jia, Feng; Littin, Sebastian; Yu, Huijun; Leupold, Jochen; Nielsen, Jon-Fredrik; Stöcker, Tony; Zaitsev, Maxim
2017-04-01
Implementing new magnetic resonance experiments, or sequences, often involves extensive programming on vendor-specific platforms, which can be time consuming and costly. This situation is exacerbated when research sequences need to be implemented on several platforms simultaneously, for example, at different field strengths. This work presents an alternative programming environment that is hardware-independent, open-source, and promotes rapid sequence prototyping. A novel file format is described to efficiently store the hardware events and timing information required for an MR pulse sequence. Platform-dependent interpreter modules convert the file to appropriate instructions to run the sequence on MR hardware. Sequences can be designed in high-level languages, such as MATLAB, or with a graphical interface. Spin physics simulation tools are incorporated into the framework, allowing for comparison between real and virtual experiments. Minimal effort is required to implement relatively advanced sequences using the tools provided. Sequences are executed on three different MR platforms, demonstrating the flexibility of the approach. A high-level, flexible and hardware-independent approach to sequence programming is ideal for the rapid development of new sequences. The framework is currently not suitable for large patient studies or routine scanning although this would be possible with deeper integration into existing workflows. Magn Reson Med 77:1544-1552, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Flow cytometry for enrichment and titration in massively parallel DNA sequencing
Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim
2009-01-01
Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
NASA Astrophysics Data System (ADS)
Song, Yang; Laskay, Ünige A.; Vilcins, Inger-Marie E.; Barbour, Alan G.; Wysocki, Vicki H.
2015-11-01
Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.
Mapping DNA polymerase errors by single-molecule sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, David F.; Lu, Jenny; Chang, Seungwoo
Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.
Mapping DNA polymerase errors by single-molecule sequencing
Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...
2016-05-16
Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.
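The barcoding strategy described above (comparing multiple reads of the same tagged product so that sequencing errors can be found and removed) can be sketched as a per-position majority vote within each barcode group. The Python below is an illustrative sketch only, not the authors' pipeline; the function name, the `min_reads` threshold, and the example reads are assumptions.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2):
    """Build a majority-vote consensus sequence for each barcode.

    reads:     iterable of (barcode, sequence) tuples (hypothetical layout)
    min_reads: barcodes seen fewer times than this are skipped, because a
               single read cannot separate sequencing errors from real ones
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        length = min(len(s) for s in seqs)  # ignore ragged read ends
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


# Illustrative reads: the third read of barcode AAC carries a sequencing error.
example_reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"), ("GGT", "TTTT")]
consensus = consensus_by_barcode(example_reads)
```

A mismatch that survives in the consensus is shared by most reads of one product, so it is more likely to reflect a polymerase error made before amplification than a sequencing error. Ties at a position are resolved here by first occurrence, which a real analysis would need to handle more carefully.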
Discriminative motif optimization based on perceptron training
Patel, Ronak Y.; Stormo, Gary D.
2014-01-01
Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated using next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and the length of each sequence is relatively long. Most traditional motif finders are slow in handling such an enormous amount of data. To overcome this limitation, tools have been developed that compromise accuracy with speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. Availability and implementation: DiMO is available at http://stormo.wustl.edu/DiMO Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com PMID:24369152
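The AUC used above as the measure of a motif's discriminating power equals the probability that a randomly chosen positive sequence scores higher than a randomly chosen negative one. The Python sketch below computes it directly from two lists of motif scores. It illustrates the metric only and does not reproduce DiMO's perceptron-based optimization; the function name and the example scores are assumptions.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve from positive and negative score lists.

    Counts the fraction of (positive, negative) pairs in which the positive
    sequence scores higher; ties contribute one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical motif scores for bound (positive) and unbound (negative) sequences.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3])  # 5 of 6 pairs ranked correctly
```

The pairwise loop is quadratic; for datasets of ChIP-seq size a rank-based computation would be the usual choice.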
Richard, François D; Kajava, Andrey V
2014-06-01
The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
Tracking B-Cell Repertoires and Clonal Histories in Normal and Malignant Lymphocytes.
Weston-Bell, Nicola J; Cowan, Graeme; Sahota, Surinder S
2017-01-01
Methods for tracking B-cell repertoires and clonal history in normal and malignant B-cells based on immunoglobulin variable region (IGV) gene analysis have developed rapidly with the advent of massive parallel next-generation sequencing (mpNGS) protocols. mpNGS permits a depth of analysis of IGV genes not hitherto feasible, and presents challenges of bioinformatics analysis, which can be readily met by current pipelines. This strategy offers a potential resolution of B-cell usage at a depth that may capture fully the natural state, in a given biological setting. Conventional methods based on RT-PCR amplification and Sanger sequencing are also available where mpNGS is not accessible. Each method offers distinct advantages. Conventional methods for IGV gene sequencing are readily adaptable to most laboratories and provide an ease of analysis to capture salient features of B-cell use. This chapter describes two methods in detail for analysis of IGV genes, mpNGS and conventional RT-PCR with Sanger sequencing.
Electrophoretic mobility shift scanning using an automated infrared DNA sequencer.
Sano, M; Ohyama, A; Takase, K; Yamamoto, M; Machida, M
2001-11-01
Electrophoretic mobility shift assay (EMSA) is widely used in the study of sequence-specific DNA-binding proteins, including transcription factors and mismatch binding proteins. We have established a non-radioisotope-based protocol for EMSA that features an automated DNA sequencer with an infrared fluorescent dye (IRDye) detection unit. Our modification of the electrophoresis unit, which includes cooling the gel plates with a reduced well-to-read length, has made it possible to detect shifted bands within 1 h. Further, we have developed a rapid ligation-based method for generating IRDye-labeled probes with an approximately 60% cost reduction. This method has the advantages of real-time scanning, stability of labeled probes, and better safety associated with nonradioactive methods of detection. Analysis of a promoter from an industrially important filamentous fungus, Aspergillus oryzae, in a prototype experiment revealed that the method we describe has potential for use in systematic scanning and identification of the functionally important elements to which cellular factors bind in a sequence-specific manner.
Use of simulated data sets to evaluate the fidelity of metagenomic processing methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K; Ivanova, N; Barry, Kerrie
2007-01-01
Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene-finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
Use of simulated data sets to evaluate the fidelity of Metagenomic processing methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, Konstantinos; Ivanova, Natalia; Barry, Kerri
2006-12-01
Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments
Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin
2017-01-01
Motivation: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. Results: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. Availability and implementation: MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27605100
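The negative-control operations the abstract attributes to the Transmutation tool (scrambling, reversing, complementing, and introducing random mutations) are simple string transformations. The Python sketch below shows one way each could look; it is not MPRAnator's code or API, and every function name and the `seed` parameter are assumptions made for illustration.

```python
import random

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Return the base-by-base complement (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)


def scramble(seq, seed=None):
    """Shuffle the bases, preserving composition but not order."""
    rng = random.Random(seed)
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def random_mutations(seq, n, seed=None):
    """Replace n distinct positions with a different base chosen at random."""
    rng = random.Random(seed)
    chars = list(seq)
    for i in rng.sample(range(len(chars)), n):
        chars[i] = rng.choice([b for b in "ACGT" if b != chars[i]])
    return "".join(chars)


motif = "ACGTACGT"  # illustrative motif, not taken from the paper
controls = {
    "reversed": reverse(motif),
    "complemented": complement(motif),
    "scrambled": scramble(motif, seed=1),
    "mutated": random_mutations(motif, 3, seed=1),
}
```

Passing a fixed `seed` makes the scrambled and mutated controls reproducible, which is usually desirable when a design has to be regenerated.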
MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments.
Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin
2017-01-01
With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
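The four negative-control operations named in the abstract above (scrambling, reversing, complementing, random mutation) can be illustrated with a minimal Python sketch. This is not MPRAnator's code or API: the function names, the fixed random seed and the example motif are all assumptions made for illustration.

```python
import random

# Illustrative only: these helpers are hypothetical and are NOT taken from
# the MPRAnator source; they just mirror the operations the abstract names.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Return the base-wise complement (A<->T, C<->G), same orientation."""
    return seq.translate(_COMPLEMENT)


def scramble(seq, rng):
    """Shuffle the bases, preserving base composition but not order."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def mutate(seq, n_mutations, rng):
    """Substitute n_mutations distinct positions with a different base."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        choices = [b for b in "ACGT" if b != bases[pos].upper()]
        bases[pos] = rng.choice(choices)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the controls are reproducible
    motif = "GATTACA"        # hypothetical input motif
    print("reversed:    ", reverse(motif))
    print("complemented:", complement(motif))
    print("scrambled:   ", scramble(motif, rng))
    print("mutated (2): ", mutate(motif, 2, rng))
```

Passing an explicit `random.Random` instance keeps the generated controls reproducible between runs, which matters if the same library design has to be regenerated later.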
Thiel, William H.; Bair, Thomas; Peek, Andrew S.; Liu, Xiuying; Dassie, Justin; Stockdale, Katie R.; Behlke, Mark A.; Miller, Francis J.; Giangrande, Paloma H.
2012-01-01
Background The broad applicability of RNA aptamers as cell-specific delivery tools for therapeutic reagents depends on the ability to identify aptamer sequences that selectively access the cytoplasm of distinct cell types. Towards this end, we have developed a novel approach that combines a cell-based selection method (cell-internalization SELEX) with high-throughput sequencing (HTS) and bioinformatics analyses to rapidly identify cell-specific, internalization-competent RNA aptamers. Methodology/Principal Findings We demonstrate the utility of this approach by enriching for RNA aptamers capable of selective internalization into vascular smooth muscle cells (VSMCs). Several rounds of positive (VSMCs) and negative (endothelial cells; ECs) selection were performed to enrich for aptamer sequences that preferentially internalize into VSMCs. To identify candidate RNA aptamer sequences, HTS data from each round of selection were analyzed using bioinformatics methods: (1) metrics of selection enrichment; and (2) pairwise comparisons of sequence and structural similarity, termed edit and tree distance, respectively. Correlation analyses of experimentally validated aptamers or rounds revealed that the best cell-specific, internalizing aptamers are enriched as a result of the negative selection step performed against ECs. Conclusions and Significance We describe a novel approach that combines cell-internalization SELEX with HTS and bioinformatics analysis to identify cell-specific, cell-internalizing RNA aptamers. Our data highlight the importance of performing a pre-clear step against a non-target cell in order to select for cell-specific aptamers. We expect the extended use of this approach to enable the identification of aptamers to a multitude of different cell types, thereby facilitating the broad development of targeted cell therapies. PMID:22962591
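The abstract above compares aptamer sequences pairwise by edit distance. A minimal sketch follows, assuming "edit distance" means the standard Levenshtein distance (insertions, deletions and substitutions each cost 1); the abstract does not state the exact definition used, and the function name is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs).

    Assumption: unit-cost insert/delete/substitute; the abstract does not
    specify the scoring actually used by the authors.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Toy RNA strings, invented purely for illustration.
    print(edit_distance("GGAUCCGA", "GGAUCGA"))
```

Keeping only two rows of the dynamic-programming table bounds memory by the length of the shorter dimension, which helps when many pairs from a high-throughput run are compared.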
Ancient pathogen DNA in archaeological samples detected with a Microbial Detection Array.
Devault, Alison M; McLoughlin, Kevin; Jaing, Crystal; Gardner, Shea; Porter, Teresita M; Enk, Jacob M; Thissen, James; Allen, Jonathan; Borucki, Monica; DeWitte, Sharon N; Dhody, Anna N; Poinar, Hendrik N
2014-03-06
Ancient human remains of paleopathological interest typically contain highly degraded DNA in which pathogenic taxa are often minority components, making sequence-based metagenomic characterization costly. Microarrays may hold a potential solution to these challenges, offering a rapid, affordable, and highly informative snapshot of microbial diversity in complex samples without the lengthy analysis and/or high cost associated with high-throughput sequencing. Their versatility is well established for modern clinical specimens, but they have yet to be applied to ancient remains. Here we report bacterial profiles of archaeological and historical human remains using the Lawrence Livermore Microbial Detection Array (LLMDA). The array successfully identified previously-verified bacterial human pathogens, including Vibrio cholerae (cholera) in a 19th century intestinal specimen and Yersinia pestis ("Black Death" plague) in a medieval tooth, which represented only minute fractions (0.03% and 0.08% alignable high-throughput shotgun sequencing reads) of their respective DNA content. This demonstrates that the LLMDA can identify primary and/or co-infecting bacterial pathogens in ancient samples, thereby serving as a rapid and inexpensive paleopathological screening tool to study health across both space and time.
Li, Ruichao; Xie, Miaomiao; Dong, Ning; Lin, Dachuan; Yang, Xuemei; Wong, Marcus Ho Yin; Chan, Edward Wai-Chi; Chen, Sheng
2018-03-01
Multidrug resistance (MDR)-encoding plasmids are considered major molecular vehicles responsible for transmission of antibiotic resistance genes among bacteria of the same or different species. Delineating the complete sequences of such plasmids could provide valuable insight into the evolution and transmission mechanisms underlying bacterial antibiotic resistance development. However, due to the presence of multiple repeats of mobile elements, complete sequencing of MDR plasmids remains technically complicated, expensive, and time-consuming. Here, we demonstrate a rapid and efficient approach to obtaining multiple MDR plasmid sequences through the use of the MinION nanopore sequencing platform, which is incorporated in a portable device. By assembling the long sequencing reads generated by a single MinION run according to a rapid barcoding sequencing protocol, we obtained the complete sequences of 20 plasmids harbored by multiple bacterial strains. Importantly, single long reads covering a plasmid end-to-end were recorded, indicating that de novo assembly may be unnecessary if the single reads exhibit high accuracy. This workflow represents a convenient and cost-effective approach for systematic assessment of MDR plasmids responsible for treatment failure of bacterial infections, offering the opportunity to perform detailed molecular epidemiological studies to probe the evolutionary and transmission mechanisms of MDR-encoding elements.
[Computational chemistry in structure-based drug design].
Cao, Ran; Li, Wei; Sun, Han-Zi; Zhou, Yu; Huang, Niu
2013-07-01
Today, the understanding of the sequence and structure of biologically relevant targets is growing rapidly and researchers from many disciplines, physics and computational science in particular, are making significant contributions to modern biology and drug discovery. However, it remains challenging to rationally design small molecular ligands with desired biological characteristics based on the structural information of the drug targets, which demands more accurate calculation of ligand binding free-energy. With the rapid advances in computer power and extensive efforts in algorithm development, physics-based computational chemistry approaches have played more important roles in structure-based drug design. Here we reviewed the newly developed computational chemistry methods in structure-based drug design as well as the elegant applications, including binding-site druggability assessment, large scale virtual screening of chemical database, and lead compound optimization. Importantly, here we address the current bottlenecks and propose practical solutions.
Rapid and accurate pyrosequencing of angiosperm plastid genomes
Moore, Michael J; Dhingra, Amit; Soltis, Pamela S; Shaw, Regina; Farmerie, William G; Folta, Kevin M; Soltis, Douglas E
2006-01-01
Background Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. 
More importantly, the high accuracy observed in the GS 20 plastid genome sequence was generated for a significant reduction in time and cost over traditional shotgun-based genome sequencing techniques, although with approximately half the coverage of previously reported GS 20 de novo genome sequence. The GS 20 should be broadly applicable to angiosperm plastid genome sequencing, and therefore promises to expand the scale of plant genetic and phylogenetic research dramatically. PMID:16934154
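Because most of the GS 20 errors reported above were associated with homopolymer runs, a first screening step is simply to locate such runs in a sequence. The sketch below is an illustration under stated assumptions: the function name, the default threshold of 5 nucleotides (taken from the figure quoted in the abstract) and the toy sequence are not from the original study.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=5):
    """Return (start, base, length) for each run of >= min_len identical bases.

    Positions are 0-based. min_len=5 mirrors the threshold mentioned in the
    abstract; it is an illustrative default, not the authors' pipeline.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    toy = "ACGTTTTTTACAAAAAG"  # invented example sequence
    for start, base, length in homopolymer_runs(toy):
        print(f"{base} x{length} at position {start}")
```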
Rapid detection of Mannheimia haemolytica in lung tissues of sheep and from bacterial culture.
Kumar, Jyoti; Dixit, Shivendra Kumar; Kumar, Rajiv
2015-09-01
This study aimed to detect Mannheimia haemolytica in lung tissues of sheep and from a bacterial culture. M. haemolytica is one of the most important and well-established etiological agents of pneumonia in sheep and other ruminants throughout the world. Accurate diagnosis of M. haemolytica primarily relies on bacteriological examination, biochemical characteristics, and biotyping and serotyping of the isolates. In an effort to facilitate rapid M. haemolytica detection, polymerase chain reaction assays targeting the Pasteurella haemolytica serotype-1 specific antigens (PHSSA), Rpt2 and 12S ribosomal RNA (rRNA) genes were used to detect M. haemolytica directly from lung tissues and from bacterial culture. A total of 12 archived lung tissues from sheep that died of pneumonia on an organized farm were used. A multiplex polymerase chain reaction (mPCR) based on two amplicons targeting the PHSSA and Rpt2 genes of M. haemolytica was used for identification of M. haemolytica isolates in culture from the lung samples. All the 12 lung tissue samples were tested for the presence of M. haemolytica by PCR based on the PHSSA and Rpt2 genes, with confirmation by sequencing of the amplicons. All the 12 lung tissue samples tested for the presence of PHSSA and Rpt2 genes of M. haemolytica by mPCR were found to be positive. Amplification of the 12S rRNA gene fragment as internal amplification control was obtained with each mPCR reaction performed from DNA extracted directly from lung tissue samples. All the M. haemolytica were also positive for mPCR. No amplified DNA bands were observed for negative control reactions. All the three nucleotide sequences were deposited in NCBI GenBank (Accession No. KJ534629, KJ534630 and KJ534631). Sequencing of the amplified products revealed an identity of 99-100% with published sequences of the PHSSA and Rpt2 genes of M. haemolytica available in the NCBI database. 
Sheep specific mitochondrial 12S rRNA gene sequence also revealed the identity of 98% with published sequences in the NCBI database. The present study emphasized the PCR as a valuable tool for rapid detection of M. haemolytica in clinical samples from animals. In addition, it offers the opportunity to perform large-scale epidemiological studies regarding the role of M. haemolytica in clinical cases of pneumonia and other disease manifestations in sheep and other ruminants, thereby providing the basis for effective preventive strategies.
Nanopores and nucleic acids: prospects for ultrarapid sequencing
NASA Technical Reports Server (NTRS)
Deamer, D. W.; Akeson, M.
2000-01-01
DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.
Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri
2013-02-19
Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput, and due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is beginning to be used in the clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, regions of a gene typically not tested for mutations, like deep intronic and promoter mutations, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting using standard PCR based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
A disruptive sequencer meets disruptive publishing.
Loman, Nick; Goodwin, Sarah; Jansen, Hans; Loose, Matt
2015-01-01
Nanopore sequencing was recently made available to users in the form of the Oxford Nanopore MinION. Released to users through an early access programme, the MinION is made unique by its tiny form factor and ability to generate very long sequences from single DNA molecules. The platform is undergoing rapid evolution with three distinct nanopore types and five updates to library preparation chemistry in the last 18 months. To keep pace with the rapid evolution of this sequencing platform, and to provide a space where new analysis methods can be openly discussed, we present a new F1000Research channel devoted to updates to and analysis of nanopore sequence data.
Axe: rapid, competitive sequence read demultiplexing using a trie.
Murray, Kevin D; Borevitz, Justin O
2018-06-01
We describe a rapid algorithm for demultiplexing DNA sequence reads with in-read indices. Axe selects the optimal index present in a sequence read, even in the presence of sequencing errors. The algorithm is able to handle combinatorial indexing, indices of differing length, and several mismatches per index sequence. Axe is implemented in C, and is used as a command-line program on Unix-like systems. Axe is available online at https://github.com/kdmurray91/axe, and is available in Debian/Ubuntu distributions of GNU/Linux as the package axe-demultiplexer. Kevin Murray axe@kdmurray.id.au. Supplementary data are available at Bioinformatics online.
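To make the trie idea concrete, the following is a minimal Python sketch of index matching with mismatch tolerance and indices of differing length. It is not Axe's implementation (Axe is written in C, and its internal data structures are not described in the abstract). The sketch assumes one simple strategy: every variant within a given Hamming distance of each index is inserted into the trie, variants claimed by more than one sample are dropped as ambiguous, and a read is assigned to the longest matching prefix. All names and the toy indices are hypothetical.

```python
from itertools import combinations, product

_SAMPLE = "sample"  # multi-character key, cannot collide with a single base


def variants(index, max_mismatches):
    """All sequences within Hamming distance max_mismatches of index."""
    seen = {index}
    for k in range(1, max_mismatches + 1):
        for positions in combinations(range(len(index)), k):
            for subs in product("ACGT", repeat=k):
                bases = list(index)
                for p, b in zip(positions, subs):
                    bases[p] = b
                seen.add("".join(bases))
    return seen


def build_trie(indices, max_mismatches=1):
    """Build a nested-dict trie from {sample_name: index_sequence}."""
    owners = {}
    for sample, index in indices.items():
        for v in variants(index, max_mismatches):
            owners.setdefault(v, set()).add(sample)
    root = {}
    for v, samples in owners.items():
        if len(samples) != 1:
            continue  # variant reachable from two indices: ambiguous, skip
        node = root
        for base in v:
            node = node.setdefault(base, {})
        node[_SAMPLE] = next(iter(samples))
    return root


def demultiplex(read, trie):
    """Return (sample, index_length) for the longest matching prefix, or None."""
    node, best = trie, None
    for i, base in enumerate(read):
        node = node.get(base)
        if node is None:
            break
        if _SAMPLE in node:
            best = (node[_SAMPLE], i + 1)
    return best


if __name__ == "__main__":
    trie = build_trie({"s1": "ACGT", "s2": "TTGCA"}, max_mismatches=1)
    for read in ("ACGTGGGG", "ACCTGGGG", "TTGAAGGG", "GGGGGGGG"):
        print(read, demultiplex(read, trie))
```

Pre-expanding variants trades memory for lookup speed: each read is resolved in a single walk down the trie, with no per-index comparison at query time.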
RNA interference-based therapeutics: new strategies to fight infectious disease.
López-Fraga, M; Wright, N; Jiménez, A
2008-12-01
For many years, there has been an ongoing search for new compounds that can selectively alter gene expression as a new way to treat human disease by addressing targets that are otherwise "undruggable" with traditional pharmaceutical approaches involving small molecules or proteins. RNA interference (RNAi) strategies have raised a lot of attention and several compounds are currently being tested in clinical trials. Viruses are the obvious target for RNAi-therapy, as most are difficult to treat with conventional drugs, they become rapidly resistant to drug treatment and their genes differ substantially from human genes, minimizing side effects. Antisense strategy offers very high target specificity, i.e., any viral sequence could potentially be targeted using the complementary oligonucleotide sequence. Consequently, new antisense-based therapeutics have the potential to lead a revolution in the anti-infective drug development field. Additionally, the relatively short turnaround for efficacy testing of potential RNAi molecules and that any pathogen is theoretically amenable to rapid targeting, make them invaluable tools for treating a wide range of diseases. This review will focus on some of the current efforts to treat infectious disease with RNAi-based therapies and some of the obstacles that have appeared on the road to successful clinical intervention.
Application of Pyrosequencing® in Food Biodefense.
Amoako, Kingsley Kwaku
2015-01-01
The perpetration of a bioterrorism attack poses a significant risk for public health with potential socioeconomic consequences. It is imperative that we possess reliable assays for the rapid and accurate identification of biothreat agents to make rapid risk-informed decisions on emergency response. The development of advanced methodologies for the detection of biothreat agents has been evolving rapidly since the release of the anthrax spores in the mail in 2001, and recent advances in detection and identification techniques could prove to be an essential component in the defense against biological attacks. Sequence-based approaches such as Pyrosequencing®, which has the capability to determine short DNA stretches in real time using biotinylated PCR amplicons, have potential biodefense applications. Using markers from the virulence plasmids and chromosomal regions, my laboratory has demonstrated the power of this technology in the rapid, specific, and sensitive detection of B. anthracis spores and Yersinia pestis in food. These are the first applications for the detection of the two organisms in food. Furthermore, my lab has developed a rapid assay to characterize the antimicrobial resistance (AMR) gene profiles for Y. pestis using Pyrosequencing. Pyrosequencing is completed in about 60 min (following PCR amplification) and yields accurate and reliable results with an added layer of confidence, thus enabling rapid risk-informed decisions to be made. A typical run yields 40-84 bp reads with 94-100% identity to the expected sequence. It also provides a rapid method for determining the AMR profile as compared to the conventional plate method which takes several days. The method described is proposed as a novel detection system for potential application in food biodefense.
Rapid Detection of Powassan Virus in a Patient With Encephalitis by Metagenomic Sequencing
Piantadosi, Anne; Kanjilal, Sanjat; Ganesh, Vijay; Khanna, Arjun; Hyle, Emily P; Rosand, Jonathan; Bold, Tyler; Metsky, Hayden C; Lemieux, Jacob; Leone, Michael J; Freimark, Lisa; Matranga, Christian B; Adams, Gordon; McGrath, Graham; Zamirpour, Siavash; Telford, Sam; Rosenberg, Eric; Cho, Tracey; Frosch, Matthew P; Goldberg, Marcia B; Mukerji, Shibani S; Sabeti, Pardis C
2018-01-01
Abstract We describe a patient with severe and progressive encephalitis of unknown etiology. We performed rapid metagenomic sequencing from cerebrospinal fluid and identified Powassan virus, an emerging tick-borne flavivirus that has been increasingly detected in the United States. PMID:29020227
A Rapid Method to Test for Chloroplast DNA Involvement in Atrazine Resistance
McNally, Sheila; Bettini, Priscilla; Sevignac, Mireille; Darmency, Henry; Gasquez, Jacques; Dron, Michel
1987-01-01
A point mutation in the chloroplast psbA gene at codon 264 resulting in an amino acid substitution (ser-gly) manifests itself as atrazine resistance in all recognized weed species studied to date. The single base substitution overlaps a highly conserved Mae1 restriction site which is present in susceptible but not in resistant plants. This restriction enzyme, recently commercialized, has been used to show that it is now possible to discriminate rapidly between the two biotypes without the need for DNA sequencing. Images Fig. 1 PMID:16665229
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Proteome studies of filamentous fungi.
Baker, Scott E; Panisko, Ellen A
2011-01-01
The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, nongel-based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of variations on the general methods and technologies for identifying peptides in a given sample. We present a method that can serve as a "baseline" for proteomic studies of fungi.
Guo, Qian; Yu, Yan; Zhu, Yan Ling; Zhao, Xiu Qin; Liu, Zhi Guang; Zhang, Yuan Yuan; Li, Gui Lian; Wei, Jian Hao; Wu, Yi Mou; Wan, Kang Lin
2015-01-01
A PCR-reverse dot blot hybridization (RDBH) assay was developed for rapid detection of rpoB gene mutations in 'hot mutation region' of Mycobacterium tuberculosis (M. tuberculosis). 12 oligonucleotide probes based on the wild-type and mutant genotype rpoB sequences of M. tuberculosis were designed to screen the most frequent wild-type and mutant genotypes for diagnosing RIF resistance. 300 M. tuberculosis clinical isolates were detected by RDBH, conventional drug-susceptibility testing (DST) and DNA sequencing to evaluate the RDBH assay. The sensitivity and specificity of the RDBH assay were 91.2% (165/181) and 98.3% (117/119), respectively, as compared to DST. When compared with DNA sequencing, the accuracy, positive predictive value (PPV) and negative predictive value (NPV) of the RDBH assay were 97.7% (293/300), 98.2% (164/167), and 97.0% (129/133), respectively. Furthermore, the results indicated that the most common mutations were in codons 531 (48.6%), 526 (25.4%), 516 (8.8%), and 511 (6.6%), and the combinative mutation rate was 15 (8.3%). One and two strains of insertion and deletion were found among all strains, respectively. Our findings demonstrate that the RDBH assay is a rapid, simple and sensitive method for diagnosing RIF-resistant tuberculosis. Copyright © 2015 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
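The percentages reported above follow directly from the counts given in parentheses. The short sketch below recomputes them; the function names are assumptions, and the confusion-matrix cells are inferred from the quoted fractions (for example, a PPV of 164/167 implies 164 true positives and 3 false positives against sequencing).

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic metrics, returned as percentages."""
    return {
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
        "accuracy": pct(tp + tn, tp + fp + tn + fn),
    }


# RDBH vs. DST: sensitivity 165/181, specificity 117/119
# (cells inferred from the fractions quoted in the abstract).
vs_dst = diagnostic_metrics(tp=165, fp=2, tn=117, fn=16)

# RDBH vs. DNA sequencing: PPV 164/167, NPV 129/133, accuracy 293/300.
vs_seq = diagnostic_metrics(tp=164, fp=3, tn=129, fn=4)

if __name__ == "__main__":
    print("vs DST:       ", vs_dst["sensitivity"], vs_dst["specificity"])
    print("vs sequencing:", vs_seq["accuracy"], vs_seq["ppv"], vs_seq["npv"])
```

Note that the accuracy numerator is the sum of the true positives and true negatives (164 + 129 = 293), which matches the 293/300 quoted in the abstract.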
Oakley, Brian B; Line, J Eric; Berrang, Mark E; Johnson, Jessica M; Buhr, R Jeff; Cox, Nelson A; Hiett, Kelli L; Seal, Bruce S
2012-02-01
Although Campylobacter is an important food-borne human pathogen, there remains a lack of molecular diagnostic assays that are simple to use, cost-effective, and provide rapid results in research, clinical, or regulatory laboratories. Of the numerous Campylobacter assays that do exist, to our knowledge none has been empirically tested for specificity using high-throughput sequencing. Here we demonstrate the power of next-generation sequencing to determine the specificity of a widely cited Campylobacter-specific polymerase chain reaction (PCR) assay and describe a rapid method for direct cell suspension PCR to quickly and easily screen samples for Campylobacter. We present a specific protocol which eliminates the need for time-consuming and expensive genomic DNA extractions and, using a high-processivity polymerase, demonstrate conclusive screening of samples in <1 h. Pyrosequencing results show the assay to be extremely (>99%) sensitive, and spike-back experiments demonstrated a detection threshold of <10^2 CFU mL^-1. Additionally, we present 2 newly designed broad-range bacterial primer sets targeting the 23S rRNA gene that have wide applicability as internal amplification controls. Empirical testing of putative taxon-specific assays using high-throughput sequencing is an important validation step that is now financially feasible for research, regulatory, or clinical applications. Published by Elsevier Inc.
2010-08-25
or intentional genetic modifications that circumvent the targets of the detection assays or in the case of a biological attack using an antibiotic ...genetic changes conferring antibiotic resistance can be deciphered rapidly and accurately using WGS. We demonstrate the utility of Roche 454...Rapid Identification of Genetic Modifications in Bacillus anthracis Using Whole Genome Draft Sequences Generated by 454 Pyrosequencing Peter E. Chen1
Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Lina
2013-02-01
Cronobacter sakazakii and its phylogenetically closest species are considered to be opportunistic pathogens associated with food-borne disease in neonates and infants. Neither phenotypic nor genotypic (16S ribosomal DNA sequence analysis) techniques can provide sufficient resolution for accurate and rapid identification of these species. The objective of this study was to develop species-specific PCR based on the gyrB gene sequence for direct species identification of C. sakazakii and Cronobacter dublinensis within the C. sakazakii group. Two pairs of species-specific primers were designed and used to specifically identify C. sakazakii and C. dublinensis, but none of the other C. sakazakii group strains. Our data indicate that the novel species-specific primers could be used to rapidly and accurately identify C. sakazakii and C. dublinensis within the C. sakazakii group by PCR-based assays. Copyright © 2012 Elsevier Ltd. All rights reserved.
Barrett, Craig F; Wicke, Susann; Sass, Chodon
2018-05-01
Heterotrophic plants provide excellent opportunities to study the effects of altered selective regimes on genome evolution. Plastid genome (plastome) studies in heterotrophic plants are often based on one or a few highly divergent species or sequences as representatives of an entire lineage, thus missing important evolutionary-transitory events. Here, we present the first infraspecific analysis of plastome evolution in any heterotrophic plant. By combining genome skimming and targeted sequence capture, we address hypotheses on the degree and rate of plastome degradation in a complex of leafless orchids (Corallorhiza striata) across its geographic range. Plastomes provide strong support for relationships and evidence of reciprocal monophyly between C. involuta and the endangered C. bentleyi. Plastome degradation is extensive, occurring rapidly over a few million years, with evidence of differing rates of genomic change among the two principal clades of the complex. Genome skimming and targeted sequence capture differ widely in coverage depth overall, with depth in targeted sequence capture datasets varying immensely across the plastome as a function of GC content. These findings will help to fill a knowledge gap in models of heterotrophic plastid genome evolution, and have implications for future studies in heterotrophs. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Kaleta, Pawel; Callanan, Michael J; O'Callaghan, John; Fitzgerald, Gerald F; Beresford, Thomas P; Ross, R Paul
2009-10-01
The species Lactobacillus helveticus is a commonly used thermophilic starter and/or adjunct culture for Swiss and Cheddar cheese manufacture. Its use is normally associated with flavour improvement which is known to be associated with culture traits such as rapid autolysis and high proteolytic activity. The genome of the commercial strain, DPC4571, was recently sequenced and found to be rich in IS sequences in terms of both abundance (213 intact) and diversity (21 types). Given this unique diversity for a lactic acid bacterium, we investigated whether PCR-based IS fingerprinting could be used as a discriminatory tool to distinguish between different strains of Lb. helveticus. A set of ten primers targeting five of the most numerous groups (ISL1201, ISLhe65, ISLhe2, ISLhe15 and ISL2) of IS elements was designed. Multiplex-PCR with all primers resulted in 1-12 discrete amplicons for each strain tested. The resultant fingerprints (in the 0.5 kb-3 kb range) were found to be strain specific and reproducible. This approach thus provides a valuable method to distinguish between Lb. helveticus strains while giving some indication of the relative abundance of IS sequences in each strain.
Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls
Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.
2013-01-01
As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method – Tango’s statistic – to genomic sequence data. An advantage of Tango’s method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango’s statistic, which we call the “Kernel Distance” statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff’s scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950
Rapid rate of control-region evolution in Pacific butterflyfishes (Chaetodontidae).
McMillan, W O; Palumbi, S R
1997-11-01
Sequence differences in the tRNA-proline (tRNApro) end of the mitochondrial control-region of three species of Pacific butterflyfishes accumulated 33-43 times more rapidly than did changes within the mitochondrial cytochrome b gene (cytb). Rapid evolution in this region was accompanied by strong transition/transversion bias and large variation in the probability of a DNA substitution among sites. These substitution constraints placed an absolute ceiling on the magnitude of sequence divergence that could be detected between individuals. This divergence "ceiling" was reached rapidly and led to a decay in the relative rate of control-region/cytb evolution. A high rate of evolution in this section of the control-region of butterflyfishes stands in marked contrast to the patterns reported in some other fish lineages. Although the mechanism underlying rate variation remains unclear, all taxa with rapid evolution in the 5'-end of the control-region showed extreme transition biases. By contrast, in taxa with slower control-region evolution, transitions accumulated at nearly the same rate as transversions. More information is needed to understand the relationship between nucleotide bias and the rate of evolution in the 5'-end of the control-region. Despite strong constraints on sequence change, phylogenetic information was preserved in the group of recently differentiated species and supported the clustering of sequences into three major mtDNA groupings. Within these groups, very similar control-region sequences were widely distributed across the Pacific Ocean and were shared between recognized species, indicating a lack of mitochondrial sequence monophyly among species.
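The transition/transversion bias discussed in the abstract above can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and simply tallies observed differences; the function name `count_ts_tv` is hypothetical, and the sketch makes no correction for multiple substitutions at a site, so it is not the authors' analysis.

```python
# Minimal sketch (assumption: sequences are already aligned, same length).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned DNA strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps and ambiguity codes.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


ts, tv = count_ts_tv("ACGTACGT", "GCGTATGA")
print(ts, tv)
```

A strong transition bias shows up as a transition count well above the transversion count; saturation of the kind described in the abstract would make such raw counts underestimate the true number of changes.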
Chen, Zhuo; Xu, Shixia; Zhou, Kaiya; Yang, Guang
2011-10-27
A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Monodontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Monodontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving cetacean phylogenetic relationships in the future.
2011-01-01
Background: A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. Results: An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Monodontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Monodontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Conclusions: Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving cetacean phylogenetic relationships in the future. PMID:22029548
Arkas: Rapid reproducible RNAseq analysis
Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan
2017-01-01
The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer cloud-scale RNAseq pipelines, Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and calculates the differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from the SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134
Han, Yuepeng; Chagné, David; Gasic, Ksenija; Rikkerink, Erik H A; Beever, Jonathan E; Gardiner, Susan E; Korban, Schuyler S
2009-03-01
A genome-wide BAC physical map of the apple, Malus x domestica Borkh., has been recently developed. Here, we report on integrating the physical and genetic maps of the apple using a SNP-based approach in conjunction with bin mapping. Briefly, BAC clones located at ends of BAC contigs were selected, and sequenced at both ends. The BAC end sequences (BESs) were used to identify candidate SNPs. Subsequently, these candidate SNPs were genetically mapped using a bin mapping strategy for the purpose of mapping the physical onto the genetic map. Using this approach, 52 (23%) out of 228 BESs tested were successfully exploited to develop SNPs. These SNPs anchored 51 contigs, spanning approximately 37 Mb in cumulative physical length, onto 14 linkage groups. The reliability of the integration of the physical and genetic maps using this SNP-based strategy is described, and the results confirm the feasibility of this approach to construct integrated physical and genetic maps for apple.
Use of the Minion nanopore sequencer for rapid sequencing of avian influenza virus isolates
USDA-ARS's Scientific Manuscript database
A relatively new sequencing technology, the MinION nanopore sequencer, provides a platform that is smaller, faster, and cheaper than existing Next Generation Sequence (NGS) technologies. The MinION sequences individual strands of DNA and can produce millions of sequencing reads. The cost of the s...
Fluorogenic DNA Sequencing in PDMS Microreactors
Sims, Peter A.; Greenleaf, William J.; Duan, Haifeng; Xie, X. Sunney
2012-01-01
We have developed a multiplex sequencing-by-synthesis method combining terminal-phosphate labeled fluorogenic nucleotides (TPLFNs) and resealable microreactors. In the presence of phosphatase, the incorporation of a non-fluorescent TPLFN into a DNA primer by DNA polymerase results in a fluorophore. We immobilize DNA templates within polydimethylsiloxane (PDMS) microreactors, sequentially introduce one of the four identically labeled TPLFNs, seal the microreactors, allow template-directed TPLFN incorporation, and measure the signal from the fluorophores trapped in the microreactors. This workflow allows sequencing in a manner akin to pyrosequencing but without constant monitoring of each microreactor. With cycle times of <10 minutes, we demonstrate 30 base reads with ∼99% raw accuracy. “Fluorogenic pyrosequencing” combines benefits of pyrosequencing, such as rapid turn-around, native DNA generation, and single-color detection, with benefits of fluorescence-based approaches, such as highly sensitive detection and simple parallelization. PMID:21666670
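The flow-based readout described in the abstract above, where one nucleotide type is introduced per cycle and the trapped fluorescence is measured, can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: the flow order `"TACG"`, the function names, and the idea that each signal is normalised so that 1.0 corresponds to a single incorporation (with homopolymer runs giving proportionally larger signals) are all assumptions made for the example.

```python
# Minimal sketch of flow-based base calling (all names are hypothetical).

def simulate_flows(read, flow_order="TACG", n_flows=8):
    """Ideal signals for synthesising `read`: one value per nucleotide flow."""
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        # A flow extends the strand across the whole homopolymer run.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
    return signals


def decode_flows(signals, flow_order="TACG"):
    """Turn (possibly noisy) normalised signals back into a base sequence."""
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        bases.append(base * max(0, round(signal)))
    return "".join(bases)


ideal = simulate_flows("TTAGGC", n_flows=7)
noisy = [2.1, 0.9, 0.1, 1.8, 0.0, 0.2, 1.1]
print(ideal, decode_flows(noisy))
```

Rounding each signal to the nearest integer is the simplest possible caller; long homopolymers are where such a scheme would be expected to lose accuracy first.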
Liu, Tianyu; Liang, Yinan; Zhong, Xiuqin; Wang, Ning; Hu, Dandan; Zhou, Xuan; Gu, Xiaobin; Peng, Xuerong; Yang, Guangyou
2014-01-01
Dirofilaria immitis (heartworm) is the causative agent of an important zoonotic disease that is spread by mosquitoes. In this study, molecular and phylogenetic characterization of D. immitis was performed based on complete ND1 and 16S rDNA gene sequences, which provided the foundation for more advanced molecular diagnosis, prevention, and control of heartworm diseases. The mutation rate and evolutionary divergence in adult heartworm samples from seven dogs in western China were analyzed to obtain information on genetic diversity and variability. Phylogenetic relationships were inferred using both maximum parsimony (MP) and Bayes methods based on the complete gene sequences. The results suggest that D. immitis formed an independent monophyletic group in which the 16S rDNA gene has mutated more rapidly than has ND1. PMID:24639299
NASA Astrophysics Data System (ADS)
Liu, Feng-xiang; Liu, Rang-su; Hou, Zhao-yang; Liu, Hai-Rong; Tian, Ze-an; Zhou, Li-li
2009-02-01
The rapid solidification processes of Al50Mg50 liquid alloy consisting of 50,000 atoms have been simulated using the molecular dynamics method based on the effective pair potential derived from pseudopotential theory. The formation mechanisms of atomic clusters during the rapid solidification processes have been investigated using a new cluster description method, the cluster-type index method (CTIM). The simulated partial structure factors are in good agreement with the experimental results. An Al-Mg amorphous structure characterized by Al-centered icosahedral topological short-range order (SRO) is found to form during the rapid solidification processes. The icosahedral cluster plays a key role in the microstructure transition. It is also found that the size distribution of various clusters in the system presents a magic number sequence of 13, 19, 23, 25, 29, 31, 33, 37, …. The magic clusters are more stable and mainly correspond to the incompact arrangements of linked icosahedra in the form of rings, chains or dendrites. Each magic number point corresponds to one particular combining form of icosahedra. This magic number sequence is different from that generated in the solidification structure of liquid Al and those obtained by methods of gaseous deposition and ionic spray, etc.
Diversification of Rosaceae since the Late Cretaceous based on plastid phylogenomics.
Zhang, Shu-Dong; Jin, Jian-Jun; Chen, Si-Yun; Chase, Mark W; Soltis, Douglas E; Li, Hong-Tao; Yang, Jun-Bo; Li, De-Zhu; Yi, Ting-Shuang
2017-05-01
Phylogenetic relationships in Rosaceae have long been problematic because of frequent hybridisation, apomixis and presumed rapid radiation, and their historical diversification has not been clarified. With 87 genera representing all subfamilies and tribes of Rosaceae and six of the other eight families of Rosales (outgroups), we analysed 130 newly sequenced plastomes together with 12 from GenBank in an attempt to reconstruct deep relationships and reveal temporal diversification of this family. Our results highlight the importance of improving sequence alignment and the use of appropriate substitution models in plastid phylogenomics. Three subfamilies and 16 tribes (as previously delimited) were strongly supported as monophyletic, and their relationships were fully resolved and strongly supported at most nodes. Rosaceae were estimated to have originated during the Late Cretaceous with evidence for rapid diversification events during several geological periods. The major lineages rapidly diversified in warm and wet habitats during the Late Cretaceous, and the rapid diversification of genera from the early Oligocene onwards occurred in colder and drier environments. Plastid phylogenomics offers new and important insights into deep phylogenetic relationships and the diversification history of Rosaceae. The robust phylogenetic backbone and time estimates we provide establish a framework for future comparative studies on rosaceous evolution. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M
2012-01-01
Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype blasted query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, blast score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses.
Wood, Natasha; Bhattacharya, Tanmoy; Keele, Brandon F; Giorgi, Elena; Liu, Michael; Gaschen, Brian; Daniels, Marcus; Ferrari, Guido; Haynes, Barton F; McMichael, Andrew; Shaw, George M; Hahn, Beatrice H; Korber, Bette; Seoighe, Cathal
2009-05-01
The pattern of viral diversification in newly infected individuals provides information about the host environment and immune responses typically experienced by the newly transmitted virus. For example, sites that tend to evolve rapidly across multiple early-infection patients could be involved in enabling escape from common early immune responses, could represent adaptation for rapid growth in a newly infected host, or could represent reversion from less fit forms of the virus that were selected for immune escape in previous hosts. Here we investigated the diversification of HIV-1 env coding sequences in 81 very early B subtype infections previously shown to have resulted from transmission or expansion of single viruses (n = 78) or two closely related viruses (n = 3). In these cases, the sequence of the infecting virus can be estimated accurately, enabling inference of both the direction of substitutions as well as distinction between insertion and deletion events. By integrating information across multiple acutely infected hosts, we find evidence of adaptive evolution of HIV-1 env and identify a subset of codon sites that diversified more rapidly than can be explained by a model of neutral evolution. Of 24 such rapidly diversifying sites, 14 were either i) clustered and embedded in CTL epitopes that were verified experimentally or predicted based on the individual's HLA or ii) in a nucleotide context indicative of APOBEC-mediated G-to-A substitutions, despite having excluded heavily hypermutated sequences prior to the analysis. In several cases, a rapidly evolving site was embedded both in an APOBEC motif and in a CTL epitope, suggesting that APOBEC may facilitate early immune escape. Ten rapidly diversifying sites could not be explained by CTL escape or APOBEC hypermutation, including the most frequently mutated site, in the fusion peptide of gp41. 
We also examined the distribution, extent, and sequence context of insertions and deletions, and we provide evidence that the length variation seen in hypervariable loop regions of the envelope glycoprotein is a consequence of selection and not of mutational hotspots. Our results provide a detailed view of the process of diversification of HIV-1 following transmission, highlighting the role of CTL escape and hypermutation in shaping viral evolution during the establishment of new infections.
Wood, Natasha; Bhattacharya, Tanmoy; Keele, Brandon F.; Giorgi, Elena; Liu, Michael; Gaschen, Brian; Daniels, Marcus; Ferrari, Guido; Haynes, Barton F.; McMichael, Andrew; Shaw, George M.; Hahn, Beatrice H.; Korber, Bette; Seoighe, Cathal
2009-01-01
The pattern of viral diversification in newly infected individuals provides information about the host environment and immune responses typically experienced by the newly transmitted virus. For example, sites that tend to evolve rapidly across multiple early-infection patients could be involved in enabling escape from common early immune responses, could represent adaptation for rapid growth in a newly infected host, or could represent reversion from less fit forms of the virus that were selected for immune escape in previous hosts. Here we investigated the diversification of HIV-1 env coding sequences in 81 very early B subtype infections previously shown to have resulted from transmission or expansion of single viruses (n = 78) or two closely related viruses (n = 3). In these cases, the sequence of the infecting virus can be estimated accurately, enabling inference of both the direction of substitutions as well as distinction between insertion and deletion events. By integrating information across multiple acutely infected hosts, we find evidence of adaptive evolution of HIV-1 env and identify a subset of codon sites that diversified more rapidly than can be explained by a model of neutral evolution. Of 24 such rapidly diversifying sites, 14 were either i) clustered and embedded in CTL epitopes that were verified experimentally or predicted based on the individual's HLA or ii) in a nucleotide context indicative of APOBEC-mediated G-to-A substitutions, despite having excluded heavily hypermutated sequences prior to the analysis. In several cases, a rapidly evolving site was embedded both in an APOBEC motif and in a CTL epitope, suggesting that APOBEC may facilitate early immune escape. Ten rapidly diversifying sites could not be explained by CTL escape or APOBEC hypermutation, including the most frequently mutated site, in the fusion peptide of gp41. 
We also examined the distribution, extent, and sequence context of insertions and deletions, and we provide evidence that the length variation seen in hypervariable loop regions of the envelope glycoprotein is a consequence of selection and not of mutational hotspots. Our results provide a detailed view of the process of diversification of HIV-1 following transmission, highlighting the role of CTL escape and hypermutation in shaping viral evolution during the establishment of new infections. PMID:19424423
Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio
2017-07-15
With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies is currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets. Copyright © 2016 Elsevier B.V. All rights reserved.
Chen, Y. C.; Eisner, J. D.; Kattar, M. M.; Rassoulian-Barrett, S. L.; LaFe, K.; Yarfitz, S. L.; Limaye, A. P.; Cookson, B. T.
2000-01-01
Identification of medically relevant yeasts can be time-consuming and inaccurate with current methods. We evaluated PCR-based detection of sequence polymorphisms in the internal transcribed spacer 2 (ITS2) region of the rRNA genes as a means of fungal identification. Clinical isolates (401), reference strains (6), and type strains (27), representing 34 species of yeasts were examined. The length of PCR-amplified ITS2 region DNA was determined with single-base precision in less than 30 min by using automated capillary electrophoresis. Unique, species-specific PCR products ranging from 237 to 429 bp were obtained from 92% of the clinical isolates. The remaining 8%, divided into groups with ITS2 regions which differed by ≤2 bp in mean length, all contained species-specific DNA sequences easily distinguishable by restriction enzyme analysis. These data, and the specificity of length polymorphisms for identifying yeasts, were confirmed by DNA sequence analysis of the ITS2 region from 93 isolates. Phenotypic and ITS2-based identification was concordant for 427 of 434 yeast isolates examined using sequence identity of ≥99%. Seven clinical isolates contained ITS2 sequences that did not agree with their phenotypic identification, and ITS2-based phylogenetic analyses indicate the possibility of new or clinically unusual species in the Rhodotorula and Candida genera. This work establishes an initial database, validated with over 400 clinical isolates, of ITS2 length and sequence polymorphisms for 34 species of yeasts. We conclude that size and restriction analysis of PCR-amplified ITS2 region DNA is a rapid and reliable method to identify clinically significant yeasts, including potentially new or emerging pathogenic species. PMID:10834993
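The length-based identification step described in the abstract above can be sketched as a simple lookup. This is a hedged illustration only: the reference table, its species labels and fragment lengths are hypothetical placeholders (not values from the study), and the 2 bp tolerance is an assumption taken from the abstract's remark that groups differing by ≤2 bp in mean length needed restriction analysis to be separated.

```python
# Minimal sketch of ITS2 length lookup (table values are hypothetical).

def candidates_by_length(measured_bp, reference_lengths, tolerance_bp=2):
    """Species whose reference ITS2 length is within the tolerance."""
    return sorted(
        name
        for name, length in reference_lengths.items()
        if abs(length - measured_bp) <= tolerance_bp
    )


def identify(measured_bp, reference_lengths, tolerance_bp=2):
    """Classify a measured amplicon length against the reference table."""
    hits = candidates_by_length(measured_bp, reference_lengths, tolerance_bp)
    if not hits:
        return "unknown", hits
    if len(hits) == 1:
        return "identified", hits
    # Several species share this length class: length alone cannot decide.
    return "needs_restriction_analysis", hits


# Hypothetical reference lengths in bp, for illustration only.
reference = {"species_A": 237, "species_B": 300, "species_C": 301, "species_D": 429}

print(identify(237, reference))
print(identify(300, reference))
print(identify(350, reference))
```

Returning the ambiguous candidates rather than picking one mirrors the two-stage workflow in the abstract, where a second assay resolves species that fall into the same length class.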
Lu, Bingxin; Leong, Hon Wai
2016-02-01
Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaptation and enable antibiotic resistance. Many methods have been proposed to predict GIs, but most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GIs based solely on sequences of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. Besides, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GIs in newly sequenced genomes.
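The idea of detecting compositionally atypical regions from k-mer content, as described in the abstract above, can be illustrated with a small sketch. Note the substitution: instead of the one-class SVM used by GI-SVM, this example scores each window by a plain L1 distance between its k-mer frequencies and the genome-wide frequencies. The function names, window size, step and value of k are all assumptions chosen for the toy example.

```python
from collections import Counter

# Minimal sketch: k-mer composition bias per window (not GI-SVM itself).

def kmer_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer in `seq`."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def composition_distance(freq_a, freq_b):
    """L1 distance between two k-mer frequency profiles."""
    keys = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(x, 0.0) - freq_b.get(x, 0.0)) for x in keys)


def score_windows(genome, k=2, window=20, step=20):
    """Return (start, score) pairs; higher score = more atypical window."""
    background = kmer_frequencies(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        profile = kmer_frequencies(genome[start:start + window], k)
        scores.append((start, composition_distance(profile, background)))
    return scores


# Toy genome: an AT-rich background with a GC-rich insert at position 100.
toy = "AT" * 50 + "GC" * 10 + "AT" * 50
best_start, best_score = max(score_windows(toy), key=lambda s: s[1])
print(best_start)
```

A learned one-class boundary, as in the published method, would replace the fixed distance score here; the feature extraction step is the part this sketch has in common with it.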
Choi, Hong-Kyu; Kim, Dongjin; Uhm, Taesik; Limpens, Eric; Lim, Hyunju; Mun, Jeong-Hwan; Kalo, Peter; Penmetsa, R Varma; Seres, Andrea; Kulikova, Olga; Roe, Bruce A; Bisseling, Ton; Kiss, Gyorgy B; Cook, Douglas R
2004-01-01
A core genetic map of the legume Medicago truncatula has been established by analyzing the segregation of 288 sequence-characterized genetic markers in an F(2) population composed of 93 individuals. These molecular markers correspond to 141 ESTs, 80 BAC end sequence tags, and 67 resistance gene analogs, covering 513 cM. In the case of EST-based markers we used an intron-targeted marker strategy with primers designed to anneal in conserved exon regions and to amplify across intron regions. Polymorphisms were significantly more frequent in intron vs. exon regions, thus providing an efficient mechanism to map transcribed genes. Genetic and cytogenetic analysis produced eight well-resolved linkage groups, which have been previously correlated with eight chromosomes by means of FISH with mapped BAC clones. We anticipated that mapping of conserved coding regions would have utility for comparative mapping among legumes; thus 60 of the EST-based primer pairs were designed to amplify orthologous sequences across a range of legume species. As an initial test of this strategy, we used primers designed against M. truncatula exon sequences to rapidly map genes in M. sativa. The resulting comparative map, which includes 68 bridging markers, indicates that the two Medicago genomes are highly similar and establishes the basis for a Medicago composite map. PMID:15082563
Muangkram, Yuttamol; Wajjwalku, Worawidh; Amano, Akira; Sukmak, Manakorn
2018-01-01
We present a technique for species identification using a short amplicon of the mitochondrial cytochrome b gene sequence. Two faecal samples and one single hair sample of the Asian tapir were tested using the new cytochrome b primers. The results showed a high sequence similarity with the mainland Asian tapir group. Comparative sequence analysis of the reserved wild mammals in Thailand and other endangered mammal species from Southeast Asia comprehensively verified the potential of our novel primers. The forward and reverse primers showed average sequence identities of 94.2% and 93.2%, respectively, across 77 species sequences, and the overall mean distance was 35.9%. The technique could provide a rapid, simple, and reliable tool for species confirmation. In particular, it could identify problematic biological specimens containing little DNA, such as those from illegal products, and assist with wildlife crime investigation of threatened species and related forensic casework.
Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal
2017-01-01
During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
Garcia-Reyero, Natàlia; Griffitt, Robert J.; Liu, Li; Kroll, Kevin J.; Farmerie, William G.; Barber, David S.; Denslow, Nancy D.
2009-01-01
A novel custom microarray for largemouth bass (Micropterus salmoides) was designed with sequences obtained from a normalized cDNA library using the 454 Life Sciences GS-20 pyrosequencer. This approach yielded in excess of 58 million bases of high-quality sequence. The sequence information was combined with 2,616 reads obtained by traditional suppressive subtractive hybridizations to derive a total of 31,391 unique sequences. Annotation and coding sequences were predicted for these transcripts where possible. 16,350 annotated transcripts were selected as target sequences for the design of the custom largemouth bass oligonucleotide microarray. The microarray was validated by examining the transcriptomic response in male largemouth bass exposed to 17β-œstradiol. Transcriptomic responses were assessed in liver and gonad, and indicated gene expression profiles typical of exposure to œstradiol. The results demonstrate the potential to rapidly create the tools necessary to assess large scale transcriptional responses in non-model species, paving the way for expanded impact of toxicogenomics in ecotoxicology. PMID:19936325
Potenza, L; Cafiero, M A; Camarda, A; La Salandra, G; Cucchiarini, L; Dachà, M
2009-10-01
In the present work mites previously identified as Dermanyssus gallinae De Geer (Acari, Mesostigmata) using morphological keys were investigated by molecular tools. The complete internal transcribed spacer 1 (ITS1), 5.8S ribosomal DNA, and ITS2 region of the ribosomal DNA from mites were amplified and sequenced to examine the level of sequence variations and to explore the feasibility of using this region in the identification of this mite. Conserved primers located at the 3'end of 18S and at the 5'start of 28S rRNA genes were used first, and amplified fragments were sequenced. Sequence analyses showed no variation in 5.8S and ITS2 region while slight intraspecific variations involving substitutions as well as deletions concentrated in the ITS1 region. Based on the sequence analyses a nested PCR of the ITS2 region followed by RFLP analyses has been set up in the attempt to provide a rapid molecular diagnostic tool of D. gallinae.
Predicting the host of influenza viruses based on the word vector.
Xu, Beibei; Tan, Zhiying; Li, Kenli; Jiang, Taijiao; Peng, Yousong
2017-01-01
Newly emerging influenza viruses continue to threaten public health. A rapid determination of the host range of newly discovered influenza viruses would assist in early assessment of their risk. Here, we attempted to predict the host of influenza viruses using the Support Vector Machine (SVM) classifier based on the word vector, a new representation and feature extraction method for biological sequences. The results show that the length of the word within the word vector, the sequence type (DNA or protein) and the species from which the sequences were derived for generating the word vector all influence the performance of models in predicting the host of influenza viruses. In nearly all cases, the models built on the surface proteins hemagglutinin (HA) and neuraminidase (NA) (or their genes) produced better results than internal influenza proteins (or their genes). The best performance was achieved when the model was built on the HA gene based on word vectors (words of three-letters long) generated from DNA sequences of the influenza virus. This results in accuracies of 99.7% for avian, 96.9% for human and 90.6% for swine influenza viruses. Compared to the method of sequence homology best-hit searches using the Basic Local Alignment Search Tool (BLAST), the word vector-based models still need further improvements in predicting the host of influenza A viruses.
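The "words of three-letters long" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes overlapping words (the abstract does not say whether words overlap), and the function names `kmer_words` and `word_counts` are hypothetical; it covers only the tokenization step, not the word-vector training or the SVM classifier.

```python
from collections import Counter


def kmer_words(seq: str, k: int = 3) -> list:
    """Split a sequence into overlapping words of length k (an assumption)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def word_counts(seq: str, k: int = 3) -> Counter:
    """Count how often each k-letter word occurs in the sequence."""
    return Counter(kmer_words(seq, k))


if __name__ == "__main__":
    print(kmer_words("ATGCA"))
    print(word_counts("ATGATG"))
```

The resulting word list is the kind of input a word-embedding model would consume; the choice of k is one of the parameters the abstract reports as influencing performance.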
A novel brain-computer interface based on the rapid serial visual presentation paradigm.
Acqualagna, Laura; Treder, Matthias Sebastian; Schreuder, Martijn; Blankertz, Benjamin
2010-01-01
Most present-day visual brain computer interfaces (BCIs) suffer from the fact that they rely on eye movements, are slow-paced, or feature a small vocabulary. As a potential remedy, we explored a novel BCI paradigm consisting of a central rapid serial visual presentation (RSVP) of the stimuli. It has a large vocabulary and realizes a BCI system based on covert non-spatial selective visual attention. In an offline study, eight participants were presented sequences of rapid bursts of symbols. Two different speeds and two different color conditions were investigated. Robust early visual and P300 components were elicited time-locked to the presentation of the target. Offline classification revealed a mean accuracy of up to 90% for selecting the correct symbol out of 30 possibilities. The results suggest that RSVP-BCI is a promising new paradigm, also for patients with oculomotor impairments.
Identification of Microorganisms by Modern Analytical Techniques.
Buszewski, Bogusław; Rogowska, Agnieszka; Pomastowski, Paweł; Złoch, Michał; Railean-Plugaru, Viorica
2017-11-01
Rapid detection and identification of microorganisms is a challenging and important aspect in a wide range of fields, from medical to industrial, affecting human lives. Unfortunately, classical methods of microorganism identification are based on time-consuming and labor-intensive approaches. Screening techniques require the rapid and cheap grouping of bacterial isolates; however, modern bioanalytics demand comprehensive bacterial studies at a molecular level. Modern approaches for the rapid identification of bacteria use molecular techniques, such as 16S ribosomal RNA gene sequencing based on polymerase chain reaction or electromigration, especially capillary zone electrophoresis and capillary isoelectric focusing. However, there are still several challenges with the analysis of microbial complexes using electromigration technology, such as uncontrolled aggregation and/or adhesion to the capillary surface. Thus, an approach using capillary electrophoresis of microbial aggregates with UV and matrix-assisted laser desorption ionization time-of-flight MS detection is presented.
Robins, Judith H.; Tintinger, Vernon; Aplin, Ken P.; Hingston, Melanie; Matisoo-Smith, Elizabeth; Penny, David; Lavery, Shane D.
2014-01-01
The genus Rattus is highly speciose, the taxonomy is complex, and individuals are often difficult to identify to the species level. Previous studies have demonstrated the usefulness of phylogenetic approaches to identification in Rattus but some species, especially among the endemics of the New Guinean region, showed poor resolution. Possible reasons for this are simple misidentification, incomplete gene lineage sorting, hybridization, and phylogenetically distinct lineages that are unrecognised taxonomically. To assess these explanations we analysed 217 samples, representing nominally 25 Rattus species, collected in New Guinea, Asia, Australia and the Pacific. To reduce misidentification problems we sequenced museum specimens from earlier morphological studies and recently collected tissues from samples with associated voucher specimens. We also reassessed vouchers from previously sequenced specimens. We inferred combined and separate phylogenies from two mitochondrial DNA regions comprising 550 base pair D-loop sequences and both long (655 base pair) and short (150 base pair) cytochrome oxidase I sequences. Our phylogenetic species identification for 17 species was consistent with morphological designations and current taxonomy thus reinforcing the usefulness of this approach. We reduced misidentifications and consequently the number of polyphyletic species in our phylogenies but the New Guinean Rattus clades still exhibited considerable complexity. Only three of our eight New Guinean species were monophyletic. We found good evidence for either incomplete mitochondrial lineage sorting or hybridization between species within two pairs, R. leucopus/R. cf. verecundus and R. steini/R. praetor. Additionally, our results showed that R. praetor, R. niobe and R. verecundus each likely encompass more than one species. 
Our study clearly points to the need for a revised taxonomy of the rats of New Guinea, based on broader sampling and informed by both morphology and phylogenetics. The remaining taxonomic complexity highlights the recent and rapid radiation of Rattus in the Australo-Papuan region. PMID:24865350
Falkner, Jayson; Andrews, Philip
2005-05-15
Comparing tandem mass spectra (MSMS) against a known dataset of protein sequences is a common method for identifying unknown proteins; however, the processing of MSMS by current software often limits certain applications, including comprehensive coverage of post-translational modifications, non-specific searches and real-time searches to allow result-dependent instrument control. This problem deserves attention as new mass spectrometers provide the ability for higher throughput and as known protein datasets rapidly grow in size. New software algorithms need to be devised in order to address the performance issues of conventional MSMS protein dataset-based protein identification. This paper describes a novel algorithm based on converting a collection of monoisotopic, centroided spectra to a new data structure, named 'peptide finite state machine' (PFSM), which may be used to rapidly search a known dataset of protein sequences, regardless of the number of spectra searched or the number of potential modifications examined. The algorithm is verified using a set of commercially available tryptic digest protein standards analyzed using an ABI 4700 MALDI TOFTOF mass spectrometer, and a free, open source PFSM implementation. It is illustrated that a PFSM can accurately search large collections of spectra against large datasets of protein sequences (e.g. NCBI nr) using a regular desktop PC; however, this paper only details the method for identifying peptide and subsequently protein candidates from a dataset of known protein sequences. The concept of using a PFSM as a peptide pre-screening technique for MSMS-based search engines is validated by using PFSM with Mascot and XTandem. Complete source code, documentation and examples for the reference PFSM implementation are freely available at the Proteome Commons, http://www.proteomecommons.org and source code may be used both commercially and non-commercially as long as the original authors are credited for their work.
A rule of seven in Watson-Crick base-pairing of mismatched sequences.
Cisse, Ibrahim I; Kim, Hajin; Ha, Taekjip
2012-05-13
Sequence recognition through base-pairing is essential for DNA repair and gene regulation, but the basic rules governing this process remain elusive. In particular, the kinetics of annealing between two imperfectly matched strands is not well characterized, despite its potential importance in nucleic acid-based biotechnologies and gene silencing. Here we use single-molecule fluorescence to visualize the multiple annealing and melting reactions of two untethered strands inside a porous vesicle, allowing us to precisely quantify the annealing and melting rates. The data as a function of mismatch position suggest that seven contiguous base pairs are needed for rapid annealing of DNA and RNA. This phenomenological rule of seven may underlie the requirement for seven nucleotides of complementarity to seed gene silencing by small noncoding RNA and may help guide performance improvement in DNA- and RNA-based bio- and nanotechnologies, in which off-target effects can be detrimental.
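The rule of seven described above can be expressed as a simple check on the longest run of contiguous Watson-Crick pairs. The sketch below is illustrative only: it assumes two DNA strands of equal length aligned in full register without gaps, and the names `longest_paired_run` and `meets_rule_of_seven` are hypothetical, not taken from the paper.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def longest_paired_run(strand: str, partner: str) -> int:
    """Longest run of contiguous Watson-Crick pairs between two strands.

    Both strands are written 5'->3'; the partner is read in reverse so
    that antiparallel positions are compared.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must have equal length in this sketch")
    best = run = 0
    for a, b in zip(strand.upper(), reversed(partner.upper())):
        if COMPLEMENT.get(a) == b:
            run += 1
            best = max(best, run)
        else:
            run = 0  # a mismatch breaks the contiguous stretch
    return best


def meets_rule_of_seven(strand: str, partner: str, seed: int = 7) -> bool:
    """True if at least `seed` contiguous base pairs can form."""
    return longest_paired_run(strand, partner) >= seed
```

For example, a nine-base duplex with a single central mismatch leaves two runs of four pairs, so it would fall below the seven-pair threshold in this model.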
Marques, S; Huss, V A R; Pfisterer, K; Grosse, C; Thompson, G
2015-05-01
The increasing incidence of rare mastitis-causing pathogens has urged the implementation of fast and efficient diagnostic and control measures. Prototheca algae are known to be associated with diseases in humans and animals. In the latter, the most prevalent form of protothecosis is bovine mastitis with Prototheca zopfii and Prototheca blaschkeae representing the most common pathogenic species. These nonphotosynthetic and colorless green algae are ubiquitous in different environments and are widely resistant against harmful conditions and antimicrobials. Hence, the association of Prototheca with bovine mastitis represents a herd problem, requiring fast and easy identification of the infectious agent. The purpose of this study was to develop a reliable and rapid method, based on the internal transcribed spacer (ITS) sequences of ribosomal DNA, for molecular identification and discrimination between P. zopfii and P. blaschkeae in bovine mastitic milk. The complete ITS sequences of 32 Prototheca isolates showed substantial interspecies but moderate intraspecies variability facilitating the design of species-specific PCR amplification primers. The species-specific PCR was successfully applied to the identification of P. zopfii and P. blaschkeae directly from milk samples. The intraspecific ITS phylogeny was compared for each species with the geographical distribution of the respective Prototheca isolates, but no significant correlation was found. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
High-Resolution Melt Analysis for Rapid Comparison of Bacterial Community Compositions
Hjelmsø, Mathis Hjort; Hansen, Lars Hestbjerg; Bælum, Jacob; Feld, Louise; Holben, William E.
2014-01-01
In the study of bacterial community composition, 16S rRNA gene amplicon sequencing is today among the preferred methods of analysis. The cost of nucleotide sequence analysis, including requisite computational and bioinformatic steps, however, takes up a large part of many research budgets. High-resolution melt (HRM) analysis is the study of the melt behavior of specific PCR products. Here we describe a novel high-throughput approach in which we used HRM analysis targeting the 16S rRNA gene to rapidly screen multiple complex samples for differences in bacterial community composition. We hypothesized that HRM analysis of amplified 16S rRNA genes from a soil ecosystem could be used as a screening tool to identify changes in bacterial community structure. This hypothesis was tested using a soil microcosm setup exposed to a total of six treatments representing different combinations of pesticide and fertilization treatments. The HRM analysis identified a shift in the bacterial community composition in two of the treatments, both including the soil fumigant Basamid GR. These results were confirmed with both denaturing gradient gel electrophoresis (DGGE) analysis and 454-based 16S rRNA gene amplicon sequencing. HRM analysis was shown to be a fast, high-throughput technique that can serve as an effective alternative to gel-based screening methods to monitor microbial community composition. PMID:24610853
Neutral changes during divergent evolution of hemoglobins
NASA Technical Reports Server (NTRS)
Jukes, T. H.
1978-01-01
A comparison of the mRNAs for rabbit and human beta-hemoglobins shows that synonymous changes in codons have accumulated three times as rapidly as nucleotide replacements that produced changes in amino acids. This agrees with predictions based on the so-called neutral theory. In addition, seven codon changes that appear to be single-base changes (according to maximum parsimony) are actually two-base changes. This indicates that the construction of primordial sequences is of limited significance when based on inferences that assume minimum base changes for amino acid replacements.
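The distinction between synonymous codon changes and amino acid replacements can be sketched as follows. This is a simplified count per codon under the standard genetic code, assuming two aligned, in-frame coding sequences of equal length; it is not the analysis used in the paper, and `classify_codon_changes` is a hypothetical name.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(seq_a: str, seq_b: str) -> dict:
    """Count differing codons as synonymous or amino acid replacements."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, in frame, equal length")
    counts = {"synonymous": 0, "replacement": 0}
    for i in range(0, len(seq_a), 3):
        # Accept mRNA input by mapping U to T before the table lookup.
        codon_a = seq_a[i:i + 3].upper().replace("U", "T")
        codon_b = seq_b[i:i + 3].upper().replace("U", "T")
        if codon_a == codon_b:
            continue
        same_aa = CODON_TABLE[codon_a] == CODON_TABLE[codon_b]
        counts["synonymous" if same_aa else "replacement"] += 1
    return counts
```

Note that a codon differing at two positions is still counted once here, which is exactly the kind of case the abstract warns can be misread as a single-base change.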
Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, James Weifu; Meller, Amit
2007-01-01
Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.
Development of a high-speed real-time PCR system for rapid and precise nucleotide recognition
NASA Astrophysics Data System (ADS)
Terazono, Hideyuki; Takei, Hiroyuki; Hattori, Akihiro; Yasuda, Kenji
2010-04-01
Polymerase chain reaction (PCR) is a common method used to create copies of a specific target region of a DNA sequence and to produce large quantities of DNA. A few DNA molecules, which act as templates, are rapidly amplified by PCR into many billions of copies. PCR is a key technology in genome-based biological analysis, revolutionizing many life science fields such as medical diagnostics, food safety monitoring, and countermeasures against bioterrorism. Many applications have therefore been developed around thermal cycling. For these PCR applications, one of the most important factors is reducing the data acquisition time. To reduce the acquisition time, it is necessary to decrease the temperature transition time between the high and low ends as much as possible. We have developed a novel rapid real-time PCR system based on rapid exchange of media maintained at different temperatures. This system consists of two thermal reservoirs and a reaction chamber for PCR observation. The temperature transition was achieved within 0.3 sec, and good thermal stability was achieved during thermal cycling with rapid exchange of circulating media. This system allows rigorous optimization of the temperatures required for each stage of the PCR processes. Resulting amplicons were confirmed by electrophoresis. Using the system, rapid DNA amplification was accomplished within 3.5 min, including initial heating and 50 complete PCR cycles. This shows that the device allows faster temperature switching than conventional conduction-based systems that rely on Peltier heating/cooling.
Wilson, Kitchener D; Shen, Peidong; Fung, Eula; Karakikes, Ioannis; Zhang, Angela; InanlooRahatloo, Kolsoum; Odegaard, Justin; Sallam, Karim; Davis, Ronald W; Lui, George K; Ashley, Euan A; Scharfe, Curt; Wu, Joseph C
2015-09-11
Thousands of mutations across >50 genes have been implicated in inherited cardiomyopathies. However, options for sequencing this rapidly evolving gene set are limited because many sequencing services and off-the-shelf kits suffer from slow turnaround, inefficient capture of genomic DNA, and high cost. Furthermore, customization of these assays to cover emerging targets that suit individual needs is often expensive and time consuming. We sought to develop a custom high throughput, clinical-grade next-generation sequencing assay for detecting cardiac disease gene mutations with improved accuracy, flexibility, turnaround, and cost. We used double-stranded probes (complementary long padlock probes), an inexpensive and customizable capture technology, to efficiently capture and amplify the entire coding region and flanking intronic and regulatory sequences of 88 genes and 40 microRNAs associated with inherited cardiomyopathies, congenital heart disease, and cardiac development. Multiplexing 11 samples per sequencing run resulted in a mean base pair coverage of 420, of which 97% had >20× coverage and >99% were concordant with known heterozygous single nucleotide polymorphisms. The assay correctly detected germline variants in 24 individuals and revealed several polymorphic regions in miR-499. Total run time was 3 days at an approximate cost of $100 per sample. Accurate, high-throughput detection of mutations across numerous cardiac genes is achievable with complementary long padlock probe technology. Moreover, this format allows facile insertion of additional probes as more cardiomyopathy and congenital heart disease genes are discovered, giving researchers a powerful new tool for DNA mutation detection and discovery. © 2015 American Heart Association, Inc.
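The coverage figures quoted above (mean coverage and the share of bases above 20×) are simple summaries of per-base read depth. A minimal sketch, assuming a plain list of per-base depths as input; `coverage_summary` is a hypothetical helper and not part of the published assay.

```python
def coverage_summary(depths, threshold=20):
    """Return (mean depth, fraction of bases with depth above threshold)."""
    if not depths:
        raise ValueError("no depth values supplied")
    mean_depth = sum(depths) / len(depths)
    fraction_above = sum(d > threshold for d in depths) / len(depths)
    return mean_depth, fraction_above


if __name__ == "__main__":
    # Toy depths for four bases; real input would come from an alignment.
    print(coverage_summary([10, 30, 50, 430]))
```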
Yu, Zhongtang; Yu, Marie; Morrison, Mark
2006-04-01
Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no less than 236 unique phylotypes (based on ≥95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.
Temporal Precision of Neuronal Information in a Rapid Perceptual Judgment
Ghose, Geoffrey M.; Harrison, Ian T.
2009-01-01
In many situations, such as pedestrians crossing a busy street or prey evading predators, rapid decisions based on limited perceptual information are critical for survival. The brevity of these perceptual judgments constrains how neuronal signals are integrated or pooled over time because the underlying sequence of processes, from sensation to perceptual evaluation to motor planning and execution, all occur within several hundred milliseconds. Because most previous physiological studies of these processes have relied on tasks requiring considerably longer temporal integration, the neuronal basis of such rapid decisions remains largely unexplored. In this study, we examine the temporal precision of neuronal activity associated with a rapid perceptual judgment. We find that the activity of individual neurons over tens of milliseconds can reliably convey information about sensory events and was well correlated with the animals' judgments. There was a strong correlation between sensory reliability and the correlation with behavioral choice, suggesting that rapid decisions were preferentially based on the most reliable sensory signals. We also find that a simple model in which the responses of a small number of individual neurons (<5) are summed can completely explain behavioral performance. These results suggest that neuronal circuits are sufficiently precise to allow for cognitive decisions to be based on small numbers of action potentials from highly reliable neurons. PMID:19109454
Temporal precision of neuronal information in a rapid perceptual judgment.
Ghose, Geoffrey M; Harrison, Ian T
2009-03-01
In many situations, such as pedestrians crossing a busy street or prey evading predators, rapid decisions based on limited perceptual information are critical for survival. The brevity of these perceptual judgments constrains how neuronal signals are integrated or pooled over time because the underlying sequence of processes, from sensation to perceptual evaluation to motor planning and execution, all occur within several hundred milliseconds. Because most previous physiological studies of these processes have relied on tasks requiring considerably longer temporal integration, the neuronal basis of such rapid decisions remains largely unexplored. In this study, we examine the temporal precision of neuronal activity associated with a rapid perceptual judgment. We find that the activity of individual neurons over tens of milliseconds can reliably convey information about sensory events and was well correlated with the animals' judgments. There was a strong correlation between sensory reliability and the correlation with behavioral choice, suggesting that rapid decisions were preferentially based on the most reliable sensory signals. We also find that a simple model in which the responses of a small number of individual neurons (<5) are summed can completely explain behavioral performance. These results suggest that neuronal circuits are sufficiently precise to allow for cognitive decisions to be based on small numbers of action potentials from highly reliable neurons.
Blom, Mozes P K
2015-08-05
Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.
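The idea of filtering alignments by adjustable missing-data criteria can be sketched as below. This is not the author's pipeline: the function names, the treatment of `-`, `N` and `?` as missing data, and the default thresholds are all assumptions chosen for illustration.

```python
MISSING = set("-N?")  # characters treated as missing data (an assumption)


def missing_fraction(seq: str) -> float:
    """Fraction of alignment columns that are gaps or ambiguous in one sequence."""
    return sum(ch in MISSING for ch in seq.upper()) / len(seq)


def filter_alignment(alignment: dict, max_missing_per_seq: float = 0.5,
                     min_taxa: int = 4):
    """Drop sequences with too much missing data.

    Returns the retained sequences, or None when fewer than `min_taxa`
    remain and the whole locus should be discarded.
    """
    kept = {
        name: seq
        for name, seq in alignment.items()
        if seq and missing_fraction(seq) <= max_missing_per_seq
    }
    return kept if len(kept) >= min_taxa else None
```

Running the same set of loci through several threshold combinations is one way to compare phylogenetic estimates across datasets with different degrees of missing data, as the abstract suggests.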
Rowe, Will; Baker, Kate S; Verner-Jeffreys, David; Baker-Austin, Craig; Ryan, Jim J; Maskell, Duncan; Pearce, Gareth
2015-01-01
Antimicrobial resistance remains a growing and significant concern in human and veterinary medicine. Current laboratory methods for the detection and surveillance of antimicrobial resistant bacteria are limited in their effectiveness and scope. With the rapidly developing field of whole genome sequencing beginning to be utilised in clinical practice, the ability to interrogate sequencing data quickly and easily for the presence of antimicrobial resistance genes will become increasingly important and useful for informing clinical decisions. Additionally, use of such tools will provide insight into the dynamics of antimicrobial resistance genes in metagenomic samples such as those used in environmental monitoring. Here we present the Search Engine for Antimicrobial Resistance (SEAR), a pipeline and web interface for detection of horizontally acquired antimicrobial resistance genes in raw sequencing data. The pipeline provides gene information, abundance estimation and the reconstructed sequence of antimicrobial resistance genes; it also provides web links to additional information on each gene. The pipeline utilises clustering and read mapping to annotate full-length genes relative to a user-defined database. It also uses local alignment of annotated genes to a range of online databases to provide additional information. We demonstrate SEAR's application in the detection and abundance estimation of antimicrobial resistance genes in two novel environmental metagenomes, 32 human faecal microbiome datasets and 126 clinical isolates of Shigella sonnei. We have developed a pipeline that contributes to the improved capacity for antimicrobial resistance detection afforded by next generation sequencing technologies, allowing for rapid detection of antimicrobial resistance genes directly from sequencing data. SEAR uses raw sequencing data via an intuitive interface so can be run rapidly without requiring advanced bioinformatic skills or resources. 
Finally, we show that SEAR is effective in detecting antimicrobial resistance genes in metagenomic and isolate sequencing data from both environmental metagenomes and sequencing data from clinical isolates.
McAlpine, James B
2009-03-27
Over the past decade major changes have occurred in the access to genome sequences that encode the enzymes responsible for the biosynthesis of secondary metabolites, knowledge of how those sequences translate into the final structure of the metabolite, and the ability to alter the sequence to obtain predicted products via both homologous and heterologous expression. Novel genera have been discovered leading to new chemotypes, but more surprisingly several instances have been uncovered where the apparently general rules of modular translation have not applied. Several new biosynthetic pathways have been unearthed, and our general knowledge grows rapidly. This review aims to highlight some of the more striking discoveries and advances of the decade.
Genotyping of Chromobacterium violaceum isolates by recA PCR-RFLP analysis.
Scholz, Holger Christian; Witte, Angela; Tomaso, Herbert; Al Dahouk, Sascha; Neubauer, Heinrich
2005-03-15
Intraspecies variation of Chromobacterium violaceum was examined by comparative sequence analysis and by restriction fragment length polymorphism analysis of the recombinase A gene (recA-PCR-RFLP). Primers deduced from the known recA gene sequence of the type strain C. violaceum ATCC 12472(T) allowed the specific amplification of a 1040bp recA fragment from each of the 13 C. violaceum strains investigated, whereas other closely related organisms tested negative. HindII-PstI-recA RFLP analysis generated from 13 representative C. violaceum strains enabled us to identify at least three different genospecies. In conclusion, analysis of the recA gene provides a rapid and robust nucleotide sequence-based approach to specifically identify and classify C. violaceum at the genospecies level.
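The RFLP step described above can be illustrated with an in silico double digest. This is a minimal sketch, not the authors' procedure: the toy sequence is invented for illustration (it is not a recA sequence), and the recognition sites and cut offsets (HindII GTY^RAC, PstI CTGCA^G) are supplied here as assumptions about the enzymes rather than taken from the abstract.

```python
import re

# Recognition pattern and cut offset (bases from the start of the site).
# Assumed here: HindII cuts GTY^RAC, PstI cuts CTGCA^G.
ENZYMES = {
    "HindII": ("GT[CT][AG]AC", 3),
    "PstI": ("CTGCAG", 5),
}


def digest(seq, enzyme_names):
    """Return fragment lengths, in order, for a linear molecule cut by all enzymes."""
    cuts = set()
    for name in enzyme_names:
        pattern, offset = ENZYMES[name]
        # Lookahead so that overlapping sites are also found.
        for match in re.finditer(f"(?={pattern})", seq):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 24-base toy sequence with one HindII site and one PstI site.
toy = "AAAA" + "GTCAAC" + "TTTTT" + "CTGCAG" + "CCC"
print(digest(toy, ["HindII", "PstI"]))
```

Strains whose amplicons yield different fragment-length patterns would be assigned to different RFLP types; fragment lengths always sum to the amplicon length.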
Sequencing Structural Variants in Cancer for Precision Therapeutics.
Macintyre, Geoff; Ylstra, Bauke; Brenton, James D
2016-09-01
The identification of mutations that guide therapy selection for patients with cancer is now routine in many clinical centres. The majority of assays used for solid tumour profiling use DNA sequencing to interrogate somatic point mutations because they are relatively easy to identify and interpret. Many cancers, however, including high-grade serous ovarian, oesophageal, and small-cell lung cancer, are driven by somatic structural variants that are not measured by these assays. Therefore, there is currently an unmet need for clinical assays that can cheaply and rapidly profile structural variants in solid tumours. In this review we survey the landscape of 'actionable' structural variants in cancer and identify promising detection strategies based on massively-parallel sequencing. Copyright © 2016 Elsevier Ltd. All rights reserved.
Structural genomics: keeping up with expanding knowledge of the protein universe.
Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek
2007-06-01
Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space--a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a re-assessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006.
2014-01-01
Background: Recent innovations in sequencing technologies have provided researchers with the ability to rapidly characterize the microbial content of an environmental or clinical sample with unprecedented resolution. These approaches are producing a wealth of information that is providing novel insights into the microbial ecology of the environment and human health. However, these sequencing-based approaches produce large and complex datasets that require efficient and sensitive computational analysis workflows. Many recent tools for analyzing metagenomic-sequencing data have emerged; however, these approaches often suffer from issues of specificity and efficiency, and typically do not include a complete metagenomic analysis framework. Results: We present PathoScope 2.0, a complete bioinformatics framework for rapidly and accurately quantifying the proportions of reads from individual microbial strains present in metagenomic sequencing data from environmental or clinical samples. The pipeline performs all necessary computational analysis steps, including reference genome library extraction and indexing, read quality control and alignment, strain identification, and summarization and annotation of results. We rigorously evaluated PathoScope 2.0 using simulated data and data from the 2011 outbreak of Shiga-toxigenic Escherichia coli O104:H4. Conclusions: The results show that PathoScope 2.0 is a complete, highly sensitive, and efficient approach for metagenomic analysis that outperforms alternative approaches in scope, speed, and accuracy. The PathoScope 2.0 pipeline software is freely available for download at: http://sourceforge.net/projects/pathoscope/. PMID:25225611
ERIC Educational Resources Information Center
Heath, Steve M.; Hogben, John H.
2004-01-01
Background: Claims that children with reading and oral language deficits have impaired perception of sequential sounds are usually based on psychophysical measures of auditory temporal processing (ATP) designed to characterise group performance. If we are to use these measures (e.g., the Tallal, 1980, Repetition Test) as the basis for intervention…
Lagkouvardos, Ilias; Joseph, Divya; Kapfhammer, Martin; Giritli, Sabahattin; Horn, Matthias; Haller, Dirk; Clavel, Thomas
2016-09-23
The SRA (Sequence Read Archive) serves as the primary depository for massive amounts of Next Generation Sequencing data, and currently hosts over 100,000 16S rRNA gene amplicon-based microbial profiles from various host habitats and environments. This number is increasing rapidly and there is a dire need for approaches to utilize this pool of knowledge. Here we created IMNGS (Integrated Microbial Next Generation Sequencing), an innovative platform that uniformly and systematically screens for and processes all prokaryotic 16S rRNA gene amplicon datasets available in SRA and uses them to build sample-specific sequence databases and OTU-based profiles. Via a web interface, this integrative sequence resource can easily be queried by users. We show examples of how the approach allows testing the ecological importance of specific microorganisms in different hosts or ecosystems, and performing targeted diversity studies for selected taxonomic groups. The platform also offers a complete workflow for de novo analysis of users' own raw 16S rRNA gene amplicon datasets for the sake of comparison with existing data. IMNGS can be accessed at www.imngs.org.
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.
Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy
2006-10-25
Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
USDA-ARS's Scientific Manuscript database
Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...
Guo, Bingfu; Guo, Yong; Hong, Huilong; Qiu, Li-Juan
2016-01-01
Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organisms (GMOs). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on a whole genome sequencing (WGS) method. More than 22.4 Gb of sequence data (∼21× coverage) for each line was generated on the Illumina HiSeq 2500 platform. The junction reads mapped to boundaries of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with the soybean reference genome and the sequence of the transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated at positions Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of genomic insertion sites of G2-EPSPS and GAT transgenes will facilitate the utilization of their glyphosate-tolerant traits in soybean breeding programs. These results also demonstrated that WGS was a cost-effective and rapid method for identifying sites of T-DNA insertions and flanking sequences in soybean.
Quantifying the relationship between sequence and three-dimensional structure conservation in RNA
2010-01-01
Background: In recent years, the number of available RNA structures has rapidly grown, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion: The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657
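The 60% sequence identity threshold mentioned above depends on how identity is computed from a pairwise alignment. A minimal sketch of one common convention (identical columns divided by columns where neither sequence has a gap) follows; the aligned sequences are invented for illustration, and the convention is an assumption here, since the abstract does not state which definition was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        identical += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned RNA fragments (not from any real structure).
a = "GGAUCCAUGC"
b = "GGAUGC-UCC"
identity = percent_identity(a, b)
print(f"{identity:.1f}% identity; below 60% threshold: {identity < 60.0}")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different values for the same alignment, so a threshold is only comparable across studies that share a definition.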
Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes.
Oyola, Samuel O; Otto, Thomas D; Gu, Yong; Maslen, Gareth; Manske, Magnus; Campino, Susana; Turner, Daniel J; Macinnis, Bronwyn; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A
2012-01-03
Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extreme of base composition.
This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.
Tarantino, Mary E; Bilotti, Katharina; Huang, Ji; Delaney, Sarah
2015-08-21
Flap endonuclease 1 (FEN1) is a structure-specific nuclease responsible for removing 5'-flaps formed during Okazaki fragment maturation and long patch base excision repair. In this work, we use rapid quench flow techniques to examine the rates of 5'-flap removal on DNA substrates of varying length and sequence. Of particular interest are flaps containing trinucleotide repeats (TNR), which have been proposed to affect FEN1 activity and cause genetic instability. We report that FEN1 processes substrates containing flaps of 30 nucleotides or fewer at comparable single-turnover rates. However, for flaps longer than 30 nucleotides, FEN1 kinetically discriminates substrates based on flap length and flap sequence. In particular, FEN1 removes flaps containing TNR sequences at a rate slower than mixed sequence flaps of the same length. Furthermore, multiple-turnover kinetic analysis reveals that the rate-determining step of FEN1 switches as a function of flap length from product release to chemistry (or a step prior to chemistry). These results provide a kinetic perspective on the role of FEN1 in DNA replication and repair and contribute to our understanding of FEN1 in mediating genetic instability of TNR sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.
The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
Bhattacharyya, Anamitra; Stilwagen, Stephanie; Reznik, Gary; Feil, Helene; Feil, William S; Anderson, Iain; Bernal, Axel; D'Souza, Mark; Ivanova, Natalia; Kapatral, Vinayak; Larsen, Niels; Los, Tamara; Lykidis, Athanasios; Selkov, Eugene; Walunas, Theresa L; Purcell, Alexander; Edwards, Rob A; Hawkins, Trevor; Haselkorn, Robert; Overbeek, Ross; Kyrpides, Nikos C; Predki, Paul F
2002-10-01
Draft sequencing is a rapid and efficient method for determining the near-complete sequence of microbial genomes. Here we report a comparative analysis of one complete and two draft genome sequences of the phytopathogenic bacterium, Xylella fastidiosa, which causes serious disease in plants, including citrus, almond, and oleander. We present highlights of an in silico analysis based on a comparison of reconstructions of core biological subsystems. Cellular pathway reconstructions have been used to identify a small number of genes, which are likely to reside within the draft genomes but are not captured in the draft assembly. These represented only a small fraction of all genes and were predominantly large and small ribosomal subunit protein components. By using this approach, some of the inherent limitations of draft sequence can be significantly reduced. Despite the incomplete nature of the draft genomes, it is possible to identify several phage-related genes, which appear to be absent from the draft genomes and not the result of insufficient sequence sampling. This region may therefore identify potential host-specific functions. Based on this first functional reconstruction of a phytopathogenic microbe, we spotlight an unusual respiration machinery as a potential target for biological control. We also predicted and developed a new defined growth medium for Xylella.
Pham, Nikki T.; Wei, Tong; Schackwitz, Wendy S.; Lipzen, Anna M.; Duong, Phat Q.; Jones, Kyle C.; Ruan, Deling; Bauer, Diane; Peng, Yi; Schmutz, Jeremy
2017-01-01
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations. PMID:28576844
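The summary statistics quoted above can be cross-checked with simple arithmetic. The sketch below uses only figures from the abstract; the implied total gene count is derived here from the 58% figure and is not a number the abstract states.

```python
# Figures quoted in the abstract.
MUTANT_LINES = 1504
TOTAL_MUTATIONS = 91_513
GENES_AFFECTED = 32_307
FRACTION_OF_GENES = 0.58  # "58% of all rice genes"


def mutations_per_line(total_mutations, lines):
    """Average number of mutations carried by each sequenced line."""
    return total_mutations / lines


def implied_gene_count(genes_affected, fraction):
    """Total gene count implied by the affected genes and their stated share."""
    return genes_affected / fraction


average = mutations_per_line(TOTAL_MUTATIONS, MUTANT_LINES)
implied = implied_gene_count(GENES_AFFECTED, FRACTION_OF_GENES)
print(f"average mutations per line: {average:.1f}")
print(f"implied total genes (derived, approximate): {implied:.0f}")
```

The average rounds to the 61 mutations per line reported in the abstract; the derived gene total is only as precise as the rounded 58% figure.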
Dereeper, Alexis; Nicolas, Stéphane; Le Cunff, Loïc; Bacilieri, Roberto; Doligez, Agnès; Peros, Jean-Pierre; Ruiz, Manuel; This, Patrice
2011-05-05
High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data. In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates: 1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNP and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D). Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. 2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects.
It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats. Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies. SNiPlay is available at: http://sniplay.cirad.fr/.
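The diversity indices named above (Pi, Watterson's Theta, Tajima's D) can be sketched from their textbook definitions. This is not SNiPlay's implementation: it assumes a gap-free alignment of haploid sequences of equal length, reports Pi and Theta per locus rather than per site, and uses an invented toy alignment.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns carrying more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all sequence pairs (Pi, per locus)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def watterson_theta(seqs):
    """Watterson's estimator: S divided by the harmonic number a1."""
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1


def tajimas_d(seqs):
    """Tajima's D; None when undefined (fewer than 4 sequences or S == 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: four sequences, two singleton variants.
alignment = ["ATGCATGC", "ATGCATGA", "ATGTATGC", "ATGCATGC"]
print(segregating_sites(alignment), mean_pairwise_differences(alignment))
print(watterson_theta(alignment), tajimas_d(alignment))
```

In the toy alignment both variants are singletons, so Pi falls below Theta and D is negative, the pattern usually read as an excess of rare alleles.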
Functional Brain Activation Differences in Stuttering Identified with a Rapid fMRI Sequence
ERIC Educational Resources Information Center
Loucks, Torrey; Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.
2011-01-01
The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech…
Maluping, R P; Ravelo, C; Lavilla-Pitogo, C R; Krovacek, K; Romalde, J L
2005-01-01
The main aim of the present study was to use three PCR-based techniques for the analysis of genetic variability among Vibrio parahaemolyticus strains isolated from the Philippines. Seventeen strains of V. parahaemolyticus isolated from shrimps (Penaeus monodon) and from the environments where these shrimps are being cultivated were analysed by random amplified polymorphic DNA PCR (RAPD-PCR), enterobacterial repetitive intergenic consensus sequence PCR (ERIC-PCR) and repetitive extragenic palindromic PCR (REP-PCR). The results of this work have demonstrated genetic variability within the V. parahaemolyticus strains that were isolated from the Philippines. In addition, RAPD, ERIC and REP-PCR are suitable rapid typing methods for V. parahaemolyticus. All three methods have good discriminative ability and can be used as a rapid means of comparing V. parahaemolyticus strains for epidemiological investigation. Based on the results of this study, REP-PCR is inferior to RAPD and ERIC-PCR because it is less reproducible. Moreover, the REP-PCR analysis yielded a relatively small number of products. This may suggest that the REP sequences are not widely distributed in the V. parahaemolyticus genome. Genetic variability within V. parahaemolyticus strains isolated in the Philippines has been demonstrated. The presence of ERIC and REP sequences in the genome of this bacterial species was confirmed. The RAPD, ERIC and REP-PCR techniques are useful methods for molecular typing of V. parahaemolyticus strains. To our knowledge this is the first study of this kind carried out on V. parahaemolyticus strains isolated from the Philippines.
Liachko, Ivan; Youngblood, Rachel A.; Keich, Uri; Dunham, Maitreya J.
2013-01-01
DNA replication origins are necessary for the duplication of genomes. In addition, plasmid-based expression systems require DNA replication origins to maintain plasmids efficiently. The yeast autonomously replicating sequence (ARS) assay has been a valuable tool in dissecting replication origin structure and function. However, the dearth of information on origins in diverse yeasts limits the availability of efficient replication origin modules to only a handful of species and restricts our understanding of origin function and evolution. To enable rapid study of origins, we have developed a sequencing-based suite of methods for comprehensively mapping and characterizing ARSs within a yeast genome. Our approach finely maps genomic inserts capable of supporting plasmid replication and uses massively parallel deep mutational scanning to define molecular determinants of ARS function with single-nucleotide resolution. In addition to providing unprecedented detail into origin structure, our data have allowed us to design short, synthetic DNA sequences that retain maximal ARS function. These methods can be readily applied to understand and modulate ARS function in diverse systems. PMID:23241746
QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
Gudyś, Adam; Deorowicz, Sebastian
2017-01-01
The ever-increasing size of sequence databases caused by the development of high throughput sequencing poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and utilization of massively parallel architectures, the presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All these make QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins. PMID:28139687
Emerman, Amy B; Bowman, Sarah K; Barry, Andrew; Henig, Noa; Patel, Kruti M; Gardner, Andrew F; Hendrickson, Cynthia L
2017-07-05
Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
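The role of unique molecular identifiers (UMIs) in filtering PCR duplicates can be sketched as follows. This is a simplified illustration, not the software used with NEBNext Direct: reads are treated as duplicates only when chromosome, start position, strand, and UMI all match exactly, and the tuple layout and read names are hypothetical. Production tools typically also tolerate sequencing errors in the UMI, which this sketch does not.

```python
def deduplicate(reads):
    """Keep the first read seen for each (chrom, start, strand, UMI) key.

    `reads` is an iterable of (chrom, start, strand, umi, read_id) tuples;
    the function returns the read_ids retained, in input order.
    """
    seen = set()
    kept = []
    for chrom, start, strand, umi, read_id in reads:
        key = (chrom, start, strand, umi)
        if key not in seen:
            seen.add(key)
            kept.append(read_id)
    return kept


# Hypothetical reads: r2 is a PCR duplicate of r1 (same position and UMI);
# r3 maps to the same position but carries a different UMI, so it is kept
# as an independent original molecule.
reads = [
    ("chr1", 1000, "+", "ACGTAC", "r1"),
    ("chr1", 1000, "+", "ACGTAC", "r2"),
    ("chr1", 1000, "+", "TTGACA", "r3"),
    ("chr2", 500, "-", "ACGTAC", "r4"),
]
print(deduplicate(reads))
```

Without the UMI in the key, r3 would be discarded along with r2, which is why position-only duplicate marking can undercount molecules at high-coverage targets.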
Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M
2016-01-01
Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, panpoliovirus (panPV) PCR primer set is used to distinguish polioviruses from nonpolio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures.
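The idea of matching variable template positions with mixed-base or deoxyinosine residues can be illustrated with a small matcher built on the standard IUPAC ambiguity codes. This is a sketch under stated assumptions: the primer and target sequences are invented (they are not the panPV primers), inosine (I) is modelled as pairing with any base, and an inosine position is counted as a single species when computing degeneracy because it is one residue rather than a mixture.

```python
# Standard IUPAC nucleotide ambiguity codes; "I" (inosine) is modelled
# here as compatible with any base, which is a simplifying assumption.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides in the synthesized primer pool."""
    count = 1
    for code in primer:
        if code != "I":  # inosine is one residue, not a mixture
            count *= len(IUPAC[code])
    return count


def primer_matches(primer, site):
    """True if every base of `site` is allowed by the primer code at that position."""
    if len(primer) != len(site):
        return False
    return all(base in IUPAC[code] for code, base in zip(primer, site))


# Hypothetical primer with one mixed base (Y) and one inosine (I).
primer = "GAYTGGATI"
print(degeneracy(primer))
print(primer_matches(primer, "GACTGGATC"), primer_matches(primer, "GAGTGGATC"))
```

Replacing a fourfold-degenerate N with inosine keeps the pool small (here 2 species instead of 8), which is one reason inosine is attractive at highly variable third-codon positions.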
Lauerman, Lloyd H
2004-12-01
Since the discovery of the polymerase chain reaction (PCR) 20 years ago, an avalanche of scientific publications has reported major developments and changes in specialized equipment, reagents, sample preparation, computer programs and techniques, generated through business, government and university research. The requirement for genetic sequences for primer selection and validation has been greatly facilitated by the development of new sequencing techniques, machines and computer programs. Genetic libraries, such as GenBank, EMBL and DDBJ, continue to accumulate a wealth of genetic sequence information for the development and validation of molecular-based diagnostic procedures concerning human and veterinary disease agents. The mechanization of various aspects of the PCR assay, such as robotics, microfluidics and nanotechnology, has made possible the rapid advancement of new procedures. Real-time PCR, DNA microarray and DNA chips utilize these newer techniques in conjunction with computers and computer programs. Instruments for hand-held PCR assays are being developed. The PCR and reverse transcription-PCR (RT-PCR) assays have greatly accelerated the speed and accuracy of diagnoses of human and animal disease, especially of the infectious agents that are difficult to isolate or demonstrate. The PCR has made it possible to genetically characterize a microbial isolate inexpensively and rapidly for identification, typing and epidemiological comparison.
Petzold, Markus; Ehricht, Ralf; Slickers, Peter; Pleischl, Stefan; Brockmann, Ansgar; Exner, Martin; Monecke, Stefan; Lück, Christian
2017-06-01
Between 1 August and 6 September 2013, an outbreak of Legionnaires' disease (LD) with 78 cases confirmed by positive urinary antigen tests occurred in Warstein, North Rhine-Westphalia, Germany. Legionella (L.) pneumophila, serogroup (Sg) 1, monoclonal antibody (mAb) subgroup Knoxville, sequence type (ST) 345, was identified as the epidemic strain. This strain was isolated from seven patients. To detect the source of the infection, epidemiological typing of clinical and environmental strains was performed in two consecutive steps. First, strains were typed by monoclonal antibodies. Indistinguishable strains were further subtyped by sequence-based typing (SBT) which is the internationally recognized standard method for epidemiological genotyping of L. pneumophila. In an early stage of the outbreak investigation, many environmental isolates were found to belong to the mAb subgroup Knoxville, but to two different STs, namely to ST 345, the epidemic strain, and to ST 600. A majority of environmental isolates belonged to ST 600 whereas the epidemic ST 345 strain was less common in environmental samples. To rapidly distinguish both Knoxville strains, we applied a novel typing method based on DNA-hybridization on glass chips. The new assay can easily and rapidly discriminate L. pneumophila Sg 1 strains. Thus, we were able to quickly identify the sources harboring the epidemic strain, i.e., two cooling towers of different companies, the waste water treatment plants (WWTP) of the city and one company as well as water samples of the river Wester and its branches. Copyright © 2016 Elsevier GmbH. All rights reserved.
Protocol for rapid sequence intubation in pediatric patients -- a four-year study.
Marvez-Valls, Eduardo; Houry, Debra; Ernst, Amy A; Weiss, Steven J; Killeen, James
2002-04-01
To evaluate a protocol for rapid sequence intubation (RSI) for pediatric patients in a Level 1 trauma center. Retrospective review of prospectively gathered Continuing Quality Improvement (CQI) data at an inner city Level 1 trauma center with an emergency medicine residency program. Protocols for RSI were established prior to initiating the study. All pediatric intubations at the center from February 1996 to February 2000 were included. Statistical analysis included descriptive statistics for categorical data and Chi-square for comparisons between groups. Over the 4-year study period there were 83 pediatric intubations ranging in age from 18 months to 17 years; mean age 8.6. All had data collected at the time of intubation. There were 20 (24%) females and 62 (76%) males (p<0.001). Reasons for intubation were related to trauma in 71 (86%) and medical reasons in 12 (14%) (p<0.001). Of the trauma intubations 7 (10%) were for gunshot wounds, 39 (55%) were secondary to MVCs, and the remainder (25; 35%) were from assaults, falls, and closed head injuries. The non-trauma intubations were for smoke inhalation, overdose, seizure, HIV related complications, eclampsia, and near drowning. Intubations were successful with one attempt in 65 (78%) cases. No surgical airways were necessary. Rocuronium was used in 4 cases. Protocol deviations did not lead to complications. This protocol based pediatric rapid sequence intubation method worked well in an EM residency program. More intubations were in males and more were necessary due to trauma in this group.
Sanz, Yolanda
2017-01-01
The miniaturized and portable DNA sequencer MinION™ has demonstrated great potential in different analyses such as genome-wide sequencing, pathogen outbreak detection and surveillance, human genome variability, and microbial diversity. In this study, we tested the ability of the MinION™ platform to perform long amplicon sequencing in order to design new approaches to study microbial diversity using a multi-locus approach. After compiling a robust database by parsing and extracting the rrn bacterial region from more than 67000 complete or draft bacterial genomes, we demonstrated that the data obtained during sequencing of the long amplicon in the MinION™ device using R9 and R9.4 chemistries were sufficient to study 2 mock microbial communities in a multiplex manner and to almost completely reconstruct the microbial diversity contained in the HM782D and D6305 mock communities. Although nanopore-based sequencing produces reads with lower per-base accuracy compared with other platforms, we presented a novel approach consisting of multi-locus and long amplicon sequencing using the MinION™ MkIb DNA sequencer and R9 and R9.4 chemistries that help to overcome the main disadvantage of this portable sequencing platform. Furthermore, the nanopore sequencing library, constructed with the last releases of pore chemistry (R9.4) and sequencing kit (SQK-LSK108), permitted the retrieval of the higher level of 1D read accuracy sufficient to characterize the microbial species present in each mock community analysed. Improvements in nanopore chemistry, such as minimizing base-calling errors and new library protocols able to produce rapid 1D libraries, will provide more reliable information in the near future. Such data will be useful for more comprehensive and faster specific detection of microbial species and strains in complex ecosystems. PMID:28605506
Recognizing human actions by learning and matching shape-motion prototype trees.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2012-03-01
A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.
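The look-up-table idea in the abstract above can be illustrated with a minimal sketch. It assumes prototypes are plain feature vectors compared by Euclidean distance and uses ordinary dynamic time warping as a stand-in for the paper's dynamic prototype sequence matching; the function names and toy prototypes are hypothetical and are not taken from the authors' implementation.

```python
import math


def build_distance_table(prototypes):
    """Precompute distances between every pair of prototypes (done once, after training)."""
    k = len(prototypes)
    return [[math.dist(prototypes[i], prototypes[j]) for j in range(k)] for i in range(k)]


def dtw_distance(seq_a, seq_b, table):
    """Dynamic time warping over two sequences of prototype labels.

    Every local cost is a table look-up rather than a frame-to-frame
    feature comparison, which is where the speed-up comes from.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = table[seq_a[i - 1]][seq_b[j - 1]]
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy prototypes (assumed 2-D features) and their distance table.
prototypes = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
table = build_distance_table(prototypes)
```

Because sequences are stored as prototype labels, matching a new sequence against many stored ones reuses the same small table instead of recomputing distances between raw frames.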
Tahayori, B; Khaneja, N; Johnston, L A; Farrell, P M; Mareels, I M Y
2016-01-01
The design of slice selective pulses for magnetic resonance imaging can be cast as an optimal control problem. The Fourier synthesis method is an existing approach to solve these optimal control problems. In this method the gradient field as well as the excitation field are switched rapidly and their amplitudes are calculated based on a Fourier series expansion. Here, we provide a novel insight into the Fourier synthesis method via representing the Bloch equation in spherical coordinates. Based on the spherical Bloch equation, we propose an alternative sequence of pulses that can be used for slice selection which is more time efficient compared to the original method. Simulation results demonstrate that while the performance of both methods is approximately the same, the required time for the proposed sequence of pulses is half of the original sequence of pulses. Furthermore, the slice selectivity of both sequences of pulses changes with radio frequency field inhomogeneities in a similar way. We also introduce a measure, referred to as gradient complexity, to compare the performance of both sequences of pulses. This measure indicates that for a desired level of uniformity in the excited slice, the gradient complexity for the proposed sequence of pulses is less than the original sequence. Copyright © 2015 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Li, Yiyan; Yang, Xing; Zhao, Weian
2018-01-01
Rapid bacterial identification (ID) and antibiotic susceptibility testing (AST) are in great demand due to the rise of drug-resistant bacteria. Conventional culture-based AST methods suffer from a long turnaround time. By necessity, physicians often have to treat patients empirically with antibiotics, which has led to an inappropriate use of antibiotics, an elevated mortality rate and healthcare costs, and antibiotic resistance. Recent advances in miniaturization and automation provide promising solutions for rapid bacterial ID/AST profiling, which will potentially make a significant impact in the clinical management of infectious diseases and antibiotic stewardship in the coming years. In this review, we summarize and analyze representative emerging micro- and nanotechnologies, as well as automated systems for bacterial ID/AST, including both phenotypic (e.g., microfluidic-based bacterial culture, and digital imaging of single cells) and molecular (e.g., multiplex PCR, hybridization probes, nanoparticles, synthetic biology tools, mass spectrometry, and sequencing technologies) methods. We also discuss representative point-of-care (POC) systems that integrate sample processing, fluid handling, and detection for rapid bacterial ID/AST. Finally, we highlight major remaining challenges and discuss potential future endeavors toward improving clinical outcomes with rapid bacterial ID/AST technologies. PMID:28850804
Isolation of Mycobacterium massiliense from a corneal biopsy in India.
Kulandai, Lily Therese; Lakshmipathy, Dhanurekha; Ramasubban, Gayathri; Rao, Madhavan Hajib Narahari
2014-12-01
Rapidly growing mycobacteria (RGM) are ubiquitous and are usually considered as saprophytes, and have been recovered from the environment, particularly in dust, watery soil and water distribution systems. However, Mycobacterium massiliense is a rare causative agent of ocular infection. We report a case of M. massiliense in a 44-year-old female with signs and symptoms of a corneal ulcer. We carried out PCR-based DNA sequencing targeting the hsp65 gene for the identification of M. massiliense. To confirm the identification, we also performed PCR-based RFLP targeting the hsp65 gene and PCR-based DNA sequencing targeting the internal transcribed spacer region, which showed 97% nucleotide identity with M. massiliense. To the best of our knowledge, this is the first study in India to report the detection of M. massiliense from a corneal biopsy.
Transient extracellular glutamate events in the basolateral amygdala track reward seeking actions
Wassum, KM; Tolosa, VM; Tseng, TC; Balleine, BW; Monbouquette, HG; Maidment, NT
2012-01-01
The ability to make rapid, informed decisions about whether or not to engage in a sequence of actions to earn reward is essential for survival. Modeling in rodents has demonstrated a critical role for the basolateral amygdala (BLA) in such reward-seeking actions, but the precise neurochemical underpinnings are not well understood. Taking advantage of recent advancements in biosensor technologies, we made spatially discrete near-real time extracellular recordings of the major excitatory transmitter, glutamate, in the BLA of rats performing a self-paced lever-pressing sequence task for sucrose reward. This allowed us to detect rapid transient fluctuations in extracellular BLA glutamate time-locked to action performance. These glutamate transients tended to precede lever pressing actions and were markedly increased in frequency when rats were engaged in such reward seeking actions. Based on muscimol and tetrodotoxin microinfusions these glutamate transients appeared to originate from the terminals of neurons with cell bodies in the orbital frontal cortex. Importantly, glutamate transient amplitude and frequency fluctuated with the value of the earned reward and positively predicted lever pressing rate. Such novel rapid glutamate recordings during instrumental performance identify a role for glutamatergic signaling within the BLA in instrumental reward-seeking actions. PMID:22357857
Geiling, Benjamin; Vandal, Guillaume; Posner, Ada R.; de Bruyns, Angeline; Dutchak, Kendall L.; Garnett, Samantha; Dankort, David
2013-01-01
The ability to express exogenous cDNAs while suppressing endogenous genes via RNAi represents an extremely powerful research tool with the most efficient non-transient approach being accomplished through stable viral vector integration. Unfortunately, since traditional restriction enzyme based methods for constructing such vectors are sequence dependent, their construction is often difficult and not amenable to mass production. Here we describe a non-sequence dependent Gateway recombination cloning system for the rapid production of novel lentiviral (pLEG) and retroviral (pREG) vectors. Using this system to recombine 3 or 4 modular plasmid components it is possible to generate viral vectors expressing cDNAs with or without inhibitory RNAs (shRNAmirs). In addition, we demonstrate a method to rapidly produce and triage novel shRNAmirs for use with this system. Once strong candidate shRNAmirs have been identified they may be linked together in tandem to knockdown expression of multiple targets simultaneously or to improve the knockdown of a single target. Here we demonstrate that these recombinant vectors are able to express cDNA and effectively knockdown protein expression using both cell culture and animal model systems. PMID:24146852
Antonov, Valery A; Tkachenko, Galina A; Altukhova, Viktoriya V; Savchenko, Sergey S; Zinchenko, Olga V; Viktorov, Dmitry V; Zamaraev, Valery S; Ilyukhin, Vladimir I; Alekseev, Vladimir V
2008-12-01
Burkholderia mallei and B. pseudomallei are highly pathogenic microorganisms for both humans and animals. Moreover, they are regarded as potential agents of bioterrorism. Thus, rapid and unequivocal detection and identification of these dangerous pathogens is critical. In the present study, we describe the use of an optimized protocol for the early diagnosis of experimental glanders and melioidosis and for the rapid differentiation and typing of Burkholderia strains. This experience with PCR-based identification methods indicates that single PCR targets (23S and 16S rRNA genes, 16S-23S intergenic region, fliC and type III secretion gene cluster) should be used with caution for identification of B. mallei and B. pseudomallei, and need to be used alongside molecular methods such as gene sequencing. Several molecular typing procedures have been used to identify genetically related B. pseudomallei and B. mallei isolates, including ribotyping, pulsed-field gel electrophoresis and multilocus sequence typing. However, these methods are time consuming and technically challenging for many laboratories. RAPD, variable amplicon typing scheme, Rep-PCR, BOX-PCR and multiple-locus variable-number tandem repeat analysis have been recommended by us for the rapid differentiation of B. mallei and B. pseudomallei strains.
Ait Kaci Azzou, S; Larribe, F; Froda, S
2016-10-01
In Ait Kaci Azzou et al. (2015) we introduced an Importance Sampling (IS) approach for estimating the demographic history of a sample of DNA sequences, the skywis plot. More precisely, we proposed a new nonparametric estimate of a population size that changes over time. We showed on simulated data that the skywis plot can work well in typical situations where the effective population size does not undergo very steep changes. In this paper, we introduce an iterative procedure which extends the previous method and gives good estimates under such rapid variations. In the iterative calibrated skywis plot we approximate the effective population size by a piecewise constant function, whose values are re-estimated at each step. These piecewise constant functions are used to generate the waiting times of non homogeneous Poisson processes related to a coalescent process with mutation under a variable population size model. Moreover, the present IS procedure is based on a modified version of the Stephens and Donnelly (2000) proposal distribution. Finally, we apply the iterative calibrated skywis plot method to a simulated data set from a rapidly expanding exponential model, and we show that the method based on this new IS strategy correctly reconstructs the demographic history. Copyright © 2016. Published by Elsevier Inc.
Mojo Hand, a TALEN design tool for genome editing applications.
Neff, Kevin L; Argue, David P; Ma, Alvin C; Lee, Han B; Clark, Karl J; Ekker, Stephen C
2013-01-16
Recent studies of transcription activator-like (TAL) effector domains fused to nucleases (TALENs) demonstrate enormous potential for genome editing. Effective design of TALENs requires a combination of selecting appropriate genetic features, finding pairs of binding sites based on a consensus sequence, and, in some cases, identifying endogenous restriction sites for downstream molecular genetic applications. We present the web-based program Mojo Hand for designing TAL and TALEN constructs for genome editing applications (http://www.talendesign.org). We describe the algorithm and its implementation. The features of Mojo Hand include (1) automatic download of genomic data from the National Center for Biotechnology Information, (2) analysis of any DNA sequence to reveal pairs of binding sites based on a user-defined template, (3) selection of restriction-enzyme recognition sites in the spacer between the TAL monomer binding sites, including options for the selection of restriction enzyme suppliers, and (4) output files designed for subsequent TALEN construction using the Golden Gate assembly method. Mojo Hand enables the rapid identification of TAL binding sites for use in TALEN design. The assembly of TALEN constructs is also simplified by using the TAL-site prediction program in conjunction with a spreadsheet management aid of reagent concentrations and TALEN formulation. Mojo Hand enables scientists to more rapidly deploy TALENs for genome editing applications.
Development of a PCR-based assay for rapid and reliable identification of pathogenic Fusaria.
Mishra, Prashant K; Fox, Roland T V; Culham, Alastair
2003-01-28
Identification of Fusarium species has always been difficult due to confusing phenotypic classification systems. We have developed a fluorescent-based polymerase chain reaction assay that allows for rapid and reliable identification of five toxigenic and pathogenic Fusarium species. The species include Fusarium avenaceum, F. culmorum, F. equiseti, F. oxysporum and F. sambucinum. The method is based on the PCR amplification of species-specific DNA fragments using fluorescent oligonucleotide primers, which were designed based on sequence divergence within the internal transcribed spacer region of nuclear ribosomal DNA. Besides providing an accurate, reliable, and quick diagnosis of these Fusaria, another advantage of this method is that it reduces the potential for exposure to carcinogenic chemicals, as it uses fluorescent dyes in place of ethidium bromide. Apart from its multidisciplinary importance and usefulness, it also obviates the need for gel electrophoresis.
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations
Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter
2017-01-01
The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
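The core idea of exact k-mer matching between target and non-target groups can be sketched as follows. This is a simplified illustration only: it replaces Neptune's probabilistic models with fixed presence fractions, ignores reverse complements and mismatches, and uses hypothetical function names and toy sequences rather than Neptune's actual interface.

```python
def kmers(seq, k):
    """Return the set of all exact k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k,
                    min_target_frac=1.0, max_non_target_frac=0.0):
    """Find k-mers common among targets and rare among non-targets.

    The two thresholds are assumed stand-ins for a statistical
    decision rule; they are not Neptune parameters.
    """
    target_sets = [kmers(s, k) for s in targets]
    non_sets = [kmers(s, k) for s in non_targets]
    candidates = set().union(*target_sets)
    result = set()
    for km in candidates:
        t_frac = sum(km in s for s in target_sets) / len(target_sets)
        n_frac = sum(km in s for s in non_sets) / len(non_sets) if non_sets else 0.0
        if t_frac >= min_target_frac and n_frac <= max_non_target_frac:
            result.add(km)
    return result


# Toy example with made-up sequences.
found = signature_kmers(["ACGTACGGA", "TTACGGAC"], ["ACGTACGT"], k=4)
```

Set membership makes each k-mer comparison constant time on average, which is why this style of analysis avoids the cost of multiple sequence alignment.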
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
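Tracing each read back to its source by its 5' tag can be sketched in a few lines. The sketch assumes each read begins with a dinucleotide tag that maps to exactly one specimen; the tag table, specimen names, and function name are hypothetical, and real data would also require primer trimming and handling of sequencing errors in the tag.

```python
def demultiplex(reads, tag_to_source, tag_len=2):
    """Assign reads to sources by their 5' tag and strip the tag."""
    assigned = {source: [] for source in tag_to_source.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        source = tag_to_source.get(tag)
        if source is None:
            # Tag not in the design: keep the full read for inspection.
            unassigned.append(read)
        else:
            assigned[source].append(insert)
    return assigned, unassigned


# Made-up tags and reads for illustration.
tags = {"CA": "specimen_1", "TG": "specimen_2"}
assigned, unassigned = demultiplex(["CAACGT", "TGGGCC", "GGAAAA"], tags)
```

Counting the reads per tag in `assigned` is also a simple way to look for the kind of tag-dependent representation bias the abstract reports.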
Barbau-Piednoir, Elodie; De Keersmaecker, Sigrid C J; Delvoye, Maud; Gau, Céline; Philipp, Patrick; Roosens, Nancy H
2015-11-11
Recently, the presence of an unauthorized genetically modified (GM) Bacillus subtilis bacterium overproducing vitamin B2 in a feed additive was notified by the Rapid Alert System for Food and Feed (RASFF). This has demonstrated that contamination by a GM micro-organism (GMM) may occur in feed additives and has, for the first time, confronted the enforcement laboratories with this type of RASFF. As no sequence information of this GMM nor any specific detection or identification method was available, Next Generation Sequencing (NGS) was used to generate sequence information. However, NGS data analysis often requires appropriate tools, involving bioinformatics expertise which is not always present in the average enforcement laboratory. This hampers the use of this technology to rapidly obtain critical sequence information in order to be able to develop a specific qPCR detection method. Data generated by NGS were exploited using a simple BLAST approach. A TaqMan® qPCR method was developed and tested on isolated bacterial strains and on the feed additive directly. In this study, a very simple strategy based on the common BLAST tools, which can be used by any enforcement lab without profound bioinformatics expertise, was successfully used to analyse the B. subtilis data generated by NGS. The results were used to design and assess a new TaqMan® qPCR method, specifically detecting this GM vitamin B2 overproducing bacterium. The method complies with EU critical performance parameters for specificity, sensitivity, PCR efficiency and repeatability. The VitB2-UGM method also could detect the B. subtilis strain in genomic DNA extracted from the feed additive, without a prior culturing step. The proposed method provides a crucial tool for specifically and rapidly identifying this unauthorized GM bacterium in food and feed additives by enforcement laboratories. Moreover, this work can be seen as a case study to substantiate how the use of NGS data can offer an added value to easily gain access to sequence information needed to develop qPCR methods to detect unknown and unauthorized GMO in food and feed.
Indel Group in Genomes (IGG) Molecular Genetic Markers
Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger
2016-01-01
Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831
Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.
Hawkins, Steve F C; Guest, Paul C
2018-01-01
The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of a NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are available currently and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.
He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang
2015-01-01
Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
SeqHBase: a big data toolset for family based sequencing data analysis.
He, Min; Person, Thomas N; Hebbring, Scott J; Heinzen, Ethan; Ye, Zhan; Schrodi, Steven J; McPherson, Elizabeth W; Lin, Simon M; Peissig, Peggy L; Brilliant, Murray H; O'Rawe, Jason; Robison, Reid J; Lyon, Gholson J; Wang, Kai
2015-04-01
Whole-genome sequencing (WGS) and whole-exome sequencing (WES) technologies are increasingly used to identify disease-contributing mutations in human genomic studies. It can be a significant challenge to process such data, especially when a large family or cohort is sequenced. Our objective was to develop a big data toolset to efficiently manipulate genome-wide variants, functional annotations and coverage, together with conducting family based sequencing data analysis. Hadoop is a framework for reliable, scalable, distributed processing of large data sets using MapReduce programming models. Based on Hadoop and HBase, we developed SeqHBase, a big data-based toolset for analysing family based sequencing data to detect de novo, inherited homozygous, or compound heterozygous mutations that may contribute to disease manifestations. SeqHBase takes as input BAM files (for coverage at every site), variant call format (VCF) files (for variant calls) and functional annotations (for variant prioritisation). We applied SeqHBase to a 5-member nuclear family and a 10-member 3-generation family with WGS data, as well as a 4-member nuclear family with WES data. Analysis times were almost linearly scalable with number of data nodes. With 20 data nodes, SeqHBase took about 5 secs to analyse WES familial data and approximately 1 min to analyse WGS familial data. These results demonstrate SeqHBase's high efficiency and scalability, which is necessary as WGS and WES are rapidly becoming standard methods to study the genetics of familial disorders. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
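The kind of family-based filtering described above can be illustrated for a single child-father-mother trio. This is a minimal sketch, assuming biallelic genotypes written in VCF style ("0/0", "0/1", "1/1"); it does not use Hadoop or HBase, omits compound heterozygous detection (which needs gene-level grouping), and all function names and variant identifiers are hypothetical rather than part of SeqHBase.

```python
def normalize(gt):
    """Treat phased and unphased genotypes alike, e.g. '1|0' -> '0/1'."""
    alleles = gt.replace("|", "/").split("/")
    return "/".join(sorted(alleles))


def classify_trio_genotype(child, father, mother):
    """Label one variant by its inheritance pattern within a trio."""
    c, f, m = normalize(child), normalize(father), normalize(mother)
    if c == "0/1" and f == "0/0" and m == "0/0":
        return "de_novo"
    if c == "1/1" and f == "0/1" and m == "0/1":
        return "inherited_homozygous"
    return "other"


def scan_trio(variants):
    """Keep only variants with a de novo or inherited homozygous pattern.

    `variants` maps a variant id to (child, father, mother) genotypes.
    """
    hits = {}
    for var_id, (child, father, mother) in variants.items():
        label = classify_trio_genotype(child, father, mother)
        if label != "other":
            hits[var_id] = label
    return hits


# Made-up variant identifiers and genotypes.
example = {
    "chr1:1000:A>G": ("0/1", "0/0", "0/0"),
    "chr2:2000:C>T": ("1/1", "1|0", "0/1"),
    "chr3:3000:G>A": ("0/1", "0/1", "0/0"),
}
hits = scan_trio(example)
```

In practice such calls would also be checked against per-site coverage in each family member, which is why the toolset described above takes BAM files as input alongside VCF files.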
An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.
Tørresen, Ole K; Star, Bastiaan; Jentoft, Sissel; Reinar, William B; Grove, Harald; Miller, Jason R; Walenz, Brian P; Knight, James; Ekholm, Jenny M; Peluso, Paul; Edvardsen, Rolf B; Tooming-Klunderud, Ave; Skage, Morten; Lien, Sigbjørn; Jakobsen, Kjetill S; Nederbragt, Alexander J
2017-01-18
The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Pathogen profiling for disease management and surveillance.
Sintchenko, Vitali; Iredell, Jonathan R; Gilbert, Gwendolyn L
2007-06-01
The usefulness of rapid pathogen genotyping is widely recognized, but its effective interpretation and application requires integration into clinical and public health decision-making. How can pathogen genotyping data best be translated to inform disease management and surveillance? Pathogen profiling integrates microbial genomics data into communicable disease control by consolidating phenotypic identity-based methods with DNA microarrays, proteomics, metabolomics and sequence-based typing. Sharing data on pathogen profiles should facilitate our understanding of transmission patterns and the dynamics of epidemics.
Prophylaxis against the systemic hypotension induced by propofol during rapid-sequence intubation.
el-Beheiry, H; Kim, J; Milne, B; Seegobin, R
1995-10-01
The objective of this study was to determine the effectiveness of two prophylactic approaches against the anticipated hypotension induced by propofol during rapid-sequence intubation. Thirty-six male or female nonpremedicated ASA class I-II patients aged 21-60 yr undergoing elective outpatient surgery were included in the study. Patients were randomly allocated to receive pre-induction ephedrine sulphate (70 micrograms x kg(-1) iv), pre-induction volume loading (12 ml x kg(-1) Ringer's lactate) or no treatment. Rapid-sequence intubation with cricoid pressure was then performed with propofol (2.5 mg x kg(-1)) and succinylcholine (1.5 mg x kg(-1)). The lungs were subsequently ventilated with 0.25-0.5% isoflurane in a 2:1 N2O/O2 mixture. Vecuronium was given once neuromuscular function had recovered from the succinylcholine. Heart rate and systemic arterial blood pressure were measured non-invasively before induction, after propofol administration and every minute for ten minutes after intubation. Pre-induction volume loading prevented the hypotension observed before surgical stimulation in control and ephedrine groups. Moreover, pre-induction volume loading was not associated with increases in heart rate after intubation as was ephedrine administration. The intubating conditions were excellent to satisfactory in most patients and the overall incidence of adverse events during intubation was mainly due to pain during injection of propofol. The present study showed that preoperative volume loading is more efficacious than pre-induction administration of ephedrine sulphate in maintaining haemodynamic stability during rapid-sequence induction with propofol and succinylcholine. In addition, propofol in combination with succinylcholine provides excellent conditions for rapid-sequence intubation.
Neuwald, Andrew F
2009-08-01
The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status to the order of a single nucleotide. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files as well as adding extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. As a result CpG PatternFinder was developed for this purpose. The main functions of the CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
Correlation approach to identify coding regions in DNA sequences
NASA Technical Reports Server (NTRS)
Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1994-01-01
Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
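The abstract does not spell out the algorithm itself. One common way to quantify the correlation property it refers to is a "DNA walk" whose fluctuation F(l) is measured over windows of length l. The sketch below is only an illustration under assumptions: the purine/pyrimidine step rule, the function names and the plain least-squares fit are all choices made here, not details taken from the paper. An exponent near 0.5 indicates no long-range correlation, while larger values suggest long-range correlation.

```python
import math

# Assumed step rule for illustration: pyrimidines step up, purines step down.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Cumulative sum of +1/-1 steps along the sequence (A, C, G, T only)."""
    walk = [0]
    for base in seq.upper():
        walk.append(walk[-1] + STEP[base])
    return walk


def fluctuation(walk, length):
    """Standard deviation of the walk displacement over windows of `length`."""
    diffs = [walk[i + length] - walk[i] for i in range(len(walk) - length)]
    mean = sum(diffs) / len(diffs)
    variance = sum(d * d for d in diffs) / len(diffs) - mean * mean
    return math.sqrt(max(variance, 0.0))


def scaling_exponent(seq, lengths):
    """Least-squares slope of log F(l) against log l."""
    walk = dna_walk(seq)
    points = []
    for length in lengths:
        f = fluctuation(walk, length)
        if f > 0:  # log(0) is undefined, so flat windows are skipped
            points.append((math.log(length), math.log(f)))
    if len(points) < 2:
        raise ValueError("need at least two window lengths with non-zero fluctuation")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den
```

Scanning a long sequence with a sliding window and flagging windows whose exponent stays close to 0.5 would be one way to turn this measure into candidate coding regions, though the thresholds would need tuning on real data.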
Nanopore sequencing technology: a new route for the fast detection of unauthorized GMO.
Fraiture, Marie-Alice; Saltykova, Assia; Hoffman, Stefan; Winand, Raf; Deforce, Dieter; Vanneste, Kevin; De Keersmaecker, Sigrid C J; Roosens, Nancy H C
2018-05-21
In order to strengthen the current genetically modified organism (GMO) detection system for unauthorized GMO, we have recently developed a new workflow based on DNA walking to amplify unknown sequences surrounding a known DNA region. This DNA walking is performed on transgenic elements, commonly found in GMO, that were earlier detected by real-time PCR (qPCR) screening. Previously, we have demonstrated the ability of this approach to detect unauthorized GMO via the identification of unique transgene flanking regions and the unnatural associations of elements from the transgenic cassette. In the present study, we investigate the feasibility of integrating the described workflow with MinION Next-Generation-Sequencing (NGS). The MinION sequencing platform can provide long read-lengths and deal with heterogenic DNA libraries, allowing for rapid and efficient delivery of sequences of interest. In addition, the ability of this NGS platform to characterize unauthorized and unknown GMO without any a priori knowledge has been assessed.
PredSTP: a highly accurate SVM based model to predict sequential cystine stabilized peptides.
Islam, S M Ashiqul; Sajed, Tanvir; Kearney, Christopher Michel; Baker, Erich J
2015-07-05
Numerous organisms have evolved a wide range of toxic peptides for self-defense and predation. Their effective interstitial and macro-environmental use requires energetic and structural stability. One successful group of these peptides includes a tri-disulfide domain arrangement that offers toxicity and high stability. Sequential tri-disulfide connectivity variants create highly compact disulfide folds capable of withstanding a variety of environmental stresses. Their combination of toxicity and stability makes these peptides remarkably valuable for their potential as bio-insecticides, antimicrobial peptides and peptide drug candidates. However, the wide sequence variation, sources and modalities of group members impose serious limitations on our ability to rapidly identify potential members. As a result, there is a need for automated high-throughput member classification approaches that leverage their demonstrated tertiary and functional homology. We developed an SVM-based model to predict sequential tri-disulfide peptide (STP) toxins from peptide sequences. One optimized model, called PredSTP, predicted STPs from the training set with sensitivity, specificity, precision, accuracy and a Matthews correlation coefficient of 94.86%, 94.11%, 84.31%, 94.30% and 0.86, respectively, using 200-fold cross validation. The same model outperforms existing prediction approaches in three independent out-of-sample test sets derived from PDB. PredSTP can accurately identify a wide range of cystine stabilized peptide toxins directly from sequences in a species-agnostic fashion. The ability to rapidly filter sequences for potential bioactive peptides can greatly compress the time between peptide identification and testing structural and functional properties for possible antimicrobial and insecticidal candidates. A web interface is freely available to predict STP toxins from http://crick.ecs.baylor.edu/.
USDA-ARS's Scientific Manuscript database
A recent widespread outbreak of Escherichia coli O104:H4 in Germany demonstrates the dynamic nature of emerging and re-emerging food-borne pathogens, particularly STECs and related pathogenic E. coli. Rapid genomic sequencing and public availability of these data from the German outbreak strain allo...
ERIC Educational Resources Information Center
Conlon, Elizabeth G.; Wright, Craig M.; Norris, Karla; Chekaluk, Eugene
2011-01-01
The experiments conducted aimed to investigate whether reduced accuracy when counting stimuli presented in rapid temporal sequence in adults with dyslexia could be explained by a sensory processing deficit, a general slowing in processing speed or difficulties shifting attention between stimuli. To achieve these aims, the influence of the…
Nanopore sequencing in microgravity
McIntyre, Alexa B R; Rizzardi, Lindsay; Yu, Angela M; Alexander, Noah; Rosen, Gail L; Botkin, Douglas J; Stahl, Sarah E; John, Kristen K; Castro-Wallace, Sarah L; McGrath, Ken; Burton, Aaron S; Feinberg, Andrew P; Mason, Christopher E
2016-01-01
Rapid DNA sequencing and analysis has been a long-sought goal in remote research and point-of-care medicine. In microgravity, DNA sequencing can facilitate novel astrobiological research and close monitoring of crew health, but spaceflight places stringent restrictions on the mass and volume of instruments, crew operation time, and instrument functionality. The recent emergence of portable, nanopore-based tools with streamlined sample preparation protocols finally enables DNA sequencing on missions in microgravity. As a first step toward sequencing in space and aboard the International Space Station (ISS), we tested the Oxford Nanopore Technologies MinION during a parabolic flight to understand the effects of variable gravity on the instrument and data. In a successful proof-of-principle experiment, we found that the instrument generated DNA reads over the course of the flight, including the first ever sequenced in microgravity, and additional reads measured after the flight concluded its parabolas. Here we detail modifications to the sample-loading procedures to facilitate nanopore sequencing aboard the ISS and in other microgravity environments. We also evaluate existing analysis methods and outline two new approaches, the first based on a wave-fingerprint method and the second on entropy signal mapping. Computationally light analysis methods offer the potential for in situ species identification, but are limited by the error profiles (stays, skips, and mismatches) of older nanopore data. Higher accuracies attainable with modified sample processing methods and the latest version of flow cells will further enable the use of nanopore sequencers for diagnostics and research in space. PMID:28725742
BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification.
Ito, Eric Augusto; Katahira, Isaque; Vicente, Fábio Fernandes da Rocha; Pereira, Luiz Filipe Protasio; Lopes, Fabrício Martins
2018-06-05
With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, was rapidly produced at relatively low costs. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on the feature extraction from complex network measurements. The method initially transforms the sequences and represents them as complex networks. Then it extracts topological measures and constructs a feature vector that is used to classify the sequences. The method was evaluated in the classification of coding and non-coding RNAs of 13 species and compared to the CNCI, PLEK and CPC2 methods. BASiNET outperformed all compared methods in all adopted organisms and datasets. BASiNET classified sequences in all organisms with high accuracy and low standard deviation, showing that the method is robust and non-biased by the organism. The proposed methodology is implemented in open source in R language and freely available for download at https://cran.r-project.org/package=BASiNET.
RECOVIR Software for Identifying Viruses
NASA Technical Reports Server (NTRS)
Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui
2013-01-01
Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
Clément, Nathalie; Velu, Thierry; Brandenburger, Annick
2002-09-01
The production of currently available vectors derived from autonomous parvoviruses requires the expression of capsid proteins in trans, from helper sequences. Cotransfection of a helper plasmid always generates significant amounts of replication-competent virus (RCV) that can be reduced by the integration of helper sequences into a packaging cell line. Although stocks of minute virus of mice (MVM)-based vectors with no detectable RCV could be produced by transfection into packaging cells, RCVs appear after one or two rounds of replication, precluding further amplification of the vector stock. Indeed, once RCVs become detectable, they are efficiently amplified and rapidly take over the culture. Theoretically RCV-free vector stocks could be produced if all homology between vector and helper DNA is eliminated, thus preventing homologous recombination. We constructed new vectors based on the structure of spontaneously occurring defective particles of MVM. Based on published observations related to the size of vectors and the sequence of the viral origin of replication, these vectors were modified by the insertion of foreign DNA sequences downstream of the transgene and by the introduction of a consensus NS-1 nick site near the origin of replication to optimize their production. In one of the vectors the inserted fragment of mouse genomic DNA had a synergistic effect with the modified origin of replication in increasing vector production.
Falade, Mofolusho O.; Opene, Anthony J.; Benson, Otarigho
2016-01-01
DNA barcoding has been adopted as a gold standard rapid, precise and unifying identification system for animal species and provides a database of genetic sequences that can be used as a tool for universal species identification. In this study, we employed mitochondrial genes 16S rRNA (16S) and cytochrome oxidase subunit I (COI) for the identification of some Nigerian freshwater catfish and Tilapia species. Approximately 655 bp were amplified from the 5′ region of the mitochondrial cytochrome C oxidase subunit I (COI) gene whereas 570 bp were amplified for the 16S rRNA gene. Nucleotide divergences among sequences were estimated based on Kimura 2-parameter distances and the genetic relationships were assessed by constructing phylogenetic trees using the neighbour-joining (NJ) and maximum likelihood (ML) methods. Analyses of consensus barcode sequences for each species, and alignment of individual sequences from within a given species revealed highly consistent barcodes (99% similarity on average), which could be compared with deposited sequences in public databases. The nucleotide distance between species belonging to different genera based on COI ranged from 0.17% between Sarotherodon melanotheron and Coptodon zillii to 0.49% between Clarias gariepinus and C. zillii, indicating that S. melanotheron and C. zillii are closely related. Based on the data obtained, the utility of COI gene was confirmed in accurate identification of three fish species from Southwest Nigeria. PMID:27990256
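The Kimura 2-parameter distance mentioned in this abstract corrects the observed difference between two aligned sequences separately for transitions and transversions: d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional sites. A minimal sketch follows. It assumes the two sequences are already aligned to equal length, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two pre-aligned DNA sequences.

    Columns containing anything other than A, C, G or T (gaps, N) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # The log argument becomes non-positive for saturated sequences,
    # in which case math.log raises ValueError.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related barcodes the corrected distance stays very close to the raw proportion of differing sites, so the correction matters mainly for comparisons between more divergent taxa.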
The U.S. EPA is currently evaluating rapid, real-time quantitative PCR (qPCR) methods for determining recreational water quality based on measurements of fecal indicator bacteria DNA sequences. In order to potentially use qPCR for other Clean Water Act needs, such as updating cri...
Evidence-Based Clinical Recommendations for the Administration of the Sequential Motion Rates Task
ERIC Educational Resources Information Center
Icht, Michal; Ben-David, Boaz M.
2018-01-01
The sequential motion rates (SMR) task, that involves rapid and accurate repetitions of a syllable sequence, /pataka/, is a commonly used evaluation tool for oro-motor abilities. Although the SMR is a well-known tool, some aspects of its administration protocol are unspecified. We address the following factors and their role in the SMR protocol:…
Functional brain activation differences in stuttering identified with a rapid fMRI sequence
Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.
2011-01-01
The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech motor and auditory brain activity in children who stutter closer to the age at which recovery from stuttering is documented. Rapid sequences may be preferred for individuals or populations who do not tolerate long scanning sessions. In this report, we document the application of a picture naming and phoneme monitoring task in three-minute fMRI sequences with adults who stutter (AWS). If relevant brain differences are found in AWS with these approaches that conform to previous reports, then these approaches can be extended to younger populations. Pairwise contrasts of brain BOLD activity between AWS and normally fluent adults indicated that the AWS showed higher BOLD activity in the right inferior frontal gyrus (IFG), right temporal lobe and sensorimotor cortices during picture naming and higher activity in the right IFG during phoneme monitoring. The right lateralized pattern of BOLD activity together with higher activity in sensorimotor cortices is consistent with previous reports, which indicates rapid fMRI sequences can be considered for investigating stuttering in younger participants. PMID:22133409
Nasri, Tuba; Hedayati, Mohammad Taghi; Abastabar, Mahdi; Pasqualotto, Alessandro C; Armaki, Mojtaba Taghizadeh; Hoseinnejad, Akbar; Nabili, Mojtaba
2015-10-01
Aspergillus species are important agents of life-threatening infections in immunosuppressed patients. Proper speciation in the Aspergilli has been justified based on varied fungal virulence, clinical presentations, and antifungal resistance. Accurate identification of Aspergillus species usually relies on fungal DNA sequencing but this requires expensive equipment that is not available in most clinical laboratories. We developed and validated a discriminative low-cost PCR-based test to discriminate Aspergillus isolates at the species level. The Beta tubulin gene of various reference strains of Aspergillus species was amplified using the universal fungal primers Bt2a and Bt2b. The PCR products were subjected to digestion with a single restriction enzyme AlwI. All Aspergillus isolates were subjected to DNA sequencing for final species characterization. The PCR-RFLP test generated unique patterns for six clinically important Aspergillus species, including Aspergillus flavus, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus terreus, Aspergillus clavatus and Aspergillus nidulans. The one-enzyme PCR-RFLP on Beta tubulin gene designed in this study is a low-cost tool for the reliable and rapid differentiation of the clinically important Aspergillus species.
USDA-ARS's Scientific Manuscript database
Current technologies with next generation sequencing have revolutionized metagenomics analysis of clinical samples. To achieve the non-selective amplification and recovery of low abundance genetic sequences, a simplified Sequence-Independent, Single-Primer Amplification (SISPA) technique in combinat...
Visual Perceptual Echo Reflects Learning of Regularities in Rapid Luminance Sequences.
Chang, Acer Y-C; Schwartzman, David J; VanRullen, Rufin; Kanai, Ryota; Seth, Anil K
2017-08-30
A novel neural signature of active visual processing has recently been described in the form of the "perceptual echo", in which the cross-correlation between a sequence of randomly fluctuating luminance values and occipital electrophysiological signals exhibits a long-lasting periodic (∼100 ms cycle) reverberation of the input stimulus (VanRullen and Macdonald, 2012). As yet, however, the mechanisms underlying the perceptual echo and its function remain unknown. Reasoning that natural visual signals often contain temporally predictable, though nonperiodic features, we hypothesized that the perceptual echo may reflect a periodic process associated with regularity learning. To test this hypothesis, we presented subjects with successive repetitions of a rapid nonperiodic luminance sequence, and examined the effects on the perceptual echo, finding that echo amplitude linearly increased with the number of presentations of a given luminance sequence. These data suggest that the perceptual echo reflects a neural signature of regularity learning. Furthermore, when a set of repeated sequences was followed by a sequence with inverted luminance polarities, the echo amplitude decreased to the same level evoked by a novel stimulus sequence. Crucially, when the original stimulus sequence was re-presented, the echo amplitude returned to a level consistent with the number of presentations of this sequence, indicating that the visual system retained sequence-specific information, for many seconds, even in the presence of intervening visual input. Altogether, our results reveal a previously undiscovered regularity learning mechanism within the human visual system, reflected by the perceptual echo. SIGNIFICANCE STATEMENT How the brain encodes and learns fast-changing but nonperiodic visual input remains unknown, even though such visual input characterizes natural scenes. We investigated whether the phenomenon of "perceptual echo" might index such learning.
The perceptual echo is a long-lasting reverberation between a rapidly changing visual input and evoked neural activity, apparent in cross-correlations between occipital EEG and stimulus sequences, peaking in the alpha (∼10 Hz) range. We indeed found that perceptual echo is enhanced by repeatedly presenting the same visual sequence, indicating that the human visual system can rapidly and automatically learn regularities embedded within fast-changing dynamic sequences. These results point to a previously undiscovered regularity learning mechanism, operating at a rate defined by the alpha frequency.
Verghese, Bindhu; Lok, Mei; Wen, Jia; Alessandria, Valentina; Chen, Yi; Kathariou, Sophia; Knabel, Stephen
2011-01-01
Different strains of Listeria monocytogenes are well known to persist in individual food processing plants and to contaminate foods for many years; however, the specific genotypic and phenotypic mechanisms responsible for persistence of these unique strains remain largely unknown. Based on sequences in comK prophage junction fragments, different strains of epidemic clones (ECs), which included ECII, ECIII, and ECV, were identified and shown to be specific to individual meat and poultry processing plants. The comK prophage-containing strains showed significantly higher cell densities after incubation at 30°C for 48 h on meat and poultry food-conditioning films than did strains lacking the comK prophage (P < 0.05). Overall, the type of strain, the type of conditioning film, and the interaction between the two were all highly significant (P < 0.001). Recombination analysis indicated that the comK prophage junction fragments in these strains had evolved due to extensive recombination. Based on the results of the present study, we propose a novel model in which the concept of defective comK prophage was replaced with the rapid adaptation island (RAI). Genes within the RAI were recharacterized as “adaptons,” as these genes may allow L. monocytogenes to rapidly adapt to different food processing facilities and foods. If confirmed, the model presented would help explain Listeria's rapid niche adaptation, biofilm formation, persistence, and subsequent transmission to foods. Also, comK prophage junction fragment sequences may permit accurate tracking of persistent strains back to and within individual food processing operations and thus allow the design of more effective intervention strategies to reduce contamination and enhance food safety. PMID:21441318
Verghese, Bindhu; Lok, Mei; Wen, Jia; Alessandria, Valentina; Chen, Yi; Kathariou, Sophia; Knabel, Stephen
2011-05-01
Different strains of Listeria monocytogenes are well known to persist in individual food processing plants and to contaminate foods for many years; however, the specific genotypic and phenotypic mechanisms responsible for persistence of these unique strains remain largely unknown. Based on sequences in comK prophage junction fragments, different strains of epidemic clones (ECs), which included ECII, ECIII, and ECV, were identified and shown to be specific to individual meat and poultry processing plants. The comK prophage-containing strains showed significantly higher cell densities after incubation at 30°C for 48 h on meat and poultry food-conditioning films than did strains lacking the comK prophage (P < 0.05). Overall, the type of strain, the type of conditioning film, and the interaction between the two were all highly significant (P < 0.001). Recombination analysis indicated that the comK prophage junction fragments in these strains had evolved due to extensive recombination. Based on the results of the present study, we propose a novel model in which the concept of defective comK prophage was replaced with the rapid adaptation island (RAI). Genes within the RAI were recharacterized as "adaptons," as these genes may allow L. monocytogenes to rapidly adapt to different food processing facilities and foods. If confirmed, the model presented would help explain Listeria's rapid niche adaptation, biofilm formation, persistence, and subsequent transmission to foods. Also, comK prophage junction fragment sequences may permit accurate tracking of persistent strains back to and within individual food processing operations and thus allow the design of more effective intervention strategies to reduce contamination and enhance food safety.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses; hence, those algorithms do not offer appropriate solutions for the rapid but precise analyses needed for DNA barcoding, and they are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
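The query-versus-reference comparison described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not TaxI's actual code: the function names, the Needleman-Wunsch scoring values, and the use of an uncorrected p-distance are all assumptions made for the example. Each reference is aligned to the query separately, and gapped columns are ignored when counting differences, which is why indel-rich sequences do not require a multiple alignment.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment (scoring values are illustrative assumptions)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(a, b):
    """Uncorrected divergence over alignment columns where neither sequence has a gap."""
    x, y = global_align(a, b)
    pairs = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not pairs:
        return 1.0
    return sum(p != q for p, q in pairs) / len(pairs)


def rank_references(query, references):
    """Return (name, divergence) pairs sorted from most to least similar to the query."""
    scored = ((name, p_distance(query, seq)) for name, seq in references.items())
    return sorted(scored, key=lambda item: item[1])
```

With made-up sequences, `rank_references("ACGTACGT", {"sp_B": "ACGTTCGT", "sp_A": "ACGTACGT"})` places `sp_A` first. A corrected distance (for example under a substitution model) could replace `p_distance` without changing the overall structure.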
2012-01-01
Background: Although modern sequencing technologies permit the ready detection of numerous DNA sequence variants in any organism, converting such information to PCR-based genetic markers is hampered by a lack of simple, scalable tools. Onion is an example of an under-researched crop with a complex, heterozygous genome where genome-based research has previously been hindered by limited sequence resources and genetic markers. Results: We report the development of generic tools for large-scale web-based PCR-based marker design in the Galaxy bioinformatics framework, and their application for development of next-generation genetics resources in a wide cross of bulb onion (Allium cepa L.). Transcriptome sequence resources were developed for the homozygous doubled-haploid bulb onion line ‘CUDH2150’ and the genetically distant Indian landrace ‘Nasik Red’, using 454™ sequencing of normalised cDNA libraries of leaf and shoot. Read mapping of ‘Nasik Red’ reads onto ‘CUDH2150’ assemblies revealed 16836 indel and SNP polymorphisms that were mined for portable PCR-based marker development. Tools for detection of restriction polymorphisms and primer set design were developed in BioPython and adapted for use in the Galaxy workflow environment, enabling large-scale and targeted assay design. Using PCR-based markers designed with these tools, a framework genetic linkage map of over 800 cM spanning all chromosomes was developed in a subset of 93 F2 progeny from a very large F2 family developed from the ‘Nasik Red’ x ‘CUDH2150’ inter-cross. The utility of tools and genetic resources developed was tested by designing markers to transcription factor-like polymorphic sequences. Bin mapping these markers using a subset of 10 progeny confirmed the ability to place markers within 10 cM bins, enabling increased efficiency in marker assignment and targeted map refinement. The major genetic loci conditioning red bulb colour (R) and fructan content (Frc) were located on this map by QTL analysis.
Conclusions: The generic tools developed for the Galaxy environment enable rapid development of sets of PCR assays targeting sequence variants identified from Illumina and 454 sequence data. They enable non-specialist users to validate and exploit large volumes of next-generation sequence data using basic equipment. PMID:23157543
Baldwin, Samantha; Revanna, Roopashree; Thomson, Susan; Pither-Joyce, Meeghan; Wright, Kathryn; Crowhurst, Ross; Fiers, Mark; Chen, Leshi; Macknight, Richard; McCallum, John A
2012-11-19
Although modern sequencing technologies permit the ready detection of numerous DNA sequence variants in any organism, converting such information to PCR-based genetic markers is hampered by a lack of simple, scalable tools. Onion is an example of an under-researched crop with a complex, heterozygous genome where genome-based research has previously been hindered by limited sequence resources and genetic markers. We report the development of generic tools for large-scale web-based PCR-based marker design in the Galaxy bioinformatics framework, and their application for development of next-generation genetics resources in a wide cross of bulb onion (Allium cepa L.). Transcriptome sequence resources were developed for the homozygous doubled-haploid bulb onion line 'CUDH2150' and the genetically distant Indian landrace 'Nasik Red', using 454™ sequencing of normalised cDNA libraries of leaf and shoot. Read mapping of 'Nasik Red' reads onto 'CUDH2150' assemblies revealed 16836 indel and SNP polymorphisms that were mined for portable PCR-based marker development. Tools for detection of restriction polymorphisms and primer set design were developed in BioPython and adapted for use in the Galaxy workflow environment, enabling large-scale and targeted assay design. Using PCR-based markers designed with these tools, a framework genetic linkage map of over 800 cM spanning all chromosomes was developed in a subset of 93 F2 progeny from a very large F2 family developed from the 'Nasik Red' x 'CUDH2150' inter-cross. The utility of tools and genetic resources developed was tested by designing markers to transcription factor-like polymorphic sequences. Bin mapping these markers using a subset of 10 progeny confirmed the ability to place markers within 10 cM bins, enabling increased efficiency in marker assignment and targeted map refinement. The major genetic loci conditioning red bulb colour (R) and fructan content (Frc) were located on this map by QTL analysis.
The generic tools developed for the Galaxy environment enable rapid development of sets of PCR assays targeting sequence variants identified from Illumina and 454 sequence data. They enable non-specialist users to validate and exploit large volumes of next-generation sequence data using basic equipment.
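The idea of detecting restriction polymorphisms between two alleles, mentioned in this abstract, can be illustrated with a short standard-library sketch. It is not the authors' Galaxy or BioPython tooling: the function names and the toy allele sequences are assumptions made for the example. The three recognition sequences (EcoRI GAATTC, HaeIII GGCC, HindIII AAGCTT) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of three common restriction enzymes (all palindromic).
ENZYME_SITES = {"EcoRI": "GAATTC", "HaeIII": "GGCC", "HindIII": "AAGCTT"}


def site_positions(seq, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def polymorphic_enzymes(allele_a, allele_b, enzymes=ENZYME_SITES):
    """Map enzyme name -> (site count in allele A, site count in allele B) where the counts differ."""
    result = {}
    for name, site in enzymes.items():
        count_a = len(site_positions(allele_a, site))
        count_b = len(site_positions(allele_b, site))
        if count_a != count_b:
            result[name] = (count_a, count_b)
    return result


# Hypothetical alleles differing by a single A/G SNP inside an EcoRI site.
allele_a = "TTGACGAATTCAGGTC"
allele_b = "TTGACGAGTTCAGGTC"
candidates = polymorphic_enzymes(allele_a, allele_b)
```

An enzyme reported here would cut only one allele, so digesting a PCR product spanning the SNP should give different fragment patterns for the two parents. A production tool would also check that the site is unique within the amplicon and design flanking primers.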
Hernández, Yözen; Bernstein, Rocky; Pagan, Pedro; Vargas, Levy; McCaig, William; Ramrattan, Girish; Akther, Saymon; Larracuente, Amanda; Di, Lia; Vieira, Filipe G; Qiu, Wei-Gang
2018-03-02
Automated bioinformatics workflows are more robust, easier to maintain, and yield more reproducible results when built with command-line utilities than with custom-coded scripts. Command-line utilities further benefit bioinformatics developers by relieving them of the need to learn the use of, or to interact directly with, biological software libraries. There is, however, a lack of command-line utilities that leverage popular Open Source biological software toolkits such as BioPerl (http://bioperl.org) to make many of the well-designed, robust, and routinely used biological classes available for a wider base of end users. Designed as standard utilities for UNIX-family operating systems, BpWrapper makes functionality of some of the most popular BioPerl modules readily accessible on the command line to novice as well as to experienced bioinformatics practitioners. The initial release of BpWrapper includes four utilities with concise command-line user interfaces, bioseq, bioaln, biotree, and biopop, specialized for manipulation of molecular sequences, sequence alignments, phylogenetic trees, and DNA polymorphisms, respectively. Over a hundred methods are currently available as command-line options and new methods are easily incorporated. Performance of BpWrapper utilities lags that of precompiled utilities while being equivalent to that of other utilities based on BioPerl. BpWrapper has been tested on BioPerl Release 1.6, Perl versions 5.10.1 to 5.25.10, and operating systems including Apple macOS, Microsoft Windows, and GNU/Linux. Release code is available from the Comprehensive Perl Archive Network (CPAN) at https://metacpan.org/pod/Bio::BPWrapper. Source code is available on GitHub at https://github.com/bioperl/p5-bpwrapper. BpWrapper improves on existing sequence utilities by following the design principles of Unix text utilities, such as a concise user interface, extensive command-line options, and standard input/output for serialized operations.
Further, dozens of novel methods for manipulation of sequences, alignments, and phylogenetic trees, unavailable in existing utilities (e.g., EMBOSS, Newick Utilities, and FAST), are provided. Bioinformaticians should find BpWrapper useful for rapid prototyping of workflows on the command-line without creating custom scripts for comparative genomics and other bioinformatics applications.
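The Unix-filter design principle this abstract emphasises (read a stream, transform it, write a stream) can be illustrated with a small sketch. This is not BpWrapper code and does not reproduce any of its options; the function names are hypothetical, and reverse-complementing FASTA records is chosen only as a simple example transformation.

```python
import io

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def _emit(header, chunks, dst):
    """Write one FASTA record with its sequence reverse-complemented."""
    if header is not None:
        sequence = "".join(chunks)
        dst.write(header + "\n")
        dst.write(sequence.translate(COMPLEMENT)[::-1] + "\n")


def reverse_complement_fasta(src, dst):
    """Stream FASTA records from src to dst, reverse-complementing each sequence."""
    header, chunks = None, []
    for line in src:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            _emit(header, chunks, dst)   # finish the previous record, if any
            header, chunks = line, []
        else:
            chunks.append(line)
    _emit(header, chunks, dst)           # finish the last record


# Demonstration on in-memory streams; sys.stdin and sys.stdout work the same way.
demo_in = io.StringIO(">seq1\nAACG\nT\n>seq2\nGGCA\n")
demo_out = io.StringIO()
reverse_complement_fasta(demo_in, demo_out)
```

Because the function only depends on file-like objects, passing `sys.stdin` and `sys.stdout` turns it into a filter that can be chained with other tools in a shell pipeline, which is the serialized style of operation the abstract describes.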
Li, De-Zhu
2011-01-01
Background: Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies. Methodology/Principal Findings: Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae. Conclusions/Significance: The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily.
The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly. PMID:21655229
[Personalized urooncology based on molecular uropathology: what is the future?].
Dahl, E; Haller, F
2013-07-01
Targeted therapies and biomarker validation are key drivers in the advancement of personalized oncology, which is a growing topic in all clinical areas. Compared with other specialties, such as pulmonology and gynecology, development in urology has so far been slow but has recently gained increasing momentum. A basis for this is the currently growing, and in future accelerating, application of new knowledge derived from molecular biology in the field of uropathology. The rapid gain of knowledge is driven by a whole new class of analytical methods, such as massively parallel sequencing (deep sequencing or next generation sequencing), which enables analysis of a virtually new universe of potential biomarkers. This article describes the emerging paradigm shift in molecular pathological diagnostics of urological tumors using the example of prostate cancer.
Structural genomics: keeping up with expanding knowledge of the protein universe
Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek
2010-01-01
Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space — a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a reassessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006. PMID:17587562
Yu, Hui; Zhang, Victor Wei; Stray-Pedersen, Asbjørg; Hanson, Imelda Celine; Forbes, Lisa R; de la Morena, M Teresa; Chinn, Ivan K; Gorman, Elizabeth; Mendelsohn, Nancy J; Pozos, Tamara; Wiszniewski, Wojciech; Nicholas, Sarah K; Yates, Anne B; Moore, Lindsey E; Berge, Knut Erik; Sorte, Hanne; Bayer, Diana K; ALZahrani, Daifulah; Geha, Raif S; Feng, Yanming; Wang, Guoli; Orange, Jordan S; Lupski, James R; Wang, Jing; Wong, Lee-Jun
2016-10-01
Primary immunodeficiency diseases (PIDDs) are inherited disorders of the immune system. The most severe form, severe combined immunodeficiency (SCID), presents with profound deficiencies of T cells, B cells, or both at birth. If not treated promptly, affected patients usually do not live beyond infancy because of infections. Genetic heterogeneity of SCID frequently delays the diagnosis; a specific diagnosis is crucial for life-saving treatment and optimal management. We developed a next-generation sequencing (NGS)-based multigene-targeted panel for SCID and other severe PIDDs requiring rapid therapeutic actions in a clinical laboratory setting. The target gene capture/NGS assay provides an average read depth of approximately 1000×. The deep coverage facilitates simultaneous detection of single nucleotide variants and exonic copy number variants in one comprehensive assessment. Exons with insufficient coverage (<20× read depth) or high sequence homology (pseudogenes) are complemented by amplicon-based sequencing with specific primers to ensure 100% coverage of all targeted regions. Analysis of 20 patient samples with low T-cell receptor excision circle numbers on newborn screening or a positive family history or clinical suspicion of SCID or other severe PIDD identified deleterious mutations in 14 of them. Identified pathogenic variants included both single nucleotide variants and exonic copy number variants, such as hemizygous nonsense, frameshift, and missense changes in IL2RG; compound heterozygous changes in ATM, RAG1, and CIITA; homozygous changes in DCLRE1C and IL7R; and a heterozygous nonsense mutation in CHD7. High-throughput deep sequencing analysis with complete clinical validation greatly increases the diagnostic yield of severe primary immunodeficiency. Establishing a molecular diagnosis enables early immune reconstitution through prompt therapeutic intervention and guides management for improved long-term quality of life. 
Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
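The coverage rule stated in this abstract (exons below 20× read depth are complemented by amplicon-based sequencing) amounts to a simple threshold check, sketched below. The function name, the exon labels, and the depth values are made up for illustration, and the abstract does not say whether mean or minimum depth is used; the mean is assumed here.

```python
def exons_needing_amplicon_fill_in(per_base_depths, min_depth=20):
    """Return the exons whose mean read depth falls below min_depth, sorted by name.

    per_base_depths maps an exon label to a list of per-base read depths.
    Using the mean (rather than the minimum) depth is an assumption.
    """
    flagged = []
    for exon, depths in per_base_depths.items():
        mean_depth = sum(depths) / len(depths)
        if mean_depth < min_depth:
            flagged.append(exon)
    return sorted(flagged)


# Hypothetical depths: two well-covered exons and two that would need follow-up.
example = {
    "geneA_exon1": [980, 1010, 1005],
    "geneB_exon2": [12, 15, 9],
    "geneC_exon3": [25, 18, 14],
    "geneD_exon4": [20, 20, 20],
}
to_fill_in = exons_needing_amplicon_fill_in(example)
```

Here `geneB_exon2` and `geneC_exon3` fall below the threshold, while an exon sitting exactly at 20× is treated as sufficiently covered because the abstract specifies "<20×".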
Rapid and reliable protein structure determination via chemical shift threading.
Hafsa, Noor E; Berjanskii, Mark V; Arndt, David; Wishart, David S
2018-01-01
Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods in terms of top template model accuracy, with an average TM-score performance of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.
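For readers unfamiliar with the TM-score quoted in this abstract, the sketch below evaluates the commonly used definition, TM = (1/L) Σ 1/(1 + (d_i/d_0)^2) with d_0 = 1.24·(L − 15)^(1/3) − 1.8, for a fixed set of residue-pair distances. This is an assumption about which definition the authors used, the function names are hypothetical, and the search over superpositions that a full TM-score calculation performs is deliberately left out.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 in angstroms (meaningful for lengths well above 15)."""
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM-score for a given superposition.

    distances: distance in angstroms between each aligned residue pair.
    target_length: number of residues in the target (reference) structure.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

A perfect model (all distances zero, every residue aligned) scores 1.0, and a model whose aligned pairs all sit exactly d_0 apart scores 0.5, which gives a feel for the 0.68 versus 0.50-0.62 comparison in the abstract.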
What Advances Are Being Made in DNA Sequencing?
... to identify genetic variations; both methods rely on new technologies that allow rapid sequencing of large amounts of ... describes the different sequencing technologies and what the new technologies have meant for the study of the genetic ...
Jia, Xianbo; Lin, Xinjian; Chen, Jichen
2017-11-02
Current genome walking methods are very time consuming, and many produce non-specific amplification products. To amplify the flanking sequences that are adjacent to Tn5 transposon insertion sites in Serratia marcescens FZSF02, we developed a genome walking method based on TAIL-PCR. This PCR method added a 20-cycle linear amplification step before the exponential amplification step to increase the concentration of the target sequences. Products of the linear amplification and the exponential amplification were diluted 100-fold to decrease the concentration of the templates that cause non-specific amplification. Fast DNA polymerase with a high extension speed was used in this method, and an amplification program was used to rapidly amplify long specific sequences. With this linear and exponential TAIL-PCR (LETAIL-PCR), we successfully obtained products larger than 2 kb from Tn5 transposon insertion mutant strains within 3 h. This method can be widely used in genome walking studies to amplify unknown sequences that are adjacent to known sequences.
NGS Catalog: A Database of Next Generation Sequencing Studies in Humans
Xia, Junfeng; Wang, Qingguo; Jia, Peilin; Wang, Bing; Pao, William; Zhao, Zhongming
2015-01-01
Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research since their advent only a few years ago, and they are expected to advance at an unprecedented pace in the coming years. To provide the research community with a comprehensive NGS resource, we have developed the database Next Generation Sequencing Catalog (NGS Catalog, http://bioinfo.mc.vanderbilt.edu/NGS/index.html), a continually updated database that collects, curates and manages available human NGS data obtained from published literature. NGS Catalog deposits publication information of NGS studies and their mutation characteristics (SNVs, small insertions/deletions, copy number variations, and structural variants), as well as mutated genes and gene fusions detected by NGS. Other functions include user data upload, NGS general analysis pipelines, and NGS software. NGS Catalog is particularly useful for investigators who are new to NGS but would like to take advantage of these powerful technologies for their own research. Finally, based on the data deposited in NGS Catalog, we summarized features and findings from whole exome sequencing, whole genome sequencing, and transcriptome sequencing studies for human diseases or traits. PMID:22517761
Using cellular automata to generate image representation for biological sequences.
Xiao, X; Shao, S; Ding, Y; Huang, Z; Chen, X; Chou, K-C
2005-02-01
A novel approach to visualize biological sequences is developed based on cellular automata (Wolfram, S. Nature 1984, 311, 419-424), a set of discrete dynamical systems in which space and time are discrete. By transforming the symbolic sequence codes into digital codes, and using some optimal space-time evolution rules of cellular automata, a biological sequence can be represented by a unique image, the so-called cellular automata image. Many important features, which are originally hidden in a long and complicated biological sequence, can be clearly revealed through its cellular automata image. With the number of biological sequences entering databanks rapidly increasing in the post-genomic era, it is anticipated that the cellular automata image will become a very useful vehicle for investigation into their key features, identification of their function, as well as revelation of their "fingerprint". It is anticipated that by using the concept of the pseudo amino acid composition (Chou, K.C. Proteins: Structure, Function, and Genetics, 2001, 43, 246-255), the cellular automata image approach can also be used to improve the quality of predicting protein attributes, such as structural class and subcellular location.
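The pipeline outlined in this abstract (symbolic sequence, then digital code, then repeated application of a cellular automaton rule, then an image made of the stacked rows) can be sketched as below. The 2-bit nucleotide encoding, the choice of elementary rule 90, and the periodic boundary are assumptions made for illustration; the abstract does not state which coding or evolution rule the authors selected.

```python
# Assumed 2-bit encoding of nucleotides; the paper's actual digital code may differ.
CODE = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def encode(seq):
    """Turn a nucleotide string into a flat list of bits."""
    return [bit for base in seq.upper() for bit in CODE[base]]


def step(row, rule):
    """Apply one update of an elementary (Wolfram-numbered) rule with periodic boundaries."""
    n = len(row)
    return [
        (rule >> (row[(i - 1) % n] << 2 | row[i] << 1 | row[(i + 1) % n])) & 1
        for i in range(n)
    ]


def ca_image(seq, rule=90, steps=8):
    """Return the image as a list of rows: the encoded sequence plus `steps` updates."""
    rows = [encode(seq)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows
```

Each row can be rendered as a line of black and white pixels, so `ca_image("ACGT", steps=5)` yields a 6 × 8 bitmap. Because the first row is the encoded sequence itself, two different sequences always produce different images under this scheme.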
Kwon, Hyuk-Sang; Yang, Eun-Hee; Yeon, Seung-Woo; Kang, Byoung-Hwa; Kim, Tae-Yong
2004-10-15
This study aimed to develop a novel multiplex polymerase chain reaction (PCR) primer set for the identification of seven probiotic Lactobacillus species: Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus casei, Lactobacillus gasseri, Lactobacillus plantarum, Lactobacillus reuteri and Lactobacillus rhamnosus. The primer set, comprising seven specific and two conserved primers, was derived from the integrated sequences of 16S and 23S rRNA genes and their rRNA intergenic spacer region of each species. It was able to identify the seven target species with 93.6% accuracy, which exceeds that of the general biochemical methods. The phylogenetic analyses, using 16S rDNA sequences of the probiotic isolates, also provided further support that the results from the multiplex PCR assay were trustworthy. Taken together, we suggest that the multiplex primer set is an efficient tool for simple, rapid and reliable identification of seven Lactobacillus species.
Kumánovics, Attila; Wittwer, Carl T.; Pryor, Robert J.; Augustine, Nancy H.; Leppert, Mark F.; Carey, John C.; Ochs, Hans D.; Wedgwood, Ralph J.; Faville, Ralph J.; Quie, Paul G.; Hill, Harry R.
2010-01-01
With the recent discovery of mutations in the STAT3 gene in the majority of patients with classic Hyper-IgE syndrome, it is now possible to make a molecular diagnosis in most of these cases. We have developed a PCR-based high-resolution DNA-melting assay to scan selected exons of the STAT3 gene for mutations responsible for Hyper-IgE syndrome, which is then followed by targeted sequencing. We scanned for mutations in 10 unrelated pedigrees, which include 16 patients with classic Hyper-IgE syndrome. These pedigrees include both sporadic and familial cases and their relatives, and we have found STAT3 mutations in all affected individuals. High-resolution melting analysis allows a single day turn-around time for mutation scanning and targeted sequencing of the STAT3 gene, which will greatly facilitate the rapid diagnosis of the Hyper-IgE syndrome, allowing prompt and appropriate therapy, prophylaxis, improved clinical outcome, and accurate genetic counseling. PMID:20093388
Rapid identification of acetic acid bacteria using MALDI-TOF mass spectrometry fingerprinting.
Andrés-Barrao, Cristina; Benagli, Cinzia; Chappuis, Malou; Ortega Pérez, Ruben; Tonolla, Mauro; Barja, François
2013-03-01
Acetic acid bacteria (AAB) are widespread microorganisms characterized by their ability to transform alcohols and sugar-alcohols into their corresponding organic acids. The suitability of matrix-assisted laser desorption-time of flight mass spectrometry (MALDI-TOF MS) for the identification of cultured AAB involved in the industrial production of vinegar was evaluated on 64 reference strains from the genera Acetobacter, Gluconacetobacter and Gluconobacter. Analysis of MS spectra obtained from single colonies of these strains confirmed their basic classification based on comparative 16S rRNA gene sequence analysis. MALDI-TOF analyses of isolates from vinegar cross-checked by comparative sequence analysis of 16S rRNA gene fragments allowed AAB to be identified, and it was possible to differentiate them from mixed cultures and non-AAB. The results showed that MALDI-TOF MS analysis was a rapid and reliable method for the clustering and identification of AAB species. Copyright © 2012 Elsevier GmbH. All rights reserved.
Mitochondrial DNA polymorphism in a maternal lineage of Holstein cows.
Hauswirth, W W; Laipis, P J
1982-01-01
Two mitochondrial genotypes are shown to exist within one Holstein cow maternal lineage. They were detected by the appearance of an extra Hae III recognition site in one genotype. The nucleotide sequence of this region has been determined and the genotypes are distinguished by an adenine/guanine base transition which creates the new Hae III site. This point mutation occurs within an open reading frame at the third position of a glycine codon and therefore does not alter the amino acid sequence. The present pattern of genotypes within the lineage demands that multiple shifts between genotypes must have occurred within the past 20 years with the most rapid shift taking place in no more than 4 years and indicates that mitochondrial DNA polymorphism can occur between maternally related mammals. The process that gave rise to different genotypes in one lineage is clearly of fundamental importance in understanding intraspecific mitochondrial polymorphism and evolution in mammals. Several potential mechanisms for rapid mitochondrial DNA variation are discussed in light of these results. PMID:6289312
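The reasoning in this abstract, that an A/G transition at the third position of a glycine codon can create a Hae III site (GGCC) without changing the protein, can be checked on a toy example. The nine-base reading frame below is invented for illustration and is not the bovine sequence; only the four codons it uses are included in the lookup table.

```python
HAE3_SITE = "GGCC"  # Hae III recognition sequence

# Only the codons used in the toy example; GGA and GGG both encode glycine.
CODONS = {"ATG": "M", "GGA": "G", "GGG": "G", "CCT": "P"}


def translate(orf):
    """Translate an in-frame sequence using the small codon table above."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODONS[orf[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical genotypes differing by one A->G transition at a codon third position.
genotype_1 = "ATGGGACCT"  # ATG GGA CCT
genotype_2 = "ATGGGGCCT"  # ATG GGG CCT

same_protein = translate(genotype_1) == translate(genotype_2)
site_only_in_2 = HAE3_SITE in genotype_2 and HAE3_SITE not in genotype_1
```

Both genotypes translate to the same peptide, yet only the second contains GGCC, so a Hae III digest distinguishes them even though the encoded protein is unchanged.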
Zhan, Xiangjiang; Pan, Shengkai; Wang, Junyi; Dixon, Andrew; He, Jing; Muller, Margit G; Ni, Peixiang; Hu, Li; Liu, Yuan; Hou, Haolong; Chen, Yuanping; Xia, Jinquan; Luo, Qiong; Xu, Pengwei; Chen, Ying; Liao, Shengguang; Cao, Changchang; Gao, Shukun; Wang, Zhaobao; Yue, Zhen; Li, Guoqing; Yin, Ye; Fox, Nick C; Wang, Jun; Bruford, Michael W
2013-05-01
As top predators, falcons possess unique morphological, physiological and behavioral adaptations that allow them to be successful hunters: for example, the peregrine is renowned as the world's fastest animal. To examine the evolutionary basis of predatory adaptations, we sequenced the genomes of both the peregrine (Falco peregrinus) and saker falcon (Falco cherrug), and we present parallel, genome-wide evidence for evolutionary innovation and selection for a predatory lifestyle. The genomes, assembled using Illumina deep sequencing with greater than 100-fold coverage, are both approximately 1.2 Gb in length, with transcriptome-assisted prediction of approximately 16,200 genes for both species. Analysis of 8,424 orthologs in both falcons, chicken, zebra finch and turkey identified consistent evidence for genome-wide rapid evolution in these raptors. SNP-based inference showed contrasting recent demographic trajectories for the two falcons, and gene-based analysis highlighted falcon-specific evolutionary novelties for beak development and olfaction and specifically for homeostasis-related genes in the arid environment-adapted saker.
Pease, Anthony; Sullivan, Stacey; Olby, Natasha; Galano, Heather; Cerda-Gonzalez, Sophia; Robertson, Ian D; Gavin, Patrick; Thrall, Donald
2006-01-01
Three case history reports are presented to illustrate the value of the single-shot turbo spin-echo pulse sequence for assessment of the subarachnoid space. The single-shot turbo spin-echo pulse sequence, which is a heavily T2-weighted sequence, allows for a rapid, noninvasive evaluation of the subarachnoid space by using the high signal from cerebrospinal fluid. This sequence can be completed in seconds rather than the several minutes required for a T2-fast spin-echo sequence. Unlike the standard T2-fast spin-echo sequence, a single-shot turbo spin-echo pulse sequence also provides qualitative information about the protein and the cellular content of the cerebrospinal fluid, such as in patients with inflammatory debris or hemorrhage in the cerebrospinal fluid. Although the resolution of the single-shot turbo spin-echo pulse sequence images is relatively poor compared with more conventional sequences, the qualitative information about the subarachnoid space and cerebrospinal fluid and the rapid acquisition time make it a useful sequence to include in standard protocols of spinal magnetic resonance imaging.
Jaffe, R I; Lane, J D; Albury, S V; Niemeyer, D M
2000-09-01
Methicillin-resistant staphylococci (MRS) are one of the most common causes of nosocomial infections and bacteremia. Standard bacterial identification and susceptibility testing frequently require as long as 72 h to report results, and there may be difficulty in rapidly and accurately identifying methicillin resistance. The use of the PCR is a rapid and simple process for the amplification of target DNA sequences, which can be used to identify and test bacteria for antimicrobial resistance. However, many sample preparation methods are unsuitable for PCR utilization in the clinical laboratory because they either are not cost-effective, take too long to perform, or do not provide a satisfactory DNA template for PCR. Our goal was to provide same-day results to facilitate rapid diagnosis and therapy. In this report, we describe a rapid method for extraction of bacterial DNA directly from blood culture bottles that gave quality DNA for PCR in as little as 20 min. We compared this extraction method to the standard QIAGEN method for turnaround time (TAT), cost, purity, and use of template in PCR. Specific identification of MRS was determined using intragenic primer sets for bacterial and Staphylococcus 16S rRNA and mecA gene sequences. The PCR primer sets were validated with 416 isolates of staphylococci, including methicillin-resistant Staphylococcus aureus (n = 106), methicillin-sensitive S. aureus (n = 134), and coagulase-negative Staphylococcus (n = 176). The total supply cost of our extraction method and PCR was $2.15 per sample with a result TAT of less than 4 h. The methods described herein represent a rapid and accurate DNA extraction and PCR-based identification system, which makes the system an ideal candidate for use under austere field conditions and one that may have utility in the clinical laboratory.
Specific Primers for Rapid Detection of Microsporum audouinii by PCR in Clinical Samples
Roque, H. D.; Vieira, R.; Rato, S.; Luz-Martins, M.
2006-01-01
This report describes application of PCR fingerprinting to identify common species of dermatophytes using the microsatellite primers M13, (GACA)4, and (GTG)5. The initial PCR analysis rendered a specific DNA fragment for Microsporum audouinii, which was cloned and sequenced. Based on the sequencing data of this fragment, forward (MA_1F) and reverse (MA_1R) primers were designed and verified by PCR to establish their reliability in the diagnosis of M. audouinii. These primers produced a singular PCR band of 431 bp specific only to strains and isolates of M. audouinii, based on a global test of 182 strains/isolates belonging to 11 species of dermatophytes. These findings indicate these primers are reliable for diagnostic purposes, and we recommend their use in laboratory analysis. PMID:17005755
Specific primers for rapid detection of Microsporum audouinii by PCR in clinical samples.
Roque, H D; Vieira, R; Rato, S; Luz-Martins, M
2006-12-01
This report describes the application of PCR fingerprinting to identify common species of dermatophytes using the microsatellite primers M13, (GACA)4, and (GTG)5. The initial PCR analysis rendered a specific DNA fragment for Microsporum audouinii, which was cloned and sequenced. Based on the sequencing data of this fragment, forward (MA_1F) and reverse (MA_1R) primers were designed and verified by PCR to establish their reliability in the diagnosis of M. audouinii. These primers produced a single PCR band of 431 bp specific only to strains and isolates of M. audouinii, based on a global test of 182 strains/isolates belonging to 11 species of dermatophytes. These findings indicate these primers are reliable for diagnostic purposes, and we recommend their use in laboratory analysis.
Hide, Geoff; Hughes, Jacqueline M; McNuff, Robert
2003-01-01
Background The rapid expansion in the availability of genome and DNA sequence information has opened up new possibilities for the development of methods for detecting free-living protozoa in environmental samples. The protozoan Blepharisma japonicum was used to investigate a rapid and simple detection system based on polymerase chain reaction amplification (PCR) from organisms immobilised on FTA paper. Results Using primers designed from the α-tubulin genes of Blepharisma, specific and sensitive detection to the equivalent of a single Blepharisma cell could be achieved. Similar detection levels were found using water samples, containing Blepharisma, which were dried onto Whatman FTA paper. Conclusion This system has potential as a sensitive convenient detection system for Blepharisma and could be applied to other protozoan organisms. PMID:14516472
Smith, Michael G; Gianoulis, Tara A; Pukatzki, Stefan; Mekalanos, John J; Ornston, L Nicholas; Gerstein, Mark; Snyder, Michael
2007-03-01
Acinetobacter baumannii has emerged as an important and problematic human pathogen as it is the causative agent of several types of infections including pneumonia, meningitis, septicemia, and urinary tract infections. We explored the pathogenic content of this harmful pathogen using a combination of DNA sequencing and insertional mutagenesis. The genome of this organism was sequenced using a strategy involving high-density pyrosequencing, a novel, rapid method of high-throughput sequencing. Excluding the rDNA repeats, the assembled genome is 3,976,746 base pairs (bp) and has 3830 ORFs. A significant fraction of ORFs (17.2%) are located in 28 putative alien islands, indicating that the genome has acquired a large amount of foreign DNA. Consistent with its role in pathogenesis, a remarkable number of the islands (16) contain genes implicated in virulence, indicating the organism devotes a considerable portion of its genes to pathogenesis. The largest island contains elements homologous to the Legionella/Coxiella Type IV secretion apparatus. Type IV secretion systems have been demonstrated to be important for virulence in other organisms and thus are likely to help mediate pathogenesis of A. baumannii. Insertional mutagenesis generated avirulent isolates of A. baumannii and verified that six of the islands contain virulence genes, including two novel islands containing genes that lacked homology with others in the databases. The DNA sequencing approach described in this study allows the rapid elucidation of the DNA sequence of any microbe and, when combined with genetic screens, can identify many novel genes important for microbial pathogenesis.
Klee, Julia; Besana, Andrea M; Genersch, Elke; Gisder, Sebastian; Nanetti, Antonio; Tam, Dinh Quyet; Chinh, Tong Xuan; Puerta, Francisco; Ruz, José Maria; Kryger, Per; Message, Dejair; Hatjina, Fani; Korpela, Seppo; Fries, Ingemar; Paxton, Robert J
2007-09-01
The economically most important honey bee species, Apis mellifera, was formerly considered to be parasitized by one microsporidian, Nosema apis. Recently, [Higes, M., Martín, R., Meana, A., 2006. Nosema ceranae, a new microsporidian parasite in honeybees in Europe, J. Invertebr. Pathol. 92, 93-95] and [Huang, W.-F., Jiang, J.-H., Chen, Y.-W., Wang, C.-H., 2007. A Nosema ceranae isolate from the honeybee Apis mellifera. Apidologie 38, 30-37] used 16S (SSU) rRNA gene sequences to demonstrate the presence of Nosema ceranae in A. mellifera from Spain and Taiwan, respectively. We developed a rapid method to differentiate between N. apis and N. ceranae based on PCR-RFLPs of partial SSU rRNA. The reliability of the method was confirmed by sequencing 29 isolates from across the world (N = 9 isolates gave N. apis RFLPs and sequences, N = 20 isolates gave N. ceranae RFLPs and sequences; 100% correct classification). We then employed the method to analyze N = 115 isolates from across the world. Our data, combined with N = 36 additional published sequences, demonstrate that (i) N. ceranae most likely jumped host to A. mellifera, probably within the last decade, (ii) host colonies and individuals may be co-infected by both microsporidia species, and (iii) N. ceranae is now a parasite of A. mellifera across most of the world. The rapid, long-distance dispersal of N. ceranae is likely due to transport of infected honey bees by commercial or hobbyist beekeepers. We discuss the implications of this emergent pathogen for worldwide beekeeping.
Mulkern, Robert; Haker, Steven; Mamata, Hatsuho; Lee, Edward; Mitsouras, Dimitrios; Oshio, Koichi; Balasubramanian, Mukund; Hatabu, Hiroto
2014-03-01
Lung parenchyma is challenging to image with proton MRI. The large air space results in ~1/5th as many signal-generating protons compared to other organs. Air/tissue magnetic susceptibility differences lead to strong magnetic field gradients throughout the lungs and to broad frequency distributions, much broader than within other organs. Such distributions have been the subject of experimental and theoretical analyses which may reveal aspects of lung microarchitecture useful for diagnosis. Their most immediate relevance to current imaging practice is to cause rapid signal decays, commonly discussed in terms of short T2* values of 1 ms or lower at typical imaging field strengths. Herein we provide a brief review of previous studies describing and interpreting proton lung spectra. We then link these broad frequency distributions to rapid signal decays, though not necessarily the exponential decays generally used to define T2* values. We examine how these decays influence observed signal intensities and spatial mapping features associated with the most prominent torso imaging sequences, including spoiled gradient and spin echo sequences. Effects of imperfect refocusing pulses on the multiple echo signal decays in single shot fast spin echo (SSFSE) sequences and effects of broad frequency distributions on balanced steady state free precession (bSSFP) sequence signal intensities are also provided. The theoretical analyses are based on the concept of explicitly separating the effects of reversible and irreversible transverse relaxation processes, thus providing a somewhat novel and more general framework from which to estimate lung signal intensity behavior in modern imaging practice.
MULKERN, ROBERT; HAKER, STEVEN; MAMATA, HATSUHO; LEE, EDWARD; MITSOURAS, DIMITRIOS; OSHIO, KOICHI; BALASUBRAMANIAN, MUKUND; HATABU, HIROTO
2014-01-01
Lung parenchyma is challenging to image with proton MRI. The large air space results in ~1/5th as many signal-generating protons compared to other organs. Air/tissue magnetic susceptibility differences lead to strong magnetic field gradients throughout the lungs and to broad frequency distributions, much broader than within other organs. Such distributions have been the subject of experimental and theoretical analyses which may reveal aspects of lung microarchitecture useful for diagnosis. Their most immediate relevance to current imaging practice is to cause rapid signal decays, commonly discussed in terms of short T2* values of 1 ms or lower at typical imaging field strengths. Herein we provide a brief review of previous studies describing and interpreting proton lung spectra. We then link these broad frequency distributions to rapid signal decays, though not necessarily the exponential decays generally used to define T2* values. We examine how these decays influence observed signal intensities and spatial mapping features associated with the most prominent torso imaging sequences, including spoiled gradient and spin echo sequences. Effects of imperfect refocusing pulses on the multiple echo signal decays in single shot fast spin echo (SSFSE) sequences and effects of broad frequency distributions on balanced steady state free precession (bSSFP) sequence signal intensities are also provided. The theoretical analyses are based on the concept of explicitly separating the effects of reversible and irreversible transverse relaxation processes, thus providing a somewhat novel and more general framework from which to estimate lung signal intensity behavior in modern imaging practice. PMID:25228852
Yu, Miao; Ji, Lexiang; Neumann, Drexel A.; ...
2015-07-15
Restriction-modification (R-M) systems pose a major barrier to DNA transformation and genetic engineering of bacterial species. Systematic identification of DNA methylation in R-M systems, including N6-methyladenine (6mA), 5-methylcytosine (5mC) and N4-methylcytosine (4mC), will enable strategies to make these species genetically tractable. Although single-molecule, real time (SMRT) sequencing technology is capable of detecting 4mC directly for any bacterial species regardless of whether an assembled genome exists or not, it is not as scalable to profiling hundreds to thousands of samples compared with the commonly used next-generation sequencing technologies. Here, we present 4mC-Tet-assisted bisulfite-sequencing (4mC-TAB-seq), a next-generation sequencing method that rapidly and cost efficiently reveals the genome-wide locations of 4mC for bacterial species with an available assembled reference genome. In 4mC-TAB-seq, both cytosines and 5mCs are read out as thymines, whereas only 4mCs are read out as cytosines, revealing their specific positions throughout the genome. We applied 4mC-TAB-seq to study the methylation of a member of the hyperthermophilic genus, Caldicellulosiruptor, in which 4mC-related restriction is a major barrier to DNA transformation from other species. Lastly, in combination with MethylC-seq, both 4mC- and 5mC-containing motifs are identified which can assist in rapid and efficient genetic engineering of these bacteria in the future.
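The read-out rule stated in the abstract above (cytosines and 5mCs read as thymines, 4mCs read as cytosines) can be illustrated with a toy model. This is only a sketch under strong assumptions: it considers a single strand, assumes perfect conversion and error-free reads, and skips alignment entirely. The function names, the example sequence, and the methylation marks are hypothetical and are not taken from the study.

```python
def tab_seq_readout(reference, marks):
    """Toy read-out: unmodified C and 5mC become T; 4mC stays C.

    reference: DNA string; marks: dict mapping position -> "4mC" or "5mC"
    (positions absent from marks are treated as unmodified).
    """
    read = []
    for pos, base in enumerate(reference):
        if base == "C" and marks.get(pos) != "4mC":
            read.append("T")   # unmodified C and 5mC are both read as T
        else:
            read.append(base)  # 4mC is read as C; A, G, T are unchanged
    return "".join(read)


def call_4mc(reference, read):
    """Positions where the reference has C and the read still shows C."""
    return [i for i, (r, q) in enumerate(zip(reference, read))
            if r == "C" and q == "C"]


# Hypothetical example: one 5mC at position 1 and one 4mC at position 5.
reference = "ACGCTCCA"
marks = {1: "5mC", 5: "4mC"}
read = tab_seq_readout(reference, marks)
print(read)                       # ATGTTCTA
print(call_4mc(reference, read))  # [5]
```

Comparing the read with the reference is what ties the method to an assembled reference genome: a retained C is only informative when the reference base at that position is known to be C.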
Dialdestoro, Kevin; Sibbesen, Jonas Andreas; Maretty, Lasse; Raghwani, Jayna; Gall, Astrid; Kellam, Paul; Pybus, Oliver G.; Hein, Jotun; Jenkins, Paul A.
2016-01-01
Human immunodeficiency virus (HIV) is a rapidly evolving pathogen that causes chronic infections, so genetic diversity within a single infection can be very high. High-throughput “deep” sequencing can now measure this diversity in unprecedented detail, particularly since it can be performed at different time points during an infection, and this offers a potentially powerful way to infer the evolutionary dynamics of the intrahost viral population. However, population genomic inference from HIV sequence data is challenging because of high rates of mutation and recombination, rapid demographic changes, and ongoing selective pressures. In this article we develop a new method for inference using HIV deep sequencing data, using an approach based on importance sampling of ancestral recombination graphs under a multilocus coalescent model. The approach further extends recent progress in the approximation of so-called conditional sampling distributions, a quantity of key interest when approximating coalescent likelihoods. The chief novelties of our method are that it is able to infer rates of recombination and mutation, as well as the effective population size, while handling sampling over different time points and missing data without extra computational difficulty. We apply our method to a data set of HIV-1, in which several hundred sequences were obtained from an infected individual at seven time points over 2 years. We find mutation rate and effective population size estimates to be comparable to those produced by the software BEAST. Additionally, our method is able to produce local recombination rate estimates. The software underlying our method, Coalescenator, is freely available. PMID:26857628
Rapid Multistep Synthesis of 1,2,4-Oxadiazoles in a Single Continuous Microreactor Sequence
Grant, Daniel; Dahl, Russell; Cosford, Nicholas D. P.
2009-01-01
A general method for the synthesis of bis-substituted 1,2,4-oxadiazoles from readily available arylnitriles and activated carbonyls in a single continuous microreactor sequence is described. The synthesis incorporates three sequential microreactors to produce 1,2,4-oxadiazoles in ~30 min in quantities (40–80 mg) sufficient for full characterization and rapid library supply. PMID:18687005
Gerstner, Arpad; DeFord, James H; Papaconstantinou, John
2003-07-25
Ames dwarfism is caused by a homozygous single nucleotide mutation in the pituitary specific prop-1 gene, resulting in combined pituitary hormone deficiency, reduced growth and extended lifespan. Thus, these mice serve as an important model system for endocrinological, aging and longevity studies. Because the phenotypes of wild-type and heterozygous mice are indistinguishable, it is imperative for successful breeding to accurately genotype these animals. Here we report a novel, yet simple, approach for prop-1 genotyping using PCR-based allele-specific amplification (PCR-ASA). We also compare this method to other potential genotyping techniques, i.e. PCR-based restriction fragment length polymorphism analysis (PCR-RFLP) and fluorescence automated DNA sequencing. We demonstrate that the single-step PCR-ASA has several advantages over the classical PCR-RFLP because the procedure is simple, less expensive and rapid. To further increase the specificity and sensitivity of the PCR-ASA, we introduced a single-base mismatch at the 3' penultimate position of the mutant primer. Our results also reveal that the fluorescence automated DNA sequencing has limitations for detecting a single nucleotide polymorphism in the prop-1 gene, particularly in heterozygotes.
Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.
Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W
1996-01-01
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227
Xue, Yong; Wilkes, Jon G.; Moskal, Ted J.; Williams, Anna J.; Cooper, Willie M.; Nayak, Rajesh; Rafii, Fatemeh; Buzatu, Dan A.
2016-01-01
Standard methods to detect Escherichia coli contamination in food use the polymerase chain reaction (PCR) and agar culture plates. These methods require multiple incubation steps and take a long time to yield results. An improved rapid flow-cytometry based detection method was developed, using a fluorescence-labeled oligonucleotide probe specifically binding a 16S rRNA sequence. The method positively detected 51 E. coli isolates as well as 4 Shigella species. All 27 non-E. coli strains tested gave negative results. Comparison of the new genetic assay with a total plate count (TPC) assay and agar plate counting indicated similar sensitivity and agreement between cytometry cell counts and colony counts. This method can detect a small number of E. coli cells in the presence of large numbers of other bacteria. This method can be used for rapid, economical, and stable detection of E. coli and Shigella contamination in the food industry and other contexts. PMID:26913737
Xue, Yong; Wilkes, Jon G; Moskal, Ted J; Williams, Anna J; Cooper, Willie M; Nayak, Rajesh; Rafii, Fatemeh; Buzatu, Dan A
2016-01-01
Standard methods to detect Escherichia coli contamination in food use the polymerase chain reaction (PCR) and agar culture plates. These methods require multiple incubation steps and take a long time to yield results. An improved rapid flow-cytometry based detection method was developed, using a fluorescence-labeled oligonucleotide probe specifically binding a 16S rRNA sequence. The method positively detected 51 E. coli isolates as well as 4 Shigella species. All 27 non-E. coli strains tested gave negative results. Comparison of the new genetic assay with a total plate count (TPC) assay and agar plate counting indicated similar sensitivity and agreement between cytometry cell counts and colony counts. This method can detect a small number of E. coli cells in the presence of large numbers of other bacteria. This method can be used for rapid, economical, and stable detection of E. coli and Shigella contamination in the food industry and other contexts.
Kobayashi, Takehito; Yagi, Yusuke; Nakamura, Takahiro
2016-01-01
The pentatricopeptide repeat (PPR) motif is a sequence-specific RNA/DNA-binding module. Elucidation of the RNA/DNA recognition mechanism has enabled engineering of PPR motifs as new RNA/DNA manipulation tools in living cells, including for genome editing. However, the biochemical characteristics of PPR proteins remain unknown, mostly due to the instability and/or unfolding propensities of PPR proteins in heterologous expression systems such as bacteria and yeast. To overcome this issue, we constructed reporter systems using animal cultured cells. The cell-based system has highly attractive features for PPR engineering: robust eukaryotic gene expression; availability of various vectors, reagents, and antibodies; highly efficient DNA delivery ratio (>80%); and rapid, high-throughput data production. In this chapter, we introduce an example of such reporter systems: a PPR-based sequence-specific translational activation system. The cell-based reporter system can be applied to characterize plant genes of interest and to PPR engineering.
Evaluation of exome variants using the Ion Proton Platform to sequence error-prone regions.
Seo, Heewon; Park, Yoomi; Min, Byung Joo; Seo, Myung Eui; Kim, Ju Han
2017-01-01
The Ion Proton sequencer from Thermo Fisher accurately determines sequence variants from target regions with a rapid turnaround time at a low cost. However, misleading variant-calling errors can occur. We performed a systematic evaluation and manual curation of read-level alignments for the 675 ultrarare variants that were reported by the Ion Proton sequencer in 27 whole-exome sequencing data sets but are not present in either the 1000 Genomes Project or the Exome Aggregation Consortium. We classified positive variant calls into 393 highly likely false positives, 126 likely false positives, and 156 likely true positives, which comprised 58.2%, 18.7%, and 23.1% of the variants, respectively. We identified four distinct error patterns of variant calling that may be bioinformatically corrected when using different strategies: simplicity region, SNV cluster, peripheral sequence read, and base inversion. Local de novo assembly successfully corrected 201 (38.7%) of the 519 highly likely or likely false positives. We also demonstrate that the two sequencing kits from Thermo Fisher (the Ion PI Sequencing 200 kit V3 and the Ion PI Hi-Q kit) exhibit different error profiles across different error types. A refined calling algorithm with better polymerase may improve the performance of the Ion Proton sequencing platform.
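The percentages quoted in the abstract above follow directly from the reported counts. The short check below recomputes them; the counts are taken from the abstract, while the variable names are only illustrative.

```python
# Counts as reported in the abstract above.
counts = {
    "highly likely false positive": 393,
    "likely false positive": 126,
    "likely true positive": 156,
}

total = sum(counts.values())  # 675 ultrarare variants in total
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# False positives are the first two classes combined.
false_positives = (counts["highly likely false positive"]
                   + counts["likely false positive"])
corrected = 201  # calls corrected by local de novo assembly
corrected_share = round(100 * corrected / false_positives, 1)

print(total, shares)                     # 675 and 58.2 / 18.7 / 23.1 percent
print(false_positives, corrected_share)  # 519 and 38.7 percent
```

The three classes sum to the 675 variants examined, and the 38.7% figure is relative to the 519 false positives rather than to all 675 calls.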
Li, Guotian; Jain, Rashmi; Chern, Mawsheng; Pham, Nikki T; Martin, Joel A; Wei, Tong; Schackwitz, Wendy S; Lipzen, Anna M; Duong, Phat Q; Jones, Kyle C; Jiang, Liangrong; Ruan, Deling; Bauer, Diane; Peng, Yi; Barry, Kerrie W; Schmutz, Jeremy; Ronald, Pamela C
2017-06-01
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. This work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.
Li, Guotian; Jain, Rashmi; Chern, Mawsheng; ...
2017-06-02
The availability of a whole-genome sequenced mutant population and the cataloging of mutations of each line at a single-nucleotide resolution facilitate functional genomic analysis. To this end, we generated and sequenced a fast-neutron-induced mutant population in the model rice cultivar Kitaake (Oryza sativa ssp japonica), which completes its life cycle in 9 weeks. We sequenced 1504 mutant lines at 45-fold coverage and identified 91,513 mutations affecting 32,307 genes, i.e., 58% of all rice genes. We detected an average of 61 mutations per line. Mutation types include single-base substitutions, deletions, insertions, inversions, translocations, and tandem duplications. We observed a high proportion of loss-of-function mutations. We identified an inversion affecting a single gene as the causative mutation for the short-grain phenotype in one mutant line. This result reveals the usefulness of the resource for efficient, cost-effective identification of genes conferring specific phenotypes. To facilitate public access to this genetic resource, we established an open access database called KitBase that provides access to sequence data and seed stocks. This population complements other available mutant collections and gene-editing technologies. In conclusion, this work demonstrates how inexpensive next-generation sequencing can be applied to generate a high-density catalog of mutations.
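As a quick consistency check on the figures in the abstract above, the reported average of 61 mutations per line can be recomputed from the totals. The numbers come from the abstract; the implied gene total is only a rough back-calculation, since the 58% figure is itself rounded.

```python
mutations = 91_513       # mutations identified across the population
lines = 1_504            # mutant lines sequenced
affected_genes = 32_307  # genes carrying at least one mutation

mean_per_line = mutations / lines
print(round(mean_per_line, 2))  # 60.85, i.e. about 61 per line

# Rough, assumption-laden estimate of the gene total implied by "58%".
implied_gene_total = affected_genes / 0.58
print(round(implied_gene_total))
```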
Zhang, Huimin; He, Hongkui; Yu, Xiujuan; Xu, Zhaohui; Zhang, Zhizhou
2016-11-01
It remains an unsolved problem to quantify a natural microbial community by rapidly and conveniently measuring multiple species with functional significance. Most widely used high throughput next-generation sequencing methods can only generate information mainly for genus-level taxonomic identification and quantification, and detection of multiple species in a complex microbial community is still heavily dependent on approaches based on near full-length ribosomal RNA gene or genome sequence information. In this study, we used near full-length rRNA gene library sequencing plus Primer-Blast to design species-specific primers based on whole microbial genome sequences. The primers were intended to be specific at the species level within relevant microbial communities, i.e., a defined genomics background. The primers were tested with samples collected from the Daqu (also called fermentation starters) and pit mud of a traditional Chinese liquor production plant. Sixteen pairs of primers were found to be suitable for identification of individual species. Among them, seven pairs were chosen to measure the abundance of microbial species through quantitative PCR. The combination of near full-length ribosomal RNA gene library sequencing and Primer-Blast may represent a broadly useful protocol to quantify multiple species in complex microbial population samples with species-specific primers.
Zhang, Yue; Feng, Shiqian; Zeng, Yiying; Ning, Hong; Liu, Lijun; Zhao, Zihua; Jiang, Fan; Li, Zhihong
2018-06-23
Bactrocera tsuneonis (Miyake), generally known as the Japanese orange fly, is considered to be a major pest of commercial citrus crops. It has a limited distribution in China, Japan and Vietnam, but it has the potential to invade areas outside of Asia. More genetic information of B. tsuneonis should be obtained in order to develop effective methodologies for rapid and accurate molecular identification due to the difficulty of distinguishing it from Bactrocera minax based on morphological features. We report here the whole mitochondrial genome of B. tsuneonis sequenced by next-generation sequencing. This mitogenome sequence had a total length of 15,865 bp, a typical circular molecule comprising 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The structure and organization of the molecule were typical and similar compared with the published homologous sequences of other fruit flies in Tephritidae. The phylogenetic analyses based on the mitochondrial genome data presented a close genetic relationship between B. tsuneonis and B. minax. This is the first report of the complete mitochondrial genome of B. tsuneonis, and it can be used in further studies of species diagnosis, evolutionary biology, prevention and control.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuppuswamy, M.N.; Hoffmann, J.W.; Spitzer, S.G.
1991-02-15
In this report, the authors describe an approach to detect the presence of abnormal alleles in those genetic diseases in which frequency of occurrence of the same mutation is high (e.g., hemophilia B). Initially, from each subject, the DNA fragment containing the putative mutation site is amplified by the polymerase chain reaction. For each fragment two reaction mixtures are then prepared. Each contains the amplified fragment, a primer (18-mer or longer) whose sequence is identical to the coding sequence of the normal gene immediately flanking the 5′ end of the mutation site, and either an α-³²P-labeled nucleotide corresponding to the normal coding sequence at the mutation site or an α-³²P-labeled nucleotide corresponding to the mutant sequence. An essential feature of the present methodology is that the base immediately 3′ to the template-bound primer is one of those altered in the mutant, since in this way an extension of the primer by a single base will give an extended molecule characteristic of either the mutant or the wild type. The method is rapid and should be useful in carrier detection and prenatal diagnosis of every genetic disease with a known sequence variation.
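The two-reaction logic described in the abstract above can be sketched as a small decision table. This is an idealized illustration only: it assumes that a labeled nucleotide is incorporated if, and only if, the subject carries an allele with that base at the mutation site, with no misincorporation. The function names, the example bases, and the wording of the interpretations are hypothetical.

```python
def extension_signals(subject_alleles, normal_base, mutant_base):
    """Which of the two single-nucleotide extension reactions gives a signal.

    subject_alleles: set of coding-strand bases the subject carries at the
    mutation site (one base if homozygous/hemizygous, two if heterozygous).
    """
    return {
        "normal_reaction": normal_base in subject_alleles,
        "mutant_reaction": mutant_base in subject_alleles,
    }


def interpret(signals):
    """Translate the pair of reaction outcomes into a genotype statement."""
    if signals["normal_reaction"] and signals["mutant_reaction"]:
        return "carrier (both alleles present)"
    if signals["mutant_reaction"]:
        return "mutant allele only"
    if signals["normal_reaction"]:
        return "normal allele only"
    return "no incorporation (uninformative)"


# Hypothetical site where the normal base is G and the mutant base is A.
for alleles in ({"G"}, {"G", "A"}, {"A"}):
    signals = extension_signals(alleles, normal_base="G", mutant_base="A")
    print(sorted(alleles), interpret(signals))
```

Running both reactions on every sample is what makes the heterozygous (carrier) case distinguishable: a single reaction could not separate a carrier from a subject with only one of the two alleles.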
Rapid Generation of Optimal Asteroid Powered Descent Trajectories Via Convex Optimization
NASA Technical Reports Server (NTRS)
Pinson, Robin; Lu, Ping
2015-01-01
This paper investigates a convex optimization based method that can rapidly generate the fuel optimal asteroid powered descent trajectory. The ultimate goal is to autonomously design the optimal powered descent trajectory on-board the spacecraft immediately prior to the descent burn. Compared to a planetary powered landing problem, the major difficulty is the complex gravity field near the surface of an asteroid that cannot be approximated by a constant gravity field. This paper uses relaxation techniques and a successive solution process that seeks the solution to the original nonlinear, nonconvex problem through the solutions to a sequence of convex optimal control problems.
Ma, Lijun; Lee, Letitia; Barani, Igor; Hwang, Andrew; Fogh, Shannon; Nakamura, Jean; McDermott, Michael; Sneed, Penny; Larson, David A; Sahgal, Arjun
2011-11-21
Rapid delivery of multiple shots or isocenters is one of the hallmarks of Gamma Knife radiosurgery. In this study, we investigated whether the temporal order of shots delivered with Gamma Knife Perfexion would significantly influence the biological equivalent dose for complex multi-isocenter treatments. Twenty single-target cases were selected for analysis. For each case, 3D dose matrices of individual shots were extracted and single-fraction equivalent uniform dose (sEUD) values were determined for all possible shot delivery sequences, corresponding to different patterns of temporal dose delivery within the target. We found significant variations in the sEUD values among these sequences exceeding 15% for certain cases. However, the sequences for the actual treatment delivery were found to agree (<3%) and to correlate (R² = 0.98) excellently with the sequences yielding the maximum sEUD values for all studied cases. This result is applicable for both fast and slow growing tumors with α/β values of 2 to 20 according to the linear-quadratic model. In conclusion, despite large potential variations in different shot sequences for multi-isocenter Gamma Knife treatments, current clinical delivery sequences exhibited consistent biological target dosing that approached that maximally achievable for all studied cases.
Yajima, Misako; Ikuta, Kazufumi; Kanda, Teru
2018-04-03
Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.
Ikuta, Kazufumi; Kanda, Teru
2018-01-01
Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically. PMID:29614006
2014-01-01
Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. 
Conclusions This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach
NASA Astrophysics Data System (ADS)
Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan
2013-02-01
Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediment samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of the foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of the foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in the cDNA dataset indicates the limits of our database and lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompts its use in future high-throughput sequencing-based environmental surveys.
RATT: Rapid Annotation Transfer Tool
Otto, Thomas D.; Dillon, Gary P.; Degrave, Wim S.; Berriman, Matthew
2011-01-01
Second-generation sequencing technologies have made large-scale sequencing projects commonplace. However, making use of these datasets often requires gene function to be ascribed genome wide. Although tool development has kept pace with the changes in sequence production, for tasks such as mapping, de novo assembly or visualization, genome annotation remains a challenge. We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference. The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny. We demonstrate that a Mycobacterium tuberculosis genome or a single 2.5 Mb chromosome from a malaria parasite can be annotated in less than five minutes with only modest computational resources. RATT is available at http://ratt.sourceforge.net. PMID:21306991
Parente, Eugenio; Cocolin, Luca; De Filippis, Francesca; Zotta, Teresa; Ferrocino, Ilario; O'Sullivan, Orla; Neviani, Erasmo; De Angelis, Maria; Cotter, Paul D; Ercolini, Danilo
2016-02-16
Amplicon targeted high-throughput sequencing has become a popular tool for the culture-independent analysis of microbial communities. Although the data obtained with this approach are portable and the number of sequences available in public databases is increasing, no tool has been developed yet for the analysis and presentation of data obtained in different studies. This work describes an approach for the development of a database for the rapid exploration and analysis of data on food microbial communities. Data from seventeen studies investigating the structure of bacterial communities in dairy, meat, sourdough and fermented vegetable products, obtained by 16S rRNA gene targeted high-throughput sequencing, were collated and analysed using Gephi, a network analysis software. The resulting database, which we named FoodMicrobionet, was used to analyse nodes and network properties and to build an interactive web-based visualisation. The latter allows the visual exploration of the relationships between Operational Taxonomic Units (OTUs) and samples and the identification of core- and sample-specific bacterial communities. It also provides additional search tools and hyperlinks for the rapid selection of food groups and OTUs and for rapid access to external resources (NCBI taxonomy, digital versions of the original articles). Microbial interaction network analysis was carried out using CoNet on datasets extracted from FoodMicrobionet: the complexity of interaction networks was much lower than that found for other bacterial communities (human microbiome, soil and other environments). This may reflect both a bias in the dataset (which was dominated by fermented foods and starter cultures) and the lower complexity of food bacterial communities. Although some technical challenges exist, and are discussed here, the net result is a valuable tool for the exploration of food bacterial communities by the scientific community and food industry. Copyright © 2015. Published by Elsevier B.V.
Memory and learning with rapid audiovisual sequences
Keller, Arielle S.; Sekuler, Robert
2015-01-01
We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed. PMID:26575193
Memory and learning with rapid audiovisual sequences.
Keller, Arielle S; Sekuler, Robert
2015-01-01
We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed.
Genomic Insights into Geothermal Spring Community Members using a 16S Agnostic Single-Cell Approach
NASA Astrophysics Data System (ADS)
Bowers, R. M.
2016-12-01
INSTITUTIONS (ALL): DOE Joint Genome Institute, Walnut Creek, CA USA. Bigelow Laboratory for Ocean Sciences, East Boothbay, ME USA. Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada. ABSTRACT BODY: With recent advances in DNA sequencing, rapid and affordable screening of single-cell genomes has become a reality. Single-cell sequencing is a multi-step process that takes advantage of any number of single-cell sorting techniques, whole genome amplification (WGA), and 16S rRNA gene based PCR screening to identify the microbes of interest prior to shotgun sequencing. However, the 16S PCR based screening step is costly and may lead to unanticipated losses of microbial diversity, as cells that do not produce a clean 16S amplicon are typically omitted from downstream shotgun sequencing. While many of the sorted cells that fail the 16S PCR step likely originate from poor quality amplified DNA, some of the cells with good WGA kinetics may instead represent bacteria or archaea with 16S genes that fail to amplify due to primer mismatches or the presence of intervening sequences. Using cell material from Dewar Creek, a hot spring in British Columbia, we sequenced all sorted cells with good WGA kinetics irrespective of their 16S amplification success. We show that this high-throughput approach to single-cell sequencing (i) can reduce the overall cost of single-cell genome production, and (ii) may lead to the discovery of previously unknown branches on the microbial tree of life.
Ries, David; Holtgräwe, Daniela; Viehöver, Prisca; Weisshaar, Bernd
2016-03-15
The combination of bulk segregant analysis (BSA) and next generation sequencing (NGS), also known as mapping by sequencing (MBS), has been shown to significantly accelerate the identification of causal mutations for species with a reference genome sequence. The usual approach is to cross homozygous parents that differ for the monogenic trait to address, to perform deep sequencing of DNA from F2 plants pooled according to their phenotype, and subsequently to analyze the allele frequency distribution based on a marker table for the parents studied. The method has been successfully applied for EMS induced mutations as well as natural variation. Here, we show that pooling genetically diverse breeding lines according to a contrasting phenotype also allows high resolution mapping of the causal gene in a crop species. The test case was the monogenic locus causing red vs. green hypocotyl color in Beta vulgaris (R locus). We determined the allele frequencies of polymorphic sequences using sequence data from two diverging phenotypic pools of 180 B. vulgaris accessions each. A single interval of about 31 kbp among the nine chromosomes was identified which indeed contained the causative mutation. By applying a variation of the mapping by sequencing approach, we demonstrated that phenotype-based pooling of diverse accessions from breeding panels and subsequent direct determination of the allele frequency distribution can be successfully applied for gene identification in a crop species. Our approach made it possible to identify a small interval around the causative gene. Sequencing of parents or individual lines was not necessary. Whenever the appropriate plant material is available, the approach described saves time compared to the generation of an F2 population. In addition, we provide clues for planning similar experiments with regard to pool size and the sequencing depth required.
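The pooled allele-frequency comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the marker tuples, read counts, and the 0.8 threshold are hypothetical, and a single causal locus is assumed.

```python
def allele_freq(alt_reads, total_reads):
    """Fraction of reads carrying the alternate allele at one marker."""
    return alt_reads / total_reads if total_reads else 0.0


def delta_af(markers):
    """Return (chrom, pos, |AF_pool1 - AF_pool2|) for every marker.

    Each marker is (chrom, pos, alt1, total1, alt2, total2); the counts are
    reads from the two contrasting phenotypic pools.
    """
    return [
        (chrom, pos, abs(allele_freq(a1, n1) - allele_freq(a2, n2)))
        for chrom, pos, a1, n1, a2, n2 in markers
    ]


def candidate_interval(markers, threshold=0.8):
    """Span covering all markers whose frequency difference meets the threshold.

    Returns None when no marker qualifies or when hits fall on more than one
    chromosome (the single-locus assumption would then not hold).
    """
    hits = [(c, p) for c, p, d in delta_af(markers) if d >= threshold]
    if not hits or len({c for c, _ in hits}) != 1:
        return None
    positions = [p for _, p in hits]
    return (hits[0][0], min(positions), max(positions))


# Hypothetical toy data: only the two middle markers separate the pools.
markers = [
    ("chr2", 1000, 10, 100, 12, 100),
    ("chr2", 5000, 95, 100, 5, 100),
    ("chr2", 9000, 98, 100, 3, 100),
    ("chr2", 20000, 40, 100, 35, 100),
]
print(candidate_interval(markers))
```

In a real experiment the threshold and the minimum read depth per marker would depend on pool size and sequencing depth, which the abstract identifies as the planning parameters.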
MIPS: a database for protein sequences, homology data and yeast genome information.
Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F
1997-01-01
The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
Walsh, Aaron M.; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C.; Arthur, Cornelius T.; Claesson, Marcus J.; Scott, Karen P.
2017-01-01
ABSTRACT The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products. 
PMID:28625983
Pembleton, Luke W; Inch, Courtney; Baillie, Rebecca C; Drayton, Michelle C; Thakur, Preeti; Ogaji, Yvonne O; Spangenberg, German C; Forster, John W; Daetwyler, Hans D; Cogan, Noel O I
2018-06-02
Exploitation of data from a ryegrass breeding program has enabled rapid development and implementation of genomic selection for sward-based biomass yield with a twofold-to-threefold increase in genetic gain. Genomic selection, which uses genome-wide sequence polymorphism data and quantitative genetics techniques to predict plant performance, has large potential for the improvement of pasture plants. Major factors influencing the accuracy of genomic selection include the size of reference populations, trait heritability values and the genetic diversity of breeding populations. Global diversity of the important forage species perennial ryegrass is high and so would require a large reference population in order to achieve moderate accuracies of genomic selection. However, diversity of germplasm within a breeding program is likely to be lower. In addition, de novo construction and characterisation of reference populations are a logistically complex process. Consequently, historical phenotypic records for seasonal biomass yield and heading date over an 18-year period within a commercial perennial ryegrass breeding program have been accessed, and target populations have been characterised with a high-density transcriptome-based genotyping-by-sequencing assay. Ability to predict observed phenotypic performance in each successive year was assessed by using all synthetic populations from previous years as a reference population. Moderate and high accuracies were achieved for the two traits, respectively, consistent with broad-sense heritability values. The present study represents the first demonstration and validation of genomic selection for seasonal biomass yield within a diverse commercial breeding program across multiple years. 
These results, supported by previous simulation studies, demonstrate the ability to predict sward-based phenotypic performance early in the process of individual plant selection, so shortening the breeding cycle, increasing the rate of genetic gain and allowing rapid adoption in ryegrass improvement programs.
USDA-ARS's Scientific Manuscript database
New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...
NASA Astrophysics Data System (ADS)
Zhang, Jinyu; Steel, Ronald; Ambrose, William
2017-12-01
Shelf margins prograde and aggrade by the incremental addition of deltaic sediments supplied from river channel belts and by stored shoreline sediment. This paper documents the shelf-edge trajectory and coeval channel belts for a segment of Paleocene Lower Wilcox Group in the northern Gulf of Mexico based on 400 wireline logs and 300 m of whole cores. By quantitatively analyzing these data and comparing them with global databases, we demonstrate how varying sediment supply impacted the Wilcox shelf-margin growth and deep-water sediment dispersal under greenhouse eustatic conditions. The coastal plain to marine topset and uppermost continental slope succession of the Lower Wilcox shelf-margin sediment prism is divided into eighteen high-frequency (~300 ky duration) stratigraphic sequences, and further grouped into 5 sequence sets (labeled as A-E from bottom to top). Sequence Set A is dominantly muddy slope deposits. The shelf edge of Sequence Sets B and C prograded rapidly (> 10 km/Ma) and aggraded modestly (< 80 m/Ma). The coeval channel belts are relatively large (individually averaging 11-13 m thick) and amalgamated. The water discharge of Sequence Sets B and C rivers, estimated by channel-belt thickness, bedform type, and grain size, is 7000-29,000 m³/s, considered as large rivers when compared with modern river databases. In contrast, slow progradation (< 10 km/Ma) and rapid aggradation (> 80 m/Ma) characterizes Sequence Sets D and E, which is associated with smaller (9-10 m thick on average) and isolated channel belts. This stratigraphic trend is likely due to an upward decreasing sediment supply indicated by the shelf-edge progradation rate and channel size, as well as an upward increasing shelf accommodation indicated by the shelf-edge aggradation rate. 
The rapid shelf-edge progradation and large rivers in Sequence Sets B and C confirm earlier suggestions that it was the early phase of Lower Wilcox dispersal that brought the largest deep-water sediment volumes into the Gulf of Mexico. Key factors in this Lower Wilcox stratigraphic trend are likely to have been a very high initial sediment flux to the Gulf because of the high initial release of sediment from Laramide catchments to the north and northwest, possibly aided by modest eustatic sea-level fall on the Texas shelf, which is suggested by the early, flat shelf-edge trajectory, high amalgamation of channel belts, and the low overall aggradation rate of the Sequence Sets B and C.
Geens, Tom; Desplanques, Ann; Van Loock, Marnix; Bönner, Brigitte M.; Kaleta, Erhard F.; Magnino, Simone; Andersen, Arthur A.; Everett, Karin D. E.; Vanrompay, Daisy
2005-01-01
Twenty-one avian Chlamydophila psittaci isolates from different European countries were characterized using ompA restriction fragment length polymorphism, ompA sequencing, and major outer membrane protein serotyping. Results reveal the presence of a new genotype, E/B, in several European countries and stress the need for a discriminatory rapid genotyping method. PMID:15872282
Ravi, Arthi; Hassan, Syed Zahid; Vanikrishna, Ajithkumar N; Sureshan, Kana M
2017-04-04
Triflates of myo-inositol undergo facile solvolysis in DMSO and DMF yielding SN2 products substituted with O-nucleophiles; DMF showed slower kinetics. Axial O-triflate undergoes faster substitution than equatorial O-triflate. By exploiting this difference in kinetics, solvent-tuning and sequence-controlled nucleophilysis, rapid synthesis of three azido-inositols of myo-configuration from myo-inositol itself has been achieved.
SAM: String-based sequence search algorithm for mitochondrial DNA database queries
Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther
2011-01-01
The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
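The idea of converting difference-coded haplotypes into position-free nucleotide strings can be sketched as below. This is an illustration only, not the SAM software: the notation parser is simplified (substitutions such as `73G`, deletions such as `249del`, insertions such as `315.1C`), and the reference is a hypothetical toy string rather than the real mtDNA reference sequence.

```python
import re

DIFF_PATTERN = re.compile(r"(\d+)(?:\.(\d+))?([ACGT]|del)")


def apply_differences(reference, differences):
    """Rebuild a nucleotide string from a 1-based, difference-coded haplotype."""
    # One slot per reference base: [base (or "" if deleted), inserted bases]
    slots = [[base, ""] for base in reference]
    for diff in differences:
        m = DIFF_PATTERN.fullmatch(diff)
        if m is None:
            raise ValueError(f"unrecognised difference: {diff}")
        pos, ins_index, change = int(m.group(1)), m.group(2), m.group(3)
        if not 1 <= pos <= len(reference):
            raise ValueError(f"position out of range: {diff}")
        slot = slots[pos - 1]
        if ins_index is not None:
            if change == "del":
                raise ValueError(f"insertion cannot be 'del': {diff}")
            slot[1] += change  # assumes insertions are listed in order
        elif change == "del":
            slot[0] = ""
        else:
            slot[0] = change
    return "".join(base + inserted for base, inserted in slots)


# Toy reference with a C-stretch at positions 2-6.
reference = "ACCCCCTG"

# Two laboratories annotate the same extra C at different positions.
lab_a = apply_differences(reference, ["6.1C"])
lab_b = apply_differences(reference, ["2.1C"])

# The position-based annotations differ, but the strings are identical.
print(lab_a, lab_b, lab_a == lab_b)
```

Comparing the rebuilt strings rather than the annotations is what removes the dependence on alignment conventions; how SAM additionally weights phylogenetic information is not covered by this sketch.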
Liu, Shanlin; Yang, Chentao; Zhou, Chengran; Zhou, Xin
2017-12-01
Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn't show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.
Li, Jian; Batcha, Aarif Mohamed Nazeer; Grüning, Björn; Mansmann, Ulrich R.
2015-01-01
Next-generation sequencing (NGS) technologies that have advanced rapidly in the past few years possess the potential to classify diseases, decipher the molecular code of related cell processes, identify targets for decision-making on targeted therapy or prevention strategies, and predict clinical treatment response. Thus, NGS is on its way to revolutionize oncology. With the help of NGS, we can draw a finer map for the genetic basis of diseases and can improve our understanding of diagnostic and prognostic applications and therapeutic methods. Despite these advantages and its potential, NGS is facing several critical challenges, including reduction of sequencing cost, enhancement of sequencing quality, improvement of technical simplicity and reliability, and development of semiautomated and integrated analysis workflow. In order to address these challenges, we conducted a literature research and summarized a four-stage NGS workflow for providing a systematic review on NGS-based analysis, explaining the strength and weakness of diverse NGS-based software tools, and elucidating its potential connection to individualized medicine. By presenting this four-stage NGS workflow, we try to provide a minimal structural layout required for NGS data storage and reproducibility. PMID:27081306
Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.
Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y
2000-01-01
Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
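A genome-wide screen for tandem simple sequence repeats with motifs of 1 to 6 nucleotides can be approximated with a regular expression, as in the sketch below. The minimum copy number and the toy sequence are assumptions; the abstract does not state the thresholds the authors used.

```python
import re


def find_ssrs(seq, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for each tandem repeat tract in seq.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as AAAAAA is reported as a mononucleotide repeat rather than as
    a repeat of AA or AAA.
    """
    pattern = re.compile(
        rf"([ACGT]{{1,{max_motif}}}?)\1{{{min_copies - 1},}}"
    )
    tracts = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, copies))
    return tracts


# Hypothetical toy sequence with one mono- and one dinucleotide tract.
toy = "CGAAAAAATTCACACACAGG"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

Classifying each tract as coding or noncoding, as the study does, would additionally require ORF coordinates for the genome being screened.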
Yang, Cheng; Lates, Vasilica; Prieto-Simón, Beatriz; Marty, Jean-Louis; Yang, Xiurong
2013-11-15
We report a new label-free colorimetric aptasensor based on DNAzyme-aptamer conjugate for rapid and high-throughput detection of Ochratoxin A (OTA, a possible human carcinogen, group 2B) in wine. Two oligonucleotides were designed for this detection. One is N1 for biorecognition, which includes two adjacent sequences: the OTA-specific aptamer sequence and the horseradish peroxidase (HRP)-mimicking DNAzyme sequence. The other is a blocking DNA (B2), which is partially complementary to a part of the OTA aptamer and partially complementary to a part of the DNAzyme. The existence of OTA reduces the hybridization between N1 and B2. Thus, the activity of the non-hybridized DNAzyme is linearly correlated with the concentration of OTA up to 30 nM with a limit of detection of 4 nM (3σ). Meanwhile, a double liquid-liquid extraction (LLE) method is accordingly developed to purify OTA from wine. Compared with the existing HPLC-FD or immunoassay methods, the proposed strategy presents the most appropriate balance between accuracy and facility, resulting in a considerable improvement of real-time quality control, and thereby, preventing chronic poisoning caused by OTA contained red wine. Copyright © 2013 Elsevier B.V. All rights reserved.
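The "3σ" limit of detection quoted here is conventionally three times the standard deviation of the blank signal divided by the calibration slope. The sketch below shows that arithmetic on made-up numbers; the concentrations, signals, and blank readings are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + b."""
    mx, my = mean(x), mean(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return slope, my - slope * mx


def limit_of_detection(blank_signals, slope, k=3.0):
    """k-sigma limit of detection in concentration units."""
    return k * stdev(blank_signals) / abs(slope)


# Hypothetical calibration: concentration in nM against signal.
conc = [0, 10, 20, 30]
signal = [0.10, 0.30, 0.50, 0.70]
blanks = [0.09, 0.10, 0.11, 0.10]  # repeated blank measurements

slope, intercept = fit_line(conc, signal)
lod = limit_of_detection(blanks, slope)
print(f"slope={slope:.4f} per nM, LOD={lod:.2f} nM")
```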
Faster computation of exact RNA shape probabilities.
Janssen, Stefan; Giegerich, Robert
2010-03-01
Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, accumulating the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities evaluates all shapes simultaneously and comes with a computation cost which is exponential in the length of the sequence. We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing specialized folding programs for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime, while still computing exact probability values. Evaluating this approach and several substrategies, we find that only a small proportion of shapes have to be actually computed. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10-138 fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome. RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes
Sajduda, Anna; Martin, Anandi; Portaels, Françoise; Palomino, Juan Carlos
2010-02-01
We developed a scheme for rapid identification of Mycobacterium species using an automated fluorescence capillary electrophoresis instrument. A 441-bp region of the hsp65 gene was examined using PCR-restriction analysis (PRA). The assay was initially evaluated on 38 reference strains. The observed sizes of restriction fragments were consistently smaller than the real sizes for each of the species as deduced from the sequence analysis (mean variance=7bp). Nevertheless, the obtained PRA patterns were highly reproducible and resulted in correct species identifications. A blind test was then successfully performed on 64 test isolates previously characterized by conventional biochemical methods, a commercial INNO-LiPA Mycobacteria assay and/or sequence determination of the 5' end of 16S rRNA gene. A total of 14 of 64 isolates were erroneously identified by conventional methods (78% accuracy). In contrast, PRA performed very well in comparison with the LiPA (89% concordance) and especially with DNA sequencing (93.3% of concordant results). Also, PRA identified seven isolates representing five previously unreported hsp65 alleles. We conclude that hsp65 PRA based on automated capillary electrophoresis is a rapid, simple and reliable method for identification of mycobacteria. Copyright 2010 Elsevier B.V. All rights reserved.
Delaney, Nigel F.; Marx, Christopher J.
2012-01-01
Understanding evolutionary dynamics within microbial populations requires the ability to accurately follow allele frequencies through time. Here we present a rapid, cost-effective method (FREQ-Seq) that leverages Illumina next-generation sequencing for localized, quantitative allele frequency detection. Analogous to RNA-Seq, FREQ-Seq relies upon counts from the >10^5 reads generated per locus per time-point to determine allele frequencies. Loci of interest are directly amplified from a mixed population via two rounds of PCR using inexpensive, user-designed oligonucleotides and a bar-coded bridging primer system that can be regenerated in-house. The resulting bar-coded PCR products contain the adapters needed for Illumina sequencing, eliminating further library preparation. We demonstrate the utility of FREQ-Seq by determining the order and dynamics of beneficial alleles that arose as a microbial population, founded with an engineered strain of Methylobacterium, evolved to grow on methanol. Quantifying allele frequencies with minimal bias down to 1% abundance allowed effective analysis of SNPs, small in-dels and insertions of transposable elements. Our data reveal large-scale clonal interference during the early stages of adaptation and illustrate the utility of FREQ-Seq as a cost-effective tool for tracking allele frequencies in populations. PMID:23118913
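The count-based estimate at the heart of this approach can be sketched in a few lines. This is an illustrative assumption about the arithmetic only (frequency = reads carrying an allele / total reads at the locus); the function name `allele_frequencies` and the input format are hypothetical and do not describe the FREQ-Seq software itself.

```python
from collections import Counter

def allele_frequencies(allele_calls):
    """Map each allele to its fraction of the reads at one locus and time-point.

    allele_calls: iterable with one allele label per read (hypothetical input).
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in counts.items()}

# 100 reads at one locus, one of them carrying the minor allele "G"
reads = ["A"] * 99 + ["G"]
print(allele_frequencies(reads))
```

With more than 10^5 reads per locus, as quoted in the abstract, a 1% allele is represented by on the order of a thousand reads, which is what makes detection at that abundance plausible.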
Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu
2012-01-01
Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor differences. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology.
The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects.
Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu
2012-01-01
Background Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, of which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. Methodology/Principal Findings We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor difference. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. 
Conclusions/Significance Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects. PMID:22272330
Dynamic programming algorithms for biological sequence comparison.
Pearson, W R; Miller, W
1992-01-01
Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N^2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N^2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
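The time and memory trade-off described above can be made concrete with a short sketch. It computes a global similarity score with a gap penalty of the form g = rk while keeping only two rows of the dynamic programming matrix, so time grows with the product of the lengths and memory with the length of the shorter sequence. The function name `similarity_score` and the scoring values (`match`, `mismatch`, `r`) are assumptions for illustration; this is not the authors' program, and it returns the score only, not the alignment.

```python
def similarity_score(a, b, match=1, mismatch=-1, r=2):
    """Global similarity score where a gap of length k costs g = r * k.

    Only the previous and current rows are stored, so memory is linear
    in the length of the shorter sequence. Scoring values are assumed.
    """
    if len(b) > len(a):
        a, b = b, a  # keep the shorter sequence along the row
    # row 0: aligning a prefix of b against nothing costs r per residue
    prev = [-r * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-r * i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # best of: substitution, gap in b, gap in a
            curr[j] = max(diag, prev[j] - r, curr[j - 1] - r)
        prev = curr
    return prev[len(b)]

print(similarity_score("GATTACA", "GATTACA"))
print(similarity_score("ACGT", "AGT"))
```

Recovering the alignment itself in linear space takes an additional divide-and-conquer step, and the more general penalty g = q + rk needs extra state per cell; both are beyond this sketch.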
Singh, Digvijay; Mallon, John; Poddar, Anustup; Wang, Yanbo; Tippana, Ramreddy; Yang, Olivia; Bailey, Scott; Ha, Taekjip
2018-05-22
CRISPR-Cas9, which imparts adaptive immunity against foreign genomic invaders in certain prokaryotes, has been repurposed for genome-engineering applications. More recently, another RNA-guided CRISPR endonuclease called Cpf1 (also known as Cas12a) was identified and is also being repurposed. Little is known about the kinetics and mechanism of Cpf1 DNA interaction and how sequence mismatches between the DNA target and guide-RNA influence this interaction. We used single-molecule fluorescence analysis and biochemical assays to characterize DNA interrogation, cleavage, and product release by three Cpf1 orthologs. Our Cpf1 data are consistent with the DNA interrogation mechanism proposed for Cas9. They both bind any DNA in search of protospacer-adjacent motif (PAM) sequences, verify the target sequence directionally from the PAM-proximal end, and rapidly reject any targets that lack a PAM or that are poorly matched with the guide-RNA. Unlike Cas9, which requires 9 bp for stable binding and ∼16 bp for cleavage, Cpf1 requires an ∼17-bp sequence match for both stable binding and cleavage. Unlike Cas9, which does not release the DNA cleavage products, Cpf1 rapidly releases the PAM-distal cleavage product, but not the PAM-proximal product. Solution pH, reducing conditions, and 5' guanine in guide-RNA differentially affected different Cpf1 orthologs. Our findings have important implications on Cpf1-based genome engineering and manipulation applications.
Li, Jinjian; Dridi, Mahjoub; El-Moudni, Abdellah
2016-01-01
The problem of reducing traffic delays and decreasing fuel consumption simultaneously in a network of intersections without traffic lights is solved by a cooperative traffic control algorithm, where the cooperation is executed based on the connection of Vehicle-to-Infrastructure (V2I). This resolution of the problem contains two main steps. The first step concerns the itinerary of which intersections are chosen by vehicles to arrive at their destination from their starting point. Based on the principle of minimal travel distance, each vehicle chooses its itinerary dynamically based on the traffic loads in the adjacent intersections. The second step is related to the following proposed cooperative procedures to allow vehicles to pass through each intersection rapidly and economically: on one hand, according to the real-time information sent by vehicles via V2I in the edge of the communication zone, each intersection applies Dynamic Programming (DP) to cooperatively optimize the vehicle passing sequence with minimal traffic delays so that the vehicles may rapidly pass the intersection under the relevant safety constraints; on the other hand, after receiving this sequence, each vehicle finds the optimal speed profiles with the minimal fuel consumption by an exhaustive search. The simulation results reveal that the proposed algorithm can significantly reduce both travel delays and fuel consumption compared with other papers under different traffic volumes. PMID:27999333
Saleh, Mona; El-Matbouli, Mansour
2015-06-01
Cyprinid herpesvirus-3 (CyHV-3) is a highly infectious pathogen that causes fatal disease in common and koi carp Cyprinus carpio L. CyHV-3 detection is usually based on virus propagation or amplification of the viral DNA using the PCR or LAMP techniques. However, due to the limited susceptibility of cells used for propagation, it is not always possible to successfully isolate CyHV-3 even from tissue samples that have high virus titres. All previously described detection methods including PCR-based assays are time consuming, laborious and require specialized equipment. To overcome these limitations, gold nanoparticles (AuNPs) have been explored for direct and sensitive detection of DNA. In this study, a label-free colorimetric nanodiagnostic method for direct detection of unamplified CyHV-3 DNA using gold nanoparticles is introduced. Under appropriate conditions, DNA probes hybridize with their complementary target sequences in the sample DNA, which results in aggregation of the gold nanoparticles and a concomitant colour change from red to blue, whereas test samples with non complementary DNA sequences remain red. In this study, gold nanoparticles were used to develop and evaluate a specific and sensitive hybridization assay for direct and rapid detection of the highly infectious pathogen termed Cyprinid herpesvirus-3. Copyright © 2015 Elsevier B.V. All rights reserved.
Disease Management in the Genomics Era-Summaries of Focus Issue Papers.
Klosterman, S J; Rollins, J R; Sudarshana, M R; Vinatzer, B A
2016-10-01
The genomics revolution has contributed enormously to research and disease management applications in plant pathology. This development has rapidly increased our understanding of the molecular mechanisms underpinning pathogenesis and resistance, contributed novel markers for rapid pathogen detection and diagnosis, and offered further insights into the genetics of pathogen populations on a larger scale. The availability of whole genome resources coupled with next-generation sequencing (NGS) technologies has helped fuel genomics-based approaches to improve disease resistance in crops. NGS technologies have accelerated the pace at which whole plant and pathogen genomes have become available, and made possible the metagenomic analysis of plant-associated microbial communities. Furthermore, NGS technologies can now be applied routinely and cost effectively to rapidly generate plant and/or pathogen genome or transcriptome marker sequences associated with virulence phenotypes in the pathogen or resistance phenotypes in the plant, potentially leading to improvements in plant disease management. In some systems, investments in plant and pathogen genomics have led to immediate, tangible benefits. This focus issue covers some of the systems. The articles in this focus issue range from overall perspective articles to research articles describing specific genomics applications for detection and control of diseases caused by nematode, viral, bacterial, fungal, and oomycete pathogens. The following are representative short summaries of the articles that appear in this Focus Issue.
Hsieh, Ying-Hsin; Wang, Yun F; Moura, Hercules; Miranda, Nancy; Simpson, Steven; Gowrishankar, Ramnath; Barr, John; Kerdahi, Khalil; Sulaiman, Irshad M
2018-05-01
Campylobacteriosis is an infectious gastrointestinal disease caused by Campylobacter spp. In most cases, it is either underdiagnosed or underreported due to poor diagnostics and limited databases. Several DNA-based molecular diagnostic techniques, including 16S ribosomal RNA (rRNA) sequence typing, have been widely used in the species identification of Campylobacter. Nevertheless, these assays are time-consuming and require a high quality of bacterial DNA. Matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) MS is an emerging diagnostic technology that can provide the rapid identification of microorganisms by using their intact cells without extraction or purification. In this study, we analyzed 24 American Type Culture Collection reference isolates of 16 Campylobacter spp. and five unknown clinical bacterial isolates for rapid identification utilizing two commercially available MALDI-TOF MS platforms, namely the bioMérieux VITEK® MS and Bruker Biotyper systems. In addition, 16S rRNA sequencing was performed to confirm the species-level identification of the unknown clinical isolates. Both MALDI-TOF MS systems identified the isolates of C. jejuni, C. coli, C. lari, and C. fetus. The results of this study suggest that the MALDI-TOF MS technique can be used in the identification of Campylobacter spp. of public health importance.
Lang, Jillian M.; Langlois, Paul; Nguyen, Marian Hanna R.; Triplett, Lindsay R.; Purdie, Laura; Holton, Timothy A.; Djikeng, Appolinaire; Vera Cruz, Casiana M.; Verdier, Valérie
2014-01-01
Molecular diagnostics for crop diseases can enhance food security by enabling the rapid identification of threatening pathogens and providing critical information for the deployment of disease management strategies. Loop-mediated isothermal amplification (LAMP) is a PCR-based tool that allows the rapid, highly specific amplification of target DNA sequences at a single temperature and is thus ideal for field-level diagnosis of plant diseases. We developed primers highly specific for two globally important rice pathogens, Xanthomonas oryzae pv. oryzae, the causal agent of bacterial blight (BB) disease, and X. oryzae pv. oryzicola, the causal agent of bacterial leaf streak disease (BLS), for use in reliable, sensitive LAMP assays. In addition to pathovar distinction, two assays that differentiate X. oryzae pv. oryzae by African or Asian lineage were developed. Using these LAMP primer sets, the presence of each pathogen was detected from DNA and bacterial cells, as well as leaf and seed samples. Thresholds of detection for all assays were consistently 10^4 to 10^5 CFU ml^-1, while genomic DNA thresholds were between 1 pg and 10 fg. Use of the unique sequences combined with the LAMP assay provides a sensitive, accurate, rapid, simple, and inexpensive protocol to detect both BB and BLS pathogens. PMID:24837384
Whole genome sequencing in clinical and public health microbiology
Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.
2015-01-01
Summary: Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631
Whole genome sequencing in clinical and public health microbiology.
Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P
2015-04-01
Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.
Evolution of the arginase fold and functional diversity
Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.
2009-01-01
The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.
Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun
2017-10-01
Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
Habits, action sequences, and reinforcement learning
Dezfouli, Amir; Balleine, Bernard W.
2012-01-01
It is now widely accepted that instrumental actions can be either goal-directed or habitual; whereas the former are rapidly acquired and regulated by their outcome, the latter are reflexive, elicited by antecedent stimuli rather than their consequences. Model-based reinforcement learning (RL) provides an elegant description of goal-directed action. Through exposure to states, actions and rewards, the agent rapidly constructs a model of the world and can choose an appropriate action based on quite abstract changes in environmental and evaluative demands. This model is powerful but has a problem explaining the development of habitual actions. To account for habits, theorists have argued that another action controller is required, called model-free RL, that does not form a model of the world but rather caches action values within states allowing a state to select an action based on its reward history rather than its consequences. Nevertheless, there are persistent problems with important predictions from the model; most notably the failure of model-free RL correctly to predict the insensitivity of habitual actions to changes in the action-reward contingency. Here, we suggest that introducing model-free RL in instrumental conditioning is unnecessary and demonstrate that reconceptualizing habits as action sequences allows model-based RL to be applied to both goal-directed and habitual actions in a manner consistent with what real animals do. This approach has significant implications for the way habits are currently investigated and generates new experimental predictions. PMID:22487034
Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young
2017-08-15
Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped and classified to the rice genome and plasmid sequence. Reads that map to both are then classified to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
On site DNA barcoding by nanopore sequencing
Menegon, Michele; Cantaloni, Chiara; Rodriguez-Prieto, Ana; Centomo, Cesare; Abdelfattah, Ahmed; Rossato, Marzia; Bernardi, Massimo; Xumerle, Luciano; Loader, Simon; Delledonne, Massimo
2017-01-01
Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet’s biological heritage. The use of genetic markers i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities. PMID:28977016
Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L
2007-01-01
Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution.
The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168
Tan, Wui Siew; Lewis, Christina L; Horelik, Nicholas E; Pregibon, Daniel C; Doyle, Patrick S; Yi, Hyunmin
2008-11-04
We demonstrate hierarchical assembly of tobacco mosaic virus (TMV)-based nanotemplates with hydrogel-based encoded microparticles via nucleic acid hybridization. TMV nanotemplates possess a highly defined structure and a genetically engineered high density thiol functionality. The encoded microparticles are produced in a high throughput microfluidic device via stop-flow lithography (SFL) and consist of spatially discrete regions containing encoded identity information, an internal control, and capture DNAs. For the hybridization-based assembly, partially disassembled TMVs were programmed with linker DNAs that contain sequences complementary to both the virus 5' end and a selected capture DNA. Fluorescence microscopy, atomic force microscopy (AFM), and confocal microscopy results clearly indicate facile assembly of TMV nanotemplates onto microparticles with high spatial and sequence selectivity. We anticipate that our hybridization-based assembly strategy could be employed to create multifunctional viral-synthetic hybrid materials in a rapid and high-throughput manner. Additionally, we believe that these viral-synthetic hybrid microparticles may find broad applications in high capacity, multiplexed target sensing.
USDA-ARS's Scientific Manuscript database
The complete genome sequence (6,423 nt) of an emerging Cucumber green mottle mosaic virus (CGMMV) isolate on cucumber in North America was determined through deep sequencing of sRNA and rapid amplification of cDNA ends. It shares 99% nucleotide sequence identity to the Asian genotype, but only 90% t...
The promise and challenge of high-throughput sequencing of the antibody repertoire
Georgiou, George; Ippolito, Gregory C; Beausang, John; Busse, Christian E; Wardemann, Hedda; Quake, Stephen R
2014-01-01
Efforts to determine the antibody repertoire encoded by B cells in the blood or lymphoid organs using high-throughput DNA sequencing technologies have been advancing at an extremely rapid pace and are transforming our understanding of humoral immune responses. Information gained from high-throughput DNA sequencing of immunoglobulin genes (Ig-seq) can be applied to detect B-cell malignancies with high sensitivity, to discover antibodies specific for antigens of interest, to guide vaccine development and to understand autoimmunity. Rapid progress in the development of experimental protocols and informatics analysis tools is helping to reduce sequencing artifacts, to achieve more precise quantification of clonal diversity and to extract the most pertinent biological information. That said, broader application of Ig-seq, especially in clinical settings, will require the development of a standardized experimental design framework that will enable the sharing and meta-analysis of sequencing data generated by different laboratories. PMID:24441474
Singh, Aditya; Bhatia, Prateek
2016-12-01
Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, a region of a sequence is sequenced with both forward and reverse primers, which yields 2 sequences that need to be aligned and merged into a consensus before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At its full capabilities, the Automated Mutation Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
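Two of the steps listed above, reverse-complementing a read and translating it in all frames, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical and are not taken from ASAP, the standard genetic code is assumed, and any codon containing an ambiguous base is reported as "X".

```python
# Illustrative helpers only; names are hypothetical, not ASAP's own code.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate three offsets on the forward strand and three on the reverse."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

For example, `six_frame_translation("ATGGCCTAA")["+1"]` gives `"MA*"`, where `*` marks a stop codon.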
Rapid identification of causative species in patients with Old World leishmaniasis.
Minodier, P; Piarroux, R; Gambarelli, F; Joblet, C; Dumon, H
1997-01-01
Conventional methods for the identification of species of Leishmania parasite causing infections have limitations. By using a DNA-based alternative, the present study tries to develop a new tool for this purpose. Thirty-three patients living in Marseilles (in the south of France) were suffering from visceral or cutaneous leishmaniasis. DNA of the parasite in clinical samples (bone marrow, peripheral blood, or skin) from these patients was amplified by PCR and directly sequenced. The sequences observed were compared to those of 30 strains of the genus causing Old World leishmaniasis collected in Europe, Africa, or Asia. In the analysis of the sequences of the strains, two different sequence patterns for Leishmania infantum, one sequence for Leishmania donovani, one sequence for Leishmania major, two sequences for Leishmania tropica, and one sequence for Leishmania aethiopica were obtained. Four sequences were observed among the strains from the patients: one was similar to the sequence for the L. major strains, two were identical to the sequences for the L. infantum strains, and the last sequence was not observed within the strains but had a high degree of homology with the sequences of the L. infantum and L. donovani strains. The L. infantum strains from all immunocompetent patients had the same sequence. The L. infantum strains from immunodeficient patients suffering from visceral leishmaniasis had three different sequences. This fact might signify that some variants of L. infantum acquire pathogenicity exclusively in immunocompromised patients. To dispense with the sequencing step, a restriction assay with HaeIII was used. Some restriction patterns might support genetic exchanges in members of the genus Leishmania. PMID:9316906
Rapid method to detect duplex formation in sequencing by hybridization methods
Mirzabekov, A.D.; Timofeev, E.N.; Florentiev, V.L.; Kirillov, E.V.
1999-01-19
A method for determining the existence of duplexes of oligonucleotide complementary molecules is provided. A plurality of immobilized oligonucleotide molecules, each of a specific length and each having a specific base sequence, is contacted with complementary, single stranded oligonucleotide molecules to form a duplex. Each duplex facilitates intercalation of a fluorescent dye between the base planes of the duplex. The invention also provides for a method for constructing oligonucleotide matrices comprising confining light sensitive fluid to a surface and exposing the light-sensitive fluid to a light pattern. This causes the fluid exposed to the light to coalesce into discrete units and adhere to the surface. This places each of the units in contact with a set of different oligonucleotide molecules so as to allow the molecules to disperse into the units. 13 figs.
A real-time PCR diagnostic method for detection of Naegleria fowleri.
Madarová, Lucia; Trnková, Katarína; Feiková, Sona; Klement, Cyril; Obernauerová, Margita
2010-09-01
Naegleria fowleri is a free-living amoeba that can cause primary amoebic meningoencephalitis (PAM). While traditional methods for diagnosing PAM still rely on culture, more current laboratory diagnoses exist based on conventional PCR methods; however, only a few real-time PCR processes have been described as yet. Here, we describe a real-time PCR-based diagnostic method using hybridization fluorescent labelled probes, with a LightCycler instrument and accompanying software (Roche), targeting the Naegleria fowleri Mp2Cl5 gene sequence. Using this method, no cross reactivity with other tested epidemiologically relevant prokaryotic and eukaryotic organisms was found. The reaction detection limit was 1 copy of the Mp2Cl5 DNA sequence. This assay could become useful in the rapid laboratory diagnostic assessment of the presence or absence of Naegleria fowleri. Copyright 2009 Elsevier Inc. All rights reserved.
Rapid method to detect duplex formation in sequencing by hybridization methods
Mirzabekov, Andrei Darievich; Timofeev, Edward Nikolaevich; Florentiev, Vladimer Leonidovich; Kirillov, Eugene Vladislavovich
1999-01-01
A method for determining the existence of duplexes of oligonucleotide complementary molecules is provided whereby a plurality of immobilized oligonucleotide molecules, each of a specific length and each having a specific base sequence, is contacted with complementary, single stranded oligonucleotide molecules to form a duplex so as to facilitate intercalation of a fluorescent dye between the base planes of the duplex. The invention also provides for a method for constructing oligonucleotide matrices comprising confining light sensitive fluid to a surface, exposing said light-sensitive fluid to a light pattern so as to cause the fluid exposed to the light to coalesce into discrete units and adhere to the surface; and contacting each of the units with a set of different oligonucleotide molecules so as to allow the molecules to disperse into the units.
Detection of damaged DNA bases by DNA glycosylase enzymes.
Friedman, Joshua I; Stivers, James T
2010-06-22
A fundamental and shared process in all forms of life is the use of DNA glycosylase enzymes to excise rare damaged bases from genomic DNA. Without such enzymes, the highly ordered primary sequences of genes would rapidly deteriorate. Recent structural and biophysical studies are beginning to reveal a fascinating multistep mechanism for damaged base detection that begins with short-range sliding of the glycosylase along the DNA chain in a distinct conformation we call the search complex (SC). Sliding is frequently punctuated by the formation of a transient "interrogation" complex (IC) where the enzyme extrahelically inspects both normal and damaged bases in an exosite pocket that is distant from the active site. When normal bases are presented in the exosite, the IC rapidly collapses back to the SC, while a damaged base will efficiently partition forward into the active site to form the catalytically competent excision complex (EC). Here we review the unique problems associated with enzymatic detection of rare damaged DNA bases in the genome and emphasize how each complex must have specific dynamic properties that are tuned to optimize the rate and efficiency of damage site location.
Monitoring Error Rates In Illumina Sequencing.
Manley, Leigh J; Ma, Duanduan; Levine, Stuart S
2016-12-01
Guaranteeing high-quality next-generation sequencing data in a rapidly changing environment is an ongoing challenge. The introduction of the Illumina NextSeq 500 and the deprecation of specific metrics from Illumina's Sequencing Analysis Viewer (SAV; Illumina, San Diego, CA, USA) have made it more difficult to determine directly the baseline error rate of sequencing runs. To improve our ability to measure base quality, we have created an open-source tool to construct the Percent Perfect Reads (PPR) plot, previously provided by the Illumina sequencers. The PPR program is compatible with HiSeq 2000/2500, MiSeq, and NextSeq 500 instruments and provides an alternative to Illumina's quality value (Q) scores for determining run quality. Whereas Q scores are representative of run quality, they are often overestimated and are sourced from different look-up tables for each platform. The PPR's unique capabilities as a cross-instrument comparison device, as a troubleshooting tool, and as a tool for monitoring instrument performance can provide an increase in clarity over SAV metrics that is often crucial for maintaining instrument health. These capabilities are highlighted.
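The abstract does not define how the PPR curve is computed. As a rough sketch, assuming that "perfect through cycle c" means a read shows no mismatch against its aligned reference in cycles 1 to c, the curve could be derived as follows. The function name and input layout are hypothetical and do not describe the published tool.

```python
def percent_perfect_reads(reads, references):
    """Percent of reads with zero mismatches up to and including each cycle.

    Assumption: reads[i] and references[i] are already aligned, gap-free
    strings, so position k in both corresponds to sequencing cycle k + 1.
    """
    n_cycles = min(len(r) for r in reads)
    still_perfect = [True] * len(reads)
    ppr = []
    for cycle in range(n_cycles):
        for i, (read, ref) in enumerate(zip(reads, references)):
            if read[cycle] != ref[cycle]:
                still_perfect[i] = False  # one error disqualifies the read
        ppr.append(100.0 * sum(still_perfect) / len(reads))
    return ppr
```

The resulting list never increases from one cycle to the next, which is what makes the curve easy to compare between runs and instruments.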
Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.
Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George
2017-06-01
- Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. - To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. - Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. - Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.
A Concise Atlas of Thyroid Cancer Next-Generation Sequencing Panel ThyroSeq v.2
Alsina, Jorge; Alsina, Raul; Gulec, Seza
2017-01-01
Next-generation sequencing technology allows high-output genomic analysis. An innovative assay in thyroid cancer, ThyroSeq®, was developed for targeted mutation detection by next-generation sequencing technology in fine needle aspiration and tissue samples. The ThyroSeq v.2 next-generation sequencing panel offers simultaneous sequencing and detection in >1000 hotspots of 14 thyroid cancer-related genes and for 42 types of gene fusions known to occur in thyroid cancer. ThyroSeq is being increasingly used to further narrow the indeterminate category defined by cytology for thyroid nodules. From a surgical perspective, genomic profiling also provides prognostic and predictive information and closely relates to determination of surgical strategy. Both the genomic analysis technology and the informatics for the cancer genome database are rapidly developing. In this paper, we have gathered existing information on the thyroid cancer-related genes involved in the initiation and progression of thyroid cancer. Our goal is to assemble a glossary for the current ThyroSeq genomic panel that can help elucidate the role genomics plays in thyroid cancer oncogenesis. PMID:28117295
Microsatellite DNA capture from enriched libraries.
Gonzalez, Elena G; Zardoya, Rafael
2013-01-01
Microsatellites are DNA sequences of tandem repeats of one to six nucleotides, which are highly polymorphic, and thus the molecular markers of choice in many kinship, population genetic, and conservation studies. There have been significant technical improvements since the early methods for microsatellite isolation were developed, and today the most common procedures take advantage of the hybrid capture methods of enriched-targeted microsatellite DNA. Furthermore, recent advances in sequencing technologies (i.e., next-generation sequencing, NGS) have fostered the mining of microsatellite markers in non-model organisms, affording a cost-effective way of obtaining a large amount of sequence data potentially useful for loci characterization. The rapid improvements of NGS platforms together with the increase in available microsatellite information open new avenues to the understanding of the evolutionary forces that shape genetic structuring in wild populations. Here, we provide detailed methodological procedures for microsatellite isolation based on the screening of GT microsatellite-enriched libraries, either by cloning and Sanger sequencing of positive clones or by direct NGS. Guides for designing new species-specific primers and basic genotyping are also given.
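The definition above (tandem repeats of a one- to six-nucleotide motif) maps directly onto a regular expression with a backreference. The sketch below is a minimal illustration, not the screening procedure of the chapter. The function name and the default of at least four repeat copies are assumptions chosen for the example.

```python
import re


def find_microsatellites(seq, min_repeats=4, max_motif=6):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    The lazy quantifier makes the regex prefer the shortest motif, so
    'GTGTGTGT' is reported as GT x 4 rather than GTGT x 2.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits
```

Calling `find_microsatellites("AAGTGTGTGTGTCC")` reports a single GT repeat with five copies starting at position 2 (0-based). Imperfect repeats with interruptions are not detected by this simple pattern.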
Wang, Jian-Yan; Zhen, Yu; Wang, Guo-shan; Mi, Tie-Zhu; Yu, Zhi-gang
2013-03-01
Genomic DNA was extracted from the moon jellyfish Aurelia sp., which is commonly found in our coastal sea areas; partial sequences of mt-16S rDNA (650 bp) and mt-COI (709 bp) were PCR-amplified, and, after purification, cloning, and sequencing, the sequences obtained were analyzed with BLASTn. The regions differing most from those of the other jellyfish were chosen, and eight specific primers were designed for the mt-16S rDNA and mt-COI of Aurelia sp., respectively. The specificity test indicated that the primer AS3 for the mt-16S rDNA and the primer AC3 for the mt-COI were excellent at rapidly detecting the target jellyfish among Rhopilema esculentum, Nemopilema nomurai, Cyanea nozakii, Acromitus sp., and Aurelia sp. Techniques for the molecular identification and detection of moon jellyfish were thus preliminarily established; they overcome the limitations of classical morphological identification of Aurelia sp. and can find Aurelia sp. in samples more quickly and accurately.
Sequences show rapid motor transfer and spatial translation in the oculomotor system.
Stainer, Matthew J; Carpenter, R H S; Brotchie, Peter; Anderson, Andrew J
2016-07-01
Every day we perform learnt sequences of actions that seem to happen almost without awareness. It has been argued that for learning such sequences parallel learning networks exist - one using spatial coordinates and one using motor coordinates - with sequence acquisition involving a progressive shift from the former to the latter as a sequence is rehearsed. When sequences are interrupted by an out-of-sequence target, there is a delay in the response to the target, and so here we transiently interrupt oculomotor sequences to probe the influence of oculomotor rehearsal and spatial coordinates in sequence acquisition. For our main experiments, we used a repeating sequence of eight targets in length that was first learnt either using saccadic eye movements (left/right), manual responses (left/right or up/down) or as a sequence of colour (blue/red) requiring no motor response. The sequence was immediately repeated for saccadic eye movements, during which the influence of an out-of-sequence target (an interruption) was assessed. When a sequence is learnt beforehand in an abstract way (for example, as a sequence of colours or of orthogonally mapped manual responses), interruptions are immediately disruptive to latency, suggesting neither motor rehearsal nor specific spatial coordinates are essential for encoding sequences of actions and that sequences - no matter how they are encoded - can be rapidly translated into oculomotor coordinates. The magnitude of a disruption does, however, correspond to how well a sequence is learnt: introducing an interruption to an extended sequence before it was reliably learnt reduces the magnitude of the latency disruption. Copyright © 2016 Elsevier Ltd. All rights reserved.
Cellular automata and its applications in protein bioinformatics.
Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen
2011-09-01
With the explosion of protein sequences generated in the postgenomic era, it is highly desirable to develop high-throughput tools for rapidly and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. The knowledge thus obtained can help us utilize these newly found protein sequences in a timely manner for both basic research and drug discovery. Many bioinformatics tools have been developed by means of machine learning methods. This review is focused on the applications of a new kind of science (cellular automata) in protein bioinformatics. A cellular automaton (CA) is an open, flexible and discrete dynamic model that holds enormous potential in modeling complex systems, in spite of the simplicity of the model itself. Researchers, scientists and practitioners from different fields have utilized cellular automata for visualizing protein sequences, investigating their evolution processes, and predicting their various attributes. Owing to its impressive power, intuitiveness and relative simplicity, the CA approach has great potential for use as a tool for bioinformatics.
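To make the notion of a discrete dynamic model concrete, the sketch below implements one update step of a one-dimensional, two-state ("elementary") cellular automaton with wrap-around boundaries. It is a generic textbook construction, not the protein-specific encoding used in the reviewed work. The function name and the default choice of rule 90 are assumptions for illustration.

```python
def ca_step(cells, rule=90):
    """Apply one step of an elementary cellular automaton.

    Each new cell depends on its left neighbour, itself, and its right
    neighbour. The three bits form an index 0..7 into the binary expansion
    of `rule` (Wolfram numbering). The row is treated as circular.
    """
    n = len(cells)
    return [
        (rule >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]
```

Iterating `ca_step` and stacking the rows yields the two-dimensional images that CA-based approaches typically use to visualize a sequence once it has been encoded as bits.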
Towards writing the encyclopaedia of life: an introduction to DNA barcoding
Savolainen, Vincent; Cowan, Robyn S; Vogler, Alfried P; Roderick, George K; Lane, Richard
2005-01-01
An international consortium of major natural history museums, herbaria and other organizations has launched an ambitious project, the ‘Barcode of Life Initiative’, to promote a process enabling the rapid and inexpensive identification of the estimated 10 million species on Earth. DNA barcoding is a diagnostic technique in which short DNA sequence(s) can be used for species identification. The first international scientific conference on Barcoding of Life was held at the Natural History Museum in London in February 2005, and here we review the scientific challenges discussed during this conference and in previous publications. Although still controversial, the scientific benefits of DNA barcoding include: (i) enabling species identification, including any life stage or fragment, (ii) facilitating species discoveries based on cluster analyses of gene sequences (e.g. cox1=CO1, in animals), (iii) promoting development of handheld DNA sequencing technology that can be applied in the field for biodiversity inventories and (iv) providing insight into the diversity of life. PMID:16214739
CNV-seq, a new method to detect copy number variation using high-throughput sequencing.
Xie, Chao; Tammi, Martti T
2009-03-06
DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. Here, we describe a method to detect copy number variation using shotgun sequencing, CNV-seq. The method is based on a robust statistical model that describes the complete analysis procedure and allows the computation of essential confidence values for detection of CNV. Our results show that the number of reads, not the length of the reads, is the key factor determining the resolution of detection. This favors the next-generation sequencing methods that rapidly produce large amounts of short reads. Simulation of various sequencing methods with coverage between 0.1x and 8x shows overall specificity between 91.7 - 99.9%, and sensitivity between 72.2 - 96.5%. We also show the results for assessment of CNV between two individual human genomes.
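The reason read count drives resolution can be seen in a simplified sketch of read-depth comparison: mapped read positions from two genomes are counted per window and compared as a log2 ratio. This is not the CNV-seq statistical model itself. Non-overlapping windows, the normalization by total read count, and all names below are assumptions made for illustration.

```python
import math


def window_log2_ratios(test_pos, ref_pos, window, genome_length):
    """Log2 ratio of test vs reference read counts in fixed windows.

    test_pos / ref_pos: 0-based mapped start positions (< genome_length).
    Windows with no reads in either sample yield None.
    """
    n_windows = (genome_length + window - 1) // window
    test_counts = [0] * n_windows
    ref_counts = [0] * n_windows
    for p in test_pos:
        test_counts[p // window] += 1
    for p in ref_pos:
        ref_counts[p // window] += 1
    # Correct for different sequencing depth between the two samples.
    scale = len(ref_pos) / len(test_pos)
    ratios = []
    for t, r in zip(test_counts, ref_counts):
        if t == 0 or r == 0:
            ratios.append(None)
        else:
            ratios.append(math.log2(t / r * scale))
    return ratios
```

With more reads, each window can be made smaller while still holding enough counts for a stable ratio, which is why read number rather than read length limits the detectable CNV size.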
Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee
2015-09-21
Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationship between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validate JigsawSeq on small sub-pools and observe high precision and recall at various experimental settings. With extensive simulations, we then apply JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
Design and preparation of beta-sheet forming repetitive and block-copolymerized polypeptides.
Higashiya, Seiichiro; Topilina, Natalya I; Ngo, Silvana C; Zagorevskii, Dmitri; Welch, John T
2007-05-01
The design and rapid construction of libraries of genes coding beta-sheet forming repetitive and block-copolymerized polypeptides bearing various C- and N-terminal sequences are described. The design was based on the assembly of DNA cassettes coding for the (GA)3GX amino acid sequence where the (GAGAGA) sequences would constitute the beta-strand units of a larger beta-sheet assembly. The edges of this beta-sheet would be functionalized by the turn-inducing amino acids (GX). The polypeptides were expressed in Escherichia coli using conventional vectors and were purified by Ni-nitrilotriacetic acid (NTA) chromatography. The correlation of polymer structure with molecular weight was investigated by gel electrophoresis and mass spectrometry. The monomer sequences and post-translational chemical modifications were found to influence the mobility of the polypeptides over the full range of polypeptide molecular weights while the electrophoretic mobility of lower molecular weight polypeptides was more susceptible to C- and N-termini polypeptide modifications.
Verstappen, Koen M; Huijbregts, Loes; Spaninks, Mirlin; Wagenaar, Jaap A; Fluit, Ad C; Duim, Birgitta
2017-01-01
Staphylococcus pseudintermedius is an opportunistic pathogen in dogs and cats and occasionally causes infections in humans. S. pseudintermedius is often resistant to multiple classes of antimicrobials. It requires reliable detection so that it is not misidentified as S. aureus. Phenotypic and currently-used molecular-based diagnostic assays lack specificity or are labour-intensive using multiplex PCR or nucleic acid sequencing. The aim of this study was to identify a specific target for real-time PCR by comparing whole genome sequences of S. pseudintermedius and non-pseudintermedius. Genome sequences were downloaded from public repositories and supplemented by isolates that were sequenced in this study. A Perl-script was written that analysed 300-nt fragments from a reference genome sequence of S. pseudintermedius and checked if this sequence was present in other S. pseudintermedius genomes (n = 74) and non-pseudintermedius genomes (n = 138). Six sequences specific for S. pseudintermedius were identified (sequence length between 300-500 nt). One sequence, which was located in the spsJ gene, was used to develop primers and a probe. The real-time PCR showed 100% specificity when testing for S. pseudintermedius isolates (n = 54), and eight other staphylococcal species (n = 43). In conclusion, a novel approach by comparing whole genome sequences identified a sequence that is specific for S. pseudintermedius and provided a real-time PCR target for rapid and reliable detection of S. pseudintermedius.
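The fragment screen described above can be sketched as follows. The original was a Perl script whose matching criteria are not given in the abstract, so this Python version is only an approximation: it assumes exact substring matching on one strand, non-overlapping fragments by default, and genomes held in memory as plain strings. The function name is hypothetical.

```python
def specific_fragments(reference, targets, non_targets, size=300, step=300):
    """Return (start, fragment) pairs from `reference` that occur in every
    target genome and in no non-target genome.

    Assumption: presence means an exact substring match; a real screen
    would likely tolerate mismatches and also check the reverse strand.
    """
    hits = []
    for start in range(0, len(reference) - size + 1, step):
        frag = reference[start:start + size]
        in_all_targets = all(frag in genome for genome in targets)
        in_any_other = any(frag in genome for genome in non_targets)
        if in_all_targets and not in_any_other:
            hits.append((start, frag))
    return hits
```

Adjacent hits could then be merged into longer candidate regions, which would be consistent with the 300-500 nt sequence lengths reported in the abstract.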
Rapid motif compliance scoring with match weight sets.
Venezia, D; O'Hara, P J
1993-02-01
Most current implementations of motif matching in biological sequences have sacrificed the generality of weight matrix scoring for shorter runtimes. The program MOTIF incorporates a weight matrix and a rapid, backtracking tree-search algorithm to score motif compliance with greatly enhanced performance while placing no constraints on the motif. In addition, any positions within a motif can be marked as 'inviolate', thereby requiring an exact match. MOTIF allows a choice of regular expression formats and can use both motif and sequence libraries as either targets or queries. Nucleic acid sequences can optionally be translated by MOTIF in any frame(s) and used against peptide motifs.
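Weight matrix scoring with 'inviolate' positions can be illustrated with a plain sliding-window scan. This sketch deliberately swaps the backtracking tree search of MOTIF for a simple linear scan, and the data layout (one dictionary of residue weights per motif position), the function name, and the threshold parameter are all assumptions for the example.

```python
def score_motif(seq, weights, inviolate=None, threshold=0.0):
    """Score every window of `seq` against a position weight matrix.

    weights:   list of dicts, one per motif position, residue -> weight
               (residues missing from a dict score 0.0).
    inviolate: dict position -> residue that must match exactly.
    Returns (start, window, score) for windows scoring >= threshold.
    """
    inviolate = inviolate or {}
    width = len(weights)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Reject immediately if any required position does not match.
        if any(window[pos] != res for pos, res in inviolate.items()):
            continue
        score = sum(weights[i].get(ch, 0.0) for i, ch in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits
```

Checking the inviolate positions before summing weights is a small instance of the pruning idea: windows that cannot comply are discarded without being fully scored.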
The INNs and outs of antibody nonproprietary names
Jones, Tim D.; Carter, Paul J.; Plückthun, Andreas; Vásquez, Max; Holgate, Robert G.E.; Hötzel, Isidro; Popplewell, Andrew G.; Parren, Paul W.H.I.; Enzelberger, Markus; Rademaker, Hendrik J.; Clark, Michael R.; Lowe, David C.; Dahiyat, Bassil I.; Smith, Victoria; Lambert, John M.; Wu, Herren; Reilly, Mary; Haurum, John S.; Dübel, Stefan; Huston, James S.; Schirrmann, Thomas; Janssen, Richard A.J.; Steegmaier, Martin; Gross, Jane A.; Bradbury, Andrew R.M.; Burton, Dennis R.; Dimitrov, Dimiter S.; Chester, Kerry A.; Glennie, Martin J.; Davies, Julian; Walker, Adam; Martin, Steve; McCafferty, John; Baker, Matthew P.
2016-01-01
An important step in drug development is the assignment of an International Nonproprietary Name (INN) by the World Health Organization (WHO) that provides healthcare professionals with a unique and universally available designated name to identify each pharmaceutical substance. Monoclonal antibody INNs comprise a –mab suffix preceded by a substem indicating the antibody type, e.g., chimeric (-xi-), humanized (-zu-), or human (-u-). The WHO publishes INN definitions that specify how new monoclonal antibody therapeutics are categorized and adapts the definitions to new technologies. However, rapid progress in antibody technologies has blurred the boundaries between existing antibody categories and created a burgeoning array of new antibody formats. Thus, revising the INN system for antibodies is akin to aiming for a rapidly moving target. The WHO recently revised INN definitions for antibodies now to be based on amino acid sequence identity. These new definitions, however, are critically flawed as they are ambiguous and go against decades of scientific literature. A key concern is the imposition of an arbitrary threshold for identity against human germline antibody variable region sequences. This leads to inconsistent classification of somatically mutated human antibodies, humanized antibodies as well as antibodies derived from semi-synthetic/synthetic libraries and transgenic animals. Such sequence-based classification implies clear functional distinction between categories (e.g., immunogenicity). However, there is no scientific evidence to support this. Dialog between the WHO INN Expert Group and key stakeholders is needed to develop a new INN system for antibodies and to avoid confusion and miscommunication between researchers and clinicians prescribing antibodies. PMID:26716992
A Low-Cost PC-Based Image Workstation for Dynamic Interactive Display of Three-Dimensional Anatomy
NASA Astrophysics Data System (ADS)
Barrett, William A.; Raya, Sai P.; Udupa, Jayaram K.
1989-05-01
A system for interactive definition, automated extraction, and dynamic interactive display of three-dimensional anatomy has been developed and implemented on a low-cost PC-based image workstation. An iconic display is used for staging predefined image sequences through specified increments of tilt and rotation over a solid viewing angle. Use of a fast processor facilitates rapid extraction and rendering of the anatomy into predefined image views. These views are formatted into a display matrix in a large image memory for rapid interactive selection and display of arbitrary spatially adjacent images within the viewing angle, thereby providing motion parallax depth cueing for efficient and accurate perception of true three-dimensional shape, size, structure, and spatial interrelationships of the imaged anatomy. The visual effect is that of holding and rotating the anatomy in the hand.
Wiley, Laura K.; Sivley, R. Michael; Bush, William S.
2013-01-01
Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist PMID:23894185
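The nested containment list idea behind this work can be sketched in memory as follows. MyNCList stores the structure in MySQL tables, whose schema is not given here, so this Python version is only an illustration of the data structure: intervals are assumed to be closed integer ranges, and the function names are hypothetical.

```python
def build_nclist(intervals):
    """Nest each interval under the closest preceding interval containing it.

    Within every resulting list no interval contains another, so both the
    starts and the ends are sorted, which allows binary search at query time.
    """
    ordered = sorted(intervals, key=lambda iv: (iv[0], -iv[1]))
    top, stack = [], []
    for start, end in ordered:
        node = (start, end, [])
        while stack and stack[-1][1] < end:  # top of stack does not contain node
            stack.pop()
        (stack[-1][2] if stack else top).append(node)
        stack.append(node)
    return top


def _first_ending_at_or_after(nodes, qs):
    """Binary search for the first node whose end is >= qs."""
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid][1] < qs:
            lo = mid + 1
        else:
            hi = mid
    return lo


def query(nodes, qs, qe):
    """Return all stored intervals overlapping the closed range [qs, qe]."""
    hits = []
    i = _first_ending_at_or_after(nodes, qs)
    while i < len(nodes) and nodes[i][0] <= qe:
        start, end, children = nodes[i]
        hits.append((start, end))
        hits.extend(query(children, qs, qe))  # contained intervals may overlap too
        i += 1
    return hits
```

A query skips every sublist whose parent does not overlap the range, which is what keeps retrieval fast when annotations are heavily nested, as gene, transcript, and exon ranges usually are.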
Optimizing high performance computing workflow for protein functional annotation.
Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene
2014-09-10
Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.
Optimizing high performance computing workflow for protein functional annotation
Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene
2014-01-01
Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296
Simons, S O; van der Laan, T; Mulder, A; van Ingen, J; Rigouts, L; Dekhuijzen, P N R; Boeree, M J; van Soolingen, D
2014-10-01
There is an urgent need for rapid and accurate diagnosis of pyrazinamide-resistant multidrug-resistant tuberculosis (MDR-TB). No diagnostic algorithm has been validated in this population. We hypothesized that pncA sequencing added to rpoB mutation analysis can accurately identify patients with pyrazinamide-resistant MDR-TB. We identified from the Dutch national database (2007-11) patients with a positive Mycobacterium tuberculosis culture containing a mutation in the rpoB gene. In these cases, we prospectively sequenced the pncA gene. Results from the rpoB and pncA mutation analysis (pncA added to rpoB) were compared with phenotypic susceptibility testing results to rifampicin, isoniazid and pyrazinamide (reference standard) using the Mycobacterial Growth Indicator Tube 960 system. We included 83 clinical M. tuberculosis isolates containing rpoB mutations in the primary analysis. Rifampicin resistance was seen in 72 isolates (87%), isoniazid resistance in 73 isolates (88%) and MDR-TB in 65 isolates (78%). Phenotypic reference testing identified pyrazinamide-resistant MDR-TB in 31 isolates (48%). Sensitivity of pncA sequencing added to rpoB mutation analysis for detecting pyrazinamide-resistant MDR-TB was 96.8%, the specificity was 94.2%, the positive predictive value was 90.9%, the negative predictive value was 98.0%, the positive likelihood ratio was 16.8 and the negative likelihood ratio was 0.03. In conclusion, pyrazinamide-resistant MDR-TB can be accurately detected using pncA sequencing added to rpoB mutation analysis. We propose to include pncA sequencing in every isolate with an rpoB mutation, allowing for stratification of MDR-TB treatment according to pyrazinamide susceptibility. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society of Clinical Microbiology and Infectious Diseases.
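The accuracy figures reported above follow from a 2x2 table of test result against reference standard. The sketch below shows the standard formulas in Python. The abstract does not give the cell counts; the values used here (30 true positives, 1 false negative, 3 false positives, 49 true negatives across the 83 isolates) are an inference chosen because they reproduce the reported percentages, and should be read as an assumption rather than as the study's data. The function name `diagnostic_metrics` is likewise illustrative.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from a 2x2 table (test vs. reference standard)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }


# Assumed counts (not stated in the abstract), consistent with 31 resistant
# and 52 non-resistant isolates among the 83 included.
m = diagnostic_metrics(tp=30, fn=1, fp=3, tn=49)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

A likelihood ratio well above 10 for a positive result and well below 0.1 for a negative result is what makes the combined assay informative in both directions.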
Visvesvara, Govinda S; De Jonckheere, Johan F; Sriram, Rama; Daft, Barbara
2005-08-01
Naegleria fowleri causes an acute and rapidly fatal central nervous system infection called primary amebic meningoencephalitis (PAM) in healthy children and young adults. We describe here the identification of N. fowleri isolated from the brain of one of several cows that died of PAM based on sequencing of the internal transcribed spacers, including the 5.8S rRNA genes.
Decousser, Jean-Winoc; Poirel, Laurent; Nordmann, Patrice
2017-04-01
The rapid detection of resistance is a challenge for clinical microbiologists who wish to prevent deleterious individual and collective consequences such as (i) delaying efficient antibiotic therapy, which worsens the survival rate of the most severely ill patients, or (ii) delaying the isolation of the carriers of multidrug-resistant bacteria and promoting outbreaks; this last consequence is of special concern, and there are an increasing number of approaches and market-based solutions in response. Areas covered: From simple, cheap biochemical tests to whole-genome sequencing, clinical microbiologists must select the most adequate phenotypic and genotypic tools to promptly detect and confirm β-lactam resistance from cultivated bacteria or from clinical specimens. Here, the authors review the published literature from the last 5 years about the primary technical approaches and commercial laboratory reagents for these purposes, including molecular, biochemical and immune assays. Furthermore, the authors discuss their intrinsic and relative performance and challenge their putative clinical impact. Expert commentary: Until the availability of fully automated wet and dry whole genome sequencing solutions, microbiologists should focus on inexpensive biochemical tests for cultured isolates or monomicrobial clinical specimens and on using the expensive molecular PCR-based strategies for the targeted screening of complex biological environments.
PeanutDB: an integrated bioinformatics web portal for Arachis hypogaea transcriptomics
2012-01-01
Background The peanut (Arachis hypogaea) is an important crop cultivated worldwide for oil production and food sources. Its complex genetic architecture (e.g., the large and tetraploid genome possibly due to unique cross of wild diploid relatives and subsequent chromosome duplication: 2n = 4x = 40, AABB, 2800 Mb) presents a major challenge for its genome sequencing and makes it a less-studied crop. Without a doubt, transcriptome sequencing is the most effective way to harness the genome structure and gene expression dynamics of this non-model species that has a limited genomic resource. Description With the development of next generation sequencing technologies such as 454 pyro-sequencing and Illumina sequencing by synthesis, the transcriptomics data of peanut is rapidly accumulated in both the public databases and private sectors. Integrating 187,636 Sanger reads (103,685,419 bases), 1,165,168 Roche 454 reads (333,862,593 bases) and 57,135,995 Illumina reads (4,073,740,115 bases), we generated the first release of our peanut transcriptome assembly that contains 32,619 contigs. We provided EC, KEGG and GO functional annotations to these contigs and detected SSRs, SNPs and other genetic polymorphisms for each contig. Based on both open-source and our in-house tools, PeanutDB presents many seamlessly integrated web interfaces that allow users to search, filter, navigate and visualize easily the whole transcript assembly, its annotations and detected polymorphisms and simple sequence repeats. For each contig, sequence alignment is presented in both bird’s-eye view and nucleotide level resolution, with colorfully highlighted regions of mismatches, indels and repeats that facilitate close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors. 
Conclusion As a public genomic database that integrates peanut transcriptome data from different sources, PeanutDB (http://bioinfolab.muohio.edu/txid3818v1) provides the Peanut research community with an easy-to-use web portal that will definitely facilitate genomics research and molecular breeding in this less-studied crop. PMID:22712730
Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide
2011-09-01
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Decoding DNA labels by melting curve analysis using real-time PCR.
Balog, József A; Fehér, Liliána Z; Puskás, László G
2017-12-01
Synthetic DNA has been used as an authentication code for a diverse number of applications. However, existing decoding approaches are based on either DNA sequencing or the determination of DNA length variations. Here, we present a simple alternative protocol for labeling different objects using a small number of short DNA sequences that differ in their melting points. Code amplification and decoding can be done in two steps using quantitative PCR (qPCR). To obtain a DNA barcode with high complexity, we defined 8 template groups, each having 4 different DNA templates, yielding 15^8 (>2.5 billion) combinations of different individual melting temperature (Tm) values and corresponding ID codes. The reproducibility and specificity of the decoding was confirmed by using the most complex template mixture, which had 32 different products in 8 groups with different Tm values. The industrial applicability of our protocol was also demonstrated by labeling a drone with an oil-based paint containing a predefined DNA code, which was then successfully decoded. The method presented here consists of a simple code system based on a small number of synthetic DNA sequences and a cost-effective, rapid decoding protocol using a few qPCR reactions, enabling a wide range of authentication applications.
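The count of more than 2.5 billion codes can be checked with a short calculation. The Python sketch below assumes, as one reading of the abstract that it does not spell out, that each of the 8 groups contributes any non-empty subset of its 4 templates, which gives 2^4 - 1 = 15 states per group and 15^8 codes overall. The variable names are illustrative.

```python
from itertools import combinations

GROUPS = 8
TEMPLATES_PER_GROUP = 4

# Assumption: a group's state is any non-empty subset of its templates.
templates = range(TEMPLATES_PER_GROUP)
states_per_group = sum(
    1
    for size in range(1, TEMPLATES_PER_GROUP + 1)
    for _ in combinations(templates, size)
)

# Groups are decoded independently, so their state counts multiply.
total_codes = states_per_group ** GROUPS

print(states_per_group)  # 15
print(total_codes)       # 2562890625
```

The fullest code, with all 4 templates present in every group, contains 8 x 4 = 32 products, matching the most complex mixture tested.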
Sakızcı-Uyar, Bahar; Çelik, Şeref; Postacı, Aysun; Bayraktar, Yeşim; Dikmen, Bayazit; Özkoçak-Turan, Işıl; Saçan, Özlem
2016-01-01
Objectives: To compare onset time, duration of action, and tracheal intubation conditions in obese patients when the intubation dose of rocuronium was based on corrected body weight (CBW) versus lean body weight (LBW) for rapid sequence induction. Methods: This prospective study was carried out at Numune Education and Research Hospital, Ankara, Turkey between August 2013 and May 2014. Forty female obese patients scheduled for laparoscopic surgery under general anesthesia were randomized into 2 groups. Group CBW (n=20) received 1.2 mg/kg rocuronium based on CBW, and group LBW (n=20) received 1.2 mg/kg rocuronium based on LBW. Endotracheal intubation was performed 60 seconds after injection of muscle relaxant, and intubating conditions were evaluated. Neuromuscular transmission was monitored using acceleromyography of the adductor pollicis. Onset time, defined as time to depression of the twitch tension to 95% of its control value, and duration of action, defined as time to achieve one response to train-of-four stimulation (T1) were recorded. Results: No significant differences were observed between the groups in intubation conditions or onset time (50-60 seconds median, 30-30 interquartile range [IQR]). Duration of action was significantly longer in the CBW group (60 minutes median, 12 IQR) than the LBW group (35 minutes median, 16 IQR; p<0.01). Conclusion: In obese patients, dosing of 1.2 mg/kg rocuronium based on LBW provides excellent or good tracheal intubating conditions within 60 seconds after administration and does not lead to prolonged duration of action. PMID:26739976
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate
Yang, Yu; Hebron, Haroun R.; Hang, Jun
2009-01-01
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Noronha, Jyothi M; Liu, Mengya; Squires, R Burke; Pickett, Brett E; Hale, Benjamin G; Air, Gillian M; Galloway, Summer E; Takimoto, Toru; Schmolke, Mirco; Hunt, Victoria; Klem, Edward; García-Sastre, Adolfo; McGee, Monnie; Scheuermann, Richard H
2012-05-01
Genetic drift of influenza virus genomic sequences occurs through the combined effects of sequence alterations introduced by a low-fidelity polymerase and the varying selective pressures experienced as the virus migrates through different host environments. While traditional phylogenetic analysis is useful in tracking the evolutionary heritage of these viruses, the specific genetic determinants that dictate important phenotypic characteristics are often difficult to discern within the complex genetic background arising through evolution. Here we describe a novel influenza virus sequence feature variant type (Flu-SFVT) approach, made available through the public Influenza Research Database resource (www.fludb.org), in which variant types (VTs) identified in defined influenza virus protein sequence features (SFs) are used for genotype-phenotype association studies. Since SFs have been defined for all influenza virus proteins based on known structural, functional, and immune epitope recognition properties, the Flu-SFVT approach allows the rapid identification of the molecular genetic determinants of important influenza virus characteristics and their connection to underlying biological functions. We demonstrate the use of the SFVT approach to obtain statistical evidence for effects of NS1 protein sequence variations in dictating influenza virus host range restriction.
Salton, S R
1991-09-01
A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.
Osmundson, Todd W; Eyre, Catherine A; Hayden, Katherine M; Dhillon, Jaskirn; Garbelotto, Matteo M
2013-01-01
The ubiquity, high diversity and often-cryptic manifestations of fungi and oomycetes frequently necessitate molecular tools for detecting and identifying them in the environment. In applications including DNA barcoding, pathogen detection from plant samples, and genotyping for population genetics and epidemiology, rapid and dependable DNA extraction methods scalable from one to hundreds of samples are desirable. We evaluated several rapid extraction methods (NaOH, Rapid one-step extraction (ROSE), Chelex 100, proteinase K) for their ability to obtain DNA of quantity and quality suitable for the following applications: PCR amplification of the multicopy barcoding locus ITS1/5.8S/ITS2 from various fungal cultures and sporocarps; single-copy microsatellite amplification from cultures of the phytopathogenic oomycete Phytophthora ramorum; probe-based P. ramorum detection from leaves. Several methods were effective for most of the applications, with NaOH extraction favored in terms of success rate, cost, speed and simplicity. Frozen dilutions of ROSE and NaOH extracts maintained PCR viability for over 32 months. DNA from rapid extractions performed poorly compared to CTAB/phenol-chloroform extracts for TaqMan diagnostics from tanoak leaves, suggesting that incomplete removal of PCR inhibitors is an issue for sensitive diagnostic procedures, especially from plants with recalcitrant leaf chemistry. NaOH extracts exhibited lower yield and size than CTAB/phenol-chloroform extracts; however, NaOH extraction facilitated obtaining clean sequence data from sporocarps contaminated by other fungi, perhaps due to dilution resulting from low DNA yield. We conclude that conventional extractions are often unnecessary for routine DNA sequencing or genotyping of fungi and oomycetes, and recommend simpler strategies where source materials and intended applications warrant such use. © 2012 Blackwell Publishing Ltd.
Development of Species-specific Primers for Rapid Detection of Phellinus linteus and P. baumii
Kim, Mun-Ok; Kim, Gi-Young; Nam, Byung-Hyouk; Jin, Cheng-Yun; Lee, Ki-Won; Park, Jae-Min; Lee, Sang-Joon
2005-01-01
The genus Phellinus belongs taxonomically to the Aphyllophorales, and some species of this genus have been used as medicinal ingredients and Indian folk medicines. In particular, P. linteus and morphologically related species are well-known medicinal fungi that have various biological activities such as humoral and cell-mediated, anti-mutagenic, and anti-cancer activities. However, little is known about the rapid detection of complex Phellinus species. Therefore, this study was carried out to develop specific primers for the rapid detection of P. linteus and other related species. The species-specific primers were designed based on internal transcribed spacer (ITS) sequence data. The primer sets specifically detected P. linteus (PL2/PL5R) and P. baumii (PB1/PB4R). These primer sets could be useful for the rapid detection of specific species among unidentified Phellinus species. Moreover, restriction fragment length polymorphism analysis of the ITS region with HaeIII was also useful for clarifying the relationships among the 5 Phellinus species. PMID:24049482
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
CRISPR interference and priming varies with individual spacer sequences
Xue, Chaoyou; Seetharam, Arun S.; Musharova, Olga; Severinov, Konstantin; Brouns, Stan J. J.; Severin, Andrew J.; Sashital, Dipali G.
2015-01-01
CRISPR–Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems allow bacteria to adapt to infection by acquiring ‘spacer’ sequences from invader DNA into genomic CRISPR loci. Cas proteins use RNAs derived from these loci to target cognate sequences for destruction through CRISPR interference. Mutations in the protospacer adjacent motif (PAM) and seed regions block interference but promote rapid ‘primed’ adaptation. Here, we use multiple spacer sequences to reexamine the PAM and seed sequence requirements for interference and priming in the Escherichia coli Type I-E CRISPR–Cas system. Surprisingly, CRISPR interference is far more tolerant of mutations in the seed and the PAM than previously reported, and this mutational tolerance, as well as priming activity, is highly dependent on spacer sequence. We identify a large number of functional PAMs that can promote interference, priming or both activities, depending on the associated spacer sequence. Functional PAMs are preferentially acquired during unprimed ‘naïve’ adaptation, leading to a rapid priming response following infection. Our results provide numerous insights into the importance of both spacer and target sequences for interference and priming, and reveal that priming is a major pathway for adaptation during initial infection. PMID:26586800
2012-01-01
Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
Nucleic Acid-Based Approaches for Detection of Viral Hepatitis
Behzadi, Payam; Ranjbar, Reza; Alavian, Seyed Moayed
2014-01-01
Context: To determine suitable nucleic acid diagnostics for each viral hepatitis agent, an extensive search using related keywords was performed in major medical libraries, and the data were collected, categorized, and summarized in different sections. Results: Various types of molecular biology tools can be used to detect and quantify viral genomic elements and analyze the sequences. These molecular assays are suitable technologies for rapidly detecting viral agents with high accuracy, high sensitivity, and high specificity. Nonetheless, the application of each diagnostic method is completely dependent on the viral agent. Conclusions: Despite the rapidity, automation, accuracy, cost-effectiveness, high sensitivity, and high specificity of molecular techniques, each type of molecular technology has its own advantages and disadvantages. PMID:25789132
Repeatless and repeat-based centromeres in potato: implications for centromere evolution.
Gong, Zhiyun; Wu, Yufeng; Koblízková, Andrea; Torres, Giovana A; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C Robin; Macas, Jirí; Jiang, Jiming
2012-09-01
Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains.
Naccache, Samia N.; Federman, Scot; Veeraraghavan, Narayanan; Zaharia, Matei; Lee, Deanna; Samayoa, Erik; Bouquet, Jerome; Greninger, Alexander L.; Luk, Ka-Cheung; Enge, Barryett; Wadford, Debra A.; Messenger, Sharon L.; Genrich, Gillian L.; Pellegrino, Kristen; Grard, Gilda; Leroy, Eric; Schneider, Bradley S.; Fair, Joseph N.; Martínez, Miguel A.; Isa, Pavel; Crump, John A.; DeRisi, Joseph L.; Sittler, Taylor; Hackett, John; Miller, Steve; Chiu, Charles Y.
2014-01-01
Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI (“sequence-based ultrarapid pathogen identification”), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7–500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times. PMID:24899342