Science.gov

Sample records for rna instability sequences

  1. The role of topoisomerase I in suppressing genome instability associated with a highly transcribed guanine-rich sequence is not restricted to preventing RNA:DNA hybrid accumulation

    PubMed Central

    Yadav, Puja; Owiti, Norah; Kim, Nayun

    2016-01-01

    Highly transcribed guanine-run containing sequences, in Saccharomyces cerevisiae, become unstable when topoisomerase I (Top1) is disrupted. Topological changes, such as the formation of extended RNA:DNA hybrids or R-loops or non-canonical DNA structures including G-quadruplexes, have been proposed as the major underlying cause of the transcription-linked genome instability. Here, we report that R-loop accumulation at a guanine-rich sequence, which is capable of assembling into the four-stranded G4 DNA structure, is dependent on the level and the orientation of transcription. In the absence of Top1 or RNase Hs, R-loops accumulated to a substantially higher extent when guanine-runs were located on the non-transcribed strand. This coincides with the orientation where higher genome instability was observed. However, we further report that there are significant differences between the disruption of RNase Hs and Top1 with regard to the orientation-specific elevation in genome instability at the guanine-rich sequence. Additionally, genome instability in Top1-deficient yeasts is not completely suppressed by removal of negative supercoils and is further aggravated by expression of mutant Top1. Together, our data provide strong support for a function of Top1 in suppressing genome instability at the guanine-run containing sequence that goes beyond preventing the transcription-associated RNA:DNA hybrid formation. PMID:26527723

  2. The role of topoisomerase I in suppressing genome instability associated with a highly transcribed guanine-rich sequence is not restricted to preventing RNA:DNA hybrid accumulation.

    PubMed

    Yadav, Puja; Owiti, Norah; Kim, Nayun

    2016-01-29

    Highly transcribed guanine-run containing sequences, in Saccharomyces cerevisiae, become unstable when topoisomerase I (Top1) is disrupted. Topological changes, such as the formation of extended RNA:DNA hybrids or R-loops or non-canonical DNA structures including G-quadruplexes, have been proposed as the major underlying cause of the transcription-linked genome instability. Here, we report that R-loop accumulation at a guanine-rich sequence, which is capable of assembling into the four-stranded G4 DNA structure, is dependent on the level and the orientation of transcription. In the absence of Top1 or RNase Hs, R-loops accumulated to a substantially higher extent when guanine-runs were located on the non-transcribed strand. This coincides with the orientation where higher genome instability was observed. However, we further report that there are significant differences between the disruption of RNase Hs and Top1 with regard to the orientation-specific elevation in genome instability at the guanine-rich sequence. Additionally, genome instability in Top1-deficient yeasts is not completely suppressed by removal of negative supercoils and is further aggravated by expression of mutant Top1. Together, our data provide strong support for a function of Top1 in suppressing genome instability at the guanine-run containing sequence that goes beyond preventing the transcription-associated RNA:DNA hybrid formation. PMID:26527723
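
    The abstract above turns on which strand carries the guanine runs. As a rough illustration only, the sketch below checks each strand of a sequence for a putative G-quadruplex motif. It assumes the common heuristic pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; that pattern, the function names, and the example sequence are assumptions made here and are not taken from the record.

```python
import re

# Heuristic pattern (an assumption, not the authors' method):
# four runs of >=3 G separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def g4_by_strand(non_transcribed):
    """Report whether each strand carries a putative G4 motif.

    `non_transcribed` is the non-transcribed (coding) strand, 5' to 3'.
    The transcribed (template) strand is its reverse complement.
    """
    seq = non_transcribed.upper()
    return {
        "non_transcribed": bool(G4_PATTERN.search(seq)),
        "transcribed": bool(G4_PATTERN.search(reverse_complement(seq))),
    }


# Made-up example: guanine runs sit on the non-transcribed strand only.
example = "ATGGGTTGGGTTGGGTTGGGAT"
result = g4_by_strand(example)
```

    A regular-expression hit only flags sequence potential; it says nothing about whether a G4 structure or an R-loop actually forms in vivo.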

  3. RNA Sequencing in Schizophrenia

    PubMed Central

    Li, Xin; Teng, Shaolei

    2015-01-01

    Schizophrenia (SCZ) is a serious psychiatric disorder that affects 1% of the general population and places a heavy burden worldwide. The underlying genetic mechanism of SCZ remains unknown, but studies indicate that the disease is associated with a global gene expression disturbance across many genes. Next-generation sequencing, particularly RNA sequencing (RNA-Seq), provides a powerful genome-scale technology to investigate the pathological processes of SCZ. RNA-Seq has been used to analyze gene expression and to identify novel splice isoforms and rare transcripts associated with SCZ. This paper provides an overview of the genetics of SCZ, the advantages of RNA-Seq for transcriptome analysis, the accomplishments of RNA-Seq in SCZ cohorts, and the applications of induced pluripotent stem cells and RNA-Seq in SCZ research. PMID:27053919

  4. AMPLIFICATION OF RIBOSOMAL RNA SEQUENCES

    EPA Science Inventory

    This book chapter offers an overview of the use of ribosomal RNA sequences. A history of the technology traces the evolution of techniques to measure bacterial phylogenetic relationships and recent advances in obtaining rRNA sequence information. The manual also describes procedu...

  5. Deciphering the RNA landscape by RNAome sequencing.

    PubMed

    Derks, Kasper W J; Misovic, Branislav; van den Hout, Mirjam C G N; Kockx, Christel E M; Gomez, Cesar Payan; Brouwer, Rutger W W; Vrieling, Harry; Hoeijmakers, Jan H J; van IJcken, Wilfred F J; Pothof, Joris

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitate the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment whereas all other classes such as mRNAs were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods. PMID:25826412

  6. Deciphering the RNA landscape by RNAome sequencing

    PubMed Central

    Derks, Kasper WJ; Misovic, Branislav; van den Hout, Mirjam CGN; Kockx, Christel EM; Payan Gomez, Cesar; Brouwer, Rutger WW; Vrieling, Harry; Hoeijmakers, Jan HJ; van IJcken, Wilfred FJ; Pothof, Joris

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitate the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment whereas all other classes such as mRNAs were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods. PMID:25826412

  7. RNAome sequencing delineates the complete RNA landscape.

    PubMed

    Derks, Kasper W J; Pothof, Joris

    2015-09-01

    Standard RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species. For example, small and large RNAs from the same sample cannot be sequenced in a single sequence run. We designed RNAome sequencing, which is a strand-specific method to determine the expression of small and large RNAs from ribosomal RNA-depleted total RNA in a single sequence run. RNAome sequencing quantitatively preserves all RNA classes. This characteristic allows comparisons between RNA classes, thereby facilitating the identification of relationships between different RNA classes. Here, we describe in detail the experimental procedure associated with RNAome sequencing published by Derks and colleagues in RNA Biology (2015) [1]. We also provide the R code for the developed Total Rna Analysis Pipeline (TRAP), an algorithm to analyze RNAome sequencing datasets (deposited at the Gene Expression Omnibus data repository, accession number GSE48084). PMID:26484291

  8. RNA sequence analysis using covariance models.

    PubMed Central

    Eddy, S R; Durbin, R

    1994-01-01

    We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences. PMID:8029015
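
    The covariation signal that structure-aware models of an RNA family rely on can be illustrated with a small calculation. The sketch below computes the mutual information between two columns of a toy alignment: columns that form a base pair tend to vary together and score high, while a conserved column scores zero. This is not the covariance-model algorithm itself; the function name and the four-sequence alignment are made up for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length, gap-free sequences.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 4 always form a Watson-Crick pair,
# column 1 is fully conserved.
toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]

paired = mutual_information(toy_alignment, 0, 4)      # covarying columns
conserved = mutual_information(toy_alignment, 0, 1)   # no covariation
```

    With four equally frequent, perfectly correlated pairs, the paired columns reach the maximum of 2 bits, and the conserved column contributes none.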

  9. Next generation sequencing of viral RNA genomes

    PubMed Central

    2013-01-01

    Background: With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results: As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions: The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

  10. antaRNA: ant colony-based RNA sequence design

    PubMed Central

    Kleinkauf, Robert; Mann, Martin; Backofen, Rolf

    2015-01-01

    Motivation: RNA sequence design has been studied for at least as long as the classical folding problem. Although for the latter the functional fold of an RNA molecule is to be found, inverse folding tries to identify RNA sequences that fold into a function-specific target structure. In combination with RNA-based biotechnology and synthetic biology, reliable RNA sequence design becomes a crucial step to generate novel biochemical components. Results: In this article, the computational tool antaRNA is presented. It is capable of compiling RNA sequences for a given structure that comply in addition with an adjustable full range objective GC-content distribution, specific sequence constraints and additional fuzzy structure constraints. antaRNA applies ant colony optimization meta-heuristics and its superior performance is shown on biological datasets. Availability and implementation: http://www.bioinf.uni-freiburg.de/Software/antaRNA Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26023105

  11. Mechanisms of genome instability induced by RNA processing defects

    PubMed Central

    Chan, Yujia A.; Hieter, Philip

    2014-01-01

    The role of normal transcription and RNA processing in maintaining genome integrity is becoming increasingly appreciated in organisms ranging from bacteria to humans. Several mutations in RNA biogenesis factors have been implicated in human cancers, but the mechanisms and potential connections to tumor genome instability are not clear. Here we discuss how RNA processing defects could destabilize genomes through mutagenic R-loop structures and by altering expression of genes required for genome stability. A compelling body of evidence now suggests that researchers should be directly testing these mechanisms in models of human cancer. PMID:24794811

  12. Mechanisms of genome instability induced by RNA-processing defects.

    PubMed

    Chan, Yujia A; Hieter, Philip; Stirling, Peter C

    2014-06-01

    The role of normal transcription and RNA processing in maintaining genome integrity is becoming increasingly appreciated in organisms ranging from bacteria to humans. Several mutations in RNA biogenesis factors have been implicated in human cancers, but the mechanisms and potential connections to tumor genome instability are not clear. Here, we discuss how RNA-processing defects could destabilize genomes through mutagenic R-loop structures and by altering expression of genes required for genome stability. A compelling body of evidence now suggests that researchers should be directly testing these mechanisms in models of human cancer. PMID:24794811

  13. Sequence Dependence of Viral RNA Encapsidation.

    PubMed

    Kelly, Joshua; Grosberg, Alexander Y; Bruinsma, Robijn

    2016-07-01

    We develop a Flory mean-field theory for viral RNA (vRNA) molecules that extends the current RNA folding algorithms to include interactions between different sections of the secondary structure. The theory is applied to sequence-selective vRNA encapsidation. The dependence on sequence enters through a single parameter: the largest eigenvalue of the Kramers matrix of the branched polymer obtained by coarse graining the secondary structure. Differences between the work of encapsidation of vRNA molecules and of randomized isomers are found to be in the range of 20 k_BT, more than sufficient to provide a strong bias in favor of vRNA encapsidation. The method is applied to a packaging competition experiment where large vRNA molecules compete for encapsidation with two smaller RNA species that together have the same nucleotide sequence as the large molecule. We encounter a substantial, generic free energy bias, also of the order of 20 k_BT, in favor of encapsidating the single large RNA molecule. The bias is mainly the consequence of the fact that dividing up a large vRNA molecule involves the release of stored elastic energy. This provides an important, nonspecific mechanism for preferential encapsidation of single larger vRNA molecules over multiple smaller mRNA molecules with the same total number of nucleotides. The result is also consistent with recent RNA packaging competition experiments by Comas-Garcia et al. [1]. Finally, the Flory method leads to the result that when two RNA molecules are copackaged, they are expected to remain segregated inside the capsid. PMID:27116641

  14. Experimental investigation of an RNA sequence space

    NASA Technical Reports Server (NTRS)

    Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

    1993-01-01

    Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.

  15. RCARE: RNA Sequence Comparison and Annotation for RNA Editing

    PubMed Central

    2015-01-01

    The post-transcriptional sequence modification of transcripts through RNA editing is an important mechanism for regulating protein function and is associated with human disease phenotypes. The identification of RNA editing or RNA-DNA difference (RDD) sites is a fundamental step in the study of RNA editing. However, a substantial number of false-positive RDD sites have been identified recently. A major challenge in identifying RDD sites is to distinguish between the true RNA editing sites and the false positives. Furthermore, determining the location of condition-specific RDD sites and elucidating their functional roles will help toward understanding various biological phenomena that are mediated by RNA editing. The present study developed RNA-sequence comparison and annotation for RNA editing (RCARE) for searching, annotating, and visualizing RDD sites using thousands of previously known editing sites, which can be used for comparative analyses between multiple samples. RCARE also provides evidence for improving the reliability of identified RDD sites. RCARE is a web-based comparison, annotation, and visualization tool, which provides rich biological annotations and useful summary plots. The developers of previous tools that identify or annotate RNA-editing sites seldom mention the reliability of their respective tools. In order to address the issue, RCARE utilizes a number of scientific publications and databases to find specific documentations respective to a particular RNA-editing site, which generates evidence levels to convey the reliability of RCARE. Sequence-based alignment files can be converted into VCF files using a Python script and uploaded to the RCARE server for further analysis. RCARE is available for free at http://www.snubi.org/software/rcare/. PMID:26043858

  16. Alternative applications for distinct RNA sequencing strategies

    PubMed Central

    Han, Leng; Vickers, Kasey C.; Samuels, David C.

    2015-01-01

    Recent advances in RNA library preparation methods, platform accessibility and cost efficiency have allowed high-throughput RNA sequencing (RNAseq) to replace conventional hybridization microarray platforms as the method of choice for mRNA profiling and transcriptome analyses. RNAseq is a powerful technique to profile both long and short RNA expression, and the depth of information gained from distinct RNAseq methods is striking and facilitates discovery. In addition to expression analysis, distinct RNAseq approaches also allow investigators the ability to assess transcriptional elongation, DNA variance and exogenous RNA content. Here we review the current state of the art in transcriptome sequencing and address epigenetic regulation, quantification of transcription activation, RNAseq output and a diverse set of applications for RNAseq data. We detail how RNAseq can be used to identify allele-specific expression, single-nucleotide polymorphisms and somatic mutations and discuss the benefits and limitations of using RNAseq to monitor DNA characteristics. Moreover, we highlight the power of combining RNA- and DNAseq methods for genomic analysis. In summary, RNAseq provides the opportunity to gain greater insight into transcriptional regulation and output than simply miRNA and mRNA profiling. PMID:25246237

  17. Alternative applications for distinct RNA sequencing strategies.

    PubMed

    Han, Leng; Vickers, Kasey C; Samuels, David C; Guo, Yan

    2015-07-01

    Recent advances in RNA library preparation methods, platform accessibility and cost efficiency have allowed high-throughput RNA sequencing (RNAseq) to replace conventional hybridization microarray platforms as the method of choice for mRNA profiling and transcriptome analyses. RNAseq is a powerful technique to profile both long and short RNA expression, and the depth of information gained from distinct RNAseq methods is striking and facilitates discovery. In addition to expression analysis, distinct RNAseq approaches also allow investigators the ability to assess transcriptional elongation, DNA variance and exogenous RNA content. Here we review the current state of the art in transcriptome sequencing and address epigenetic regulation, quantification of transcription activation, RNAseq output and a diverse set of applications for RNAseq data. We detail how RNAseq can be used to identify allele-specific expression, single-nucleotide polymorphisms and somatic mutations and discuss the benefits and limitations of using RNAseq to monitor DNA characteristics. Moreover, we highlight the power of combining RNA- and DNAseq methods for genomic analysis. In summary, RNAseq provides the opportunity to gain greater insight into transcriptional regulation and output than simply miRNA and mRNA profiling. PMID:25246237

  18. Transcriptional profiling of Dictyostelium with RNA sequencing

    PubMed Central

    Miranda, Edward Roshan; Rot, Gregor; Toplak, Marko; Santhanam, Balaji; Curk, Tomaz; Shaulsky, Gad; Zupan, Blaz

    2014-01-01

    Summary: Transcriptional profiling methods have been utilized in the analysis of various biological processes in Dictyostelium. Recent advances in high-throughput sequencing have increased the resolution and the dynamic range of transcriptional profiling. Here we describe the utility of RNA-sequencing with the Illumina technology for production of transcriptional profiles. We also describe methods for data mapping and storage as well as common and specialized tools for data analysis, both online and offline. PMID:23494306

  19. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes

    NASA Technical Reports Server (NTRS)

    Vossbrinck, C. R.; Maddox, J. V.; Friedman, S.; Debrunner-Vossbrinck, B. A.; Woese, C. R.

    1987-01-01

    A comparative sequence analysis of the 18S small subunit ribosomal RNA (rRNA) of the microsporidium Vairimorpha necatrix is presented. The results show that this rRNA sequence is more unlike those of other eukaryotes than any known eukaryote rRNA sequence. It is concluded that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  20. Advanced Applications of RNA Sequencing and Challenges

    PubMed Central

    Han, Yixing; Gao, Shouguo; Muegge, Kathrin; Zhang, Wei; Zhou, Bing

    2015-01-01

    Next-generation sequencing technologies have revolutionized sequence-based research with the advantages of high throughput, high sensitivity, and high speed. RNA-seq is now being used widely for uncovering multiple facets of the transcriptome to facilitate biological applications. However, the large-scale data analyses associated with RNA-seq pose challenges. In this study, we present a detailed overview of the applications of this technology and the challenges that need to be addressed, including data preprocessing, differential gene expression analysis, alternative splicing analysis, variant detection and allele-specific expression, pathway analysis, co-expression network analysis, and applications combining various experimental procedures beyond the achievements that have been made. Specifically, we discuss essential principles of computational methods that are required to meet the key challenges of the RNA-seq data analyses, development of various bioinformatics tools, challenges associated with the RNA-seq applications, and examples that represent the advances made so far in the characterization of the transcriptome. PMID:26609224

  1. Dis3- and exosome subunit-responsive 3′ mRNA instability elements

    SciTech Connect

    Kiss, Daniel L.; Hou, Dezhi; Gross, Robert H.; Andrulis, Erik D.

    2012-07-06

    Highlights: Successful use of a novel RNA-specific bioinformatic tool, RNA SCOPE. Identified novel 3′ UTR cis-acting element that destabilizes a reporter mRNA. Show exosome subunits are required for cis-acting element-mediated mRNA instability. Define precise sequence requirements of novel cis-acting element. Show that microarray-defined exosome subunit-regulated mRNAs have novel element. Abstract: Eukaryotic RNA turnover is regulated in part by the exosome, a nuclear and cytoplasmic complex of ribonucleases (RNases) and RNA-binding proteins. The major RNase of the complex is thought to be Dis3, a multi-functional 3′-5′ exoribonuclease and endoribonuclease. Although it is known that Dis3 and core exosome subunits are recruited to transcriptionally active genes and to messenger RNA (mRNA) substrates, this recruitment is thought to occur indirectly. We sought to discover cis-acting elements that recruit Dis3 or other exosome subunits. Using a bioinformatic tool called RNA SCOPE to screen the 3′ untranslated regions of up-regulated transcripts from our published Dis3 depletion-derived transcriptomic data set, we identified several motifs as candidate instability elements. Secondary screening using a luciferase reporter system revealed that one cassette, harboring four elements, destabilized the reporter transcript. RNAi-based depletion of Dis3, Rrp6, Rrp4, Rrp40, or Rrp46 diminished the efficacy of cassette-mediated destabilization. Truncation analysis of the cassette showed that two exosome subunit-sensitive elements (ESSEs) destabilized the reporter. Point-directed mutagenesis of ESSE abrogated the destabilization effect. An examination of the transcriptomic data from exosome subunit depletion-based microarrays revealed that mRNAs with ESSEs are found in every up-regulated mRNA data set but are

  2. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  3. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TψCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. PMID:146192

  4. Empirical insights into the stochasticity of small RNA sequencing

    NASA Astrophysics Data System (ADS)

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-04-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data.

  5. Empirical insights into the stochasticity of small RNA sequencing

    PubMed Central

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-01-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data. PMID:27052356
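
    The contrast between Poisson and gamma noise drawn in the abstract above can be made concrete with a moment-based check. A Poisson variable has variance equal to its mean, so a variance-to-mean ratio well above 1 across technical replicates signals overdispersion. For a gamma distribution with shape k and scale θ, the mean is kθ and the variance is kθ², which gives the method-of-moments estimates k = mean²/variance and θ = variance/mean. The sketch below is a minimal sanity check along those lines, not the authors' analysis method; the function name and the replicate counts are made up.

```python
from statistics import mean, variance


def moment_summary(counts):
    """Summarize replicate counts for one RNA using sample moments.

    Returns the variance-to-mean ratio (1 under a Poisson model) and
    method-of-moments gamma estimates (shape k, scale theta).
    """
    m = mean(counts)
    v = variance(counts)  # unbiased sample variance; needs >= 2 values
    if m <= 0 or v <= 0:
        raise ValueError("need positive mean and variance")
    return {
        "mean": m,
        "variance": v,
        "variance_to_mean": v / m,
        "gamma_shape": m * m / v,
        "gamma_scale": v / m,
    }


# Made-up technical replicate counts for a single microRNA.
replicates = [90, 110, 100, 130, 70]
summary = moment_summary(replicates)
```

    For these values the mean is 100 and the sample variance is 500, so the variance-to-mean ratio of 5 is well above the Poisson expectation of 1. A handful of replicates gives noisy moment estimates, so a ratio like this is a prompt for closer modeling, not a formal test.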

  6. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw. PMID:20478825
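
    The first two steps of the workflow described above (cleanup and clustering) can be sketched as follows. This is not DSAP's code; it is a minimal illustration that assumes a two-column tab-delimited input (tag, copy number) and uses a hypothetical adaptor string chosen only for the example.

```python
from collections import Counter

ADAPTER = "TCGTATGCCG"  # hypothetical 3' adaptor prefix, for illustration only


def clean_tag(seq, adapter=ADAPTER):
    """Trim the adaptor (if present) and reject empty or homopolymer tags."""
    idx = seq.find(adapter)
    if idx != -1:
        seq = seq[:idx]
    if not seq or len(set(seq)) == 1:  # empty, or poly-A/T/C/G/N
        return None
    return seq


def cluster_tags(lines):
    """Sum the copy numbers of tags that are identical after cleanup."""
    clusters = Counter()
    for line in lines:
        seq, count = line.rstrip("\n").split("\t")
        cleaned = clean_tag(seq.upper())
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters


example = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t120",  # tag followed by adaptor
    "TGAGGTAGTAGGTTGTATAGTT\t30",             # same tag, already clean
    "AAAAAAAAAAAAAAAAAAAAAA\t7",              # poly-A, discarded
]
print(dict(cluster_tags(example)))
```

    The resulting clusters would then be the input to the homology-matching steps (iii) and (iv), which are not sketched here.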

  7. Unbiased Deep Sequencing of RNA Viruses from Clinical Samples.

    PubMed

    Matranga, Christian B; Gladden-Young, Adrianne; Qu, James; Winnicki, Sarah; Nosamiefan, Dolo; Levin, Joshua Z; Sabeti, Pardis C

    2016-01-01

    Here we outline a next-generation RNA sequencing protocol that enables de novo assemblies and intra-host variant calls of viral genomes collected from clinical and biological sources. The method is unbiased and universal; it uses random primers for cDNA synthesis and requires no prior knowledge of the viral sequence content. Before library construction, selective RNase H-based digestion is used to deplete unwanted RNA, including poly(rA) carrier and ribosomal RNA, from the viral RNA sample. Selective depletion improves both the data quality and the number of unique reads in viral RNA sequencing libraries. Moreover, a transposase-based 'tagmentation' step is used in the protocol as it reduces overall library construction time. The protocol has enabled rapid deep sequencing of over 600 Lassa and Ebola virus samples, including collections from both blood and tissue isolates, and is broadly applicable to other microbial genomics studies. PMID:27403729

  8. RNase P-Mediated Sequence-Specific Cleavage of RNA by Engineered External Guide Sequences.

    PubMed

    Derksen, Merel; Mertens, Vicky; Pruijn, Ger J M

    2015-01-01

    The RNA cleavage activity of RNase P can be employed to decrease the levels of specific RNAs and to study their function or even to eradicate pathogens. Two different technologies have been developed to use RNase P as a tool for RNA knockdown. In one of these, an external guide sequence, which mimics a tRNA precursor, a well-known natural RNase P substrate, is used to target an RNA molecule for cleavage by endogenous RNase P. Alternatively, a guide sequence can be attached to M1 RNA, the (catalytic) RNase P RNA subunit of Escherichia coli. The guide sequence is specific for an RNA target, which is subsequently cleaved by the bacterial M1 RNA moiety. These approaches are applicable in both bacteria and eukaryotes. In this review, we will discuss the two technologies in which RNase P is used to reduce RNA expression levels. PMID:26569326

  9. RNase P-Mediated Sequence-Specific Cleavage of RNA by Engineered External Guide Sequences

    PubMed Central

    Derksen, Merel; Mertens, Vicky; Pruijn, Ger J.M.

    2015-01-01

    The RNA cleavage activity of RNase P can be employed to decrease the levels of specific RNAs and to study their function or even to eradicate pathogens. Two different technologies have been developed to use RNase P as a tool for RNA knockdown. In one of these, an external guide sequence, which mimics a tRNA precursor, a well-known natural RNase P substrate, is used to target an RNA molecule for cleavage by endogenous RNase P. Alternatively, a guide sequence can be attached to M1 RNA, the (catalytic) RNase P RNA subunit of Escherichia coli. The guide sequence is specific for an RNA target, which is subsequently cleaved by the bacterial M1 RNA moiety. These approaches are applicable in both bacteria and eukaryotes. In this review, we will discuss the two technologies in which RNase P is used to reduce RNA expression levels. PMID:26569326

  10. Efficient prediction methods for selecting effective siRNA sequences.

    PubMed

    Takasaki, Shigeru

    2010-02-01

    Although short interfering RNA (siRNA) has been widely used for studying gene functions in mammalian cells, its gene silencing efficacy varies markedly and there are only a few consistencies among the recently reported design rules/guidelines for selecting siRNA sequences effective for mammalian genes. Another shortcoming of the previously reported methods is that they cannot estimate the probability that a candidate sequence will silence the target gene. This paper first reviewed the recently reported siRNA design guidelines and clarified the problems concerning the guidelines. It then proposed two prediction methods, a Radial Basis Function (RBF) network and decision tree learning, and their combined method for selecting effective siRNA target sequences from many possible candidate sequences. They are quite different from the previous score-based siRNA design techniques and can predict the probability that a candidate siRNA sequence will be effective. The methods imply high estimation accuracy for selecting candidate siRNA sequences. PMID:20022002

  11. The primary nucleotide sequence of U4 RNA.

    PubMed

    Reddy, R; Henning, D; Busch, H

    1981-04-10

    U4 RNA is one of the "capped" nuclear snRNAs recently found to be precipitable by anti-Sm antibodies as ribonucleoprotein particles. U4 RNA, along with other snRNAs, has been implicated in hnRNA processing, mRNA transport, or both (Lerner, M. R., Boyle, J., Mount, S., Wolin, S., and Steitz, J. A. (1980) Nature 283, 220-224). Since the proteins bound to different snRNAs appear to be the same, the functions of different snRNPs might be dependent on the RNA components. To help understand the function of U4 RNP, the nucleotide sequence of U4 RNA was determined. The sequence is (formula see text) In addition to the modified nucleotides in the "cap," U4 RNA contains Am at position 63 and m6A at position 98. It also exhibited A-C microheterogeneity at position 97. PMID:6162848

  12. DNA Instability Maintains the Repeat Length of the Yeast RNA Polymerase II C-terminal Domain.

    PubMed

    Morrill, Summer A; Exner, Alexandra E; Babokhov, Michael; Reinfeld, Bradley I; Fuchs, Stephen M

    2016-05-27

    The C-terminal domain (CTD) of RNA polymerase II in eukaryotes is comprised of tandemly repeating units of a conserved seven-amino acid sequence. The number of repeats is, however, quite variable across different organisms. Furthermore, previous studies have identified evidence of rearrangements within the CTD coding region, suggesting that DNA instability may play a role in regulating or maintaining CTD repeat number. The work described here establishes a clear connection between DNA instability and CTD repeat number in Saccharomyces cerevisiae. First, analysis of 36 diverse S. cerevisiae isolates revealed evidence of numerous past rearrangements within the DNA sequence that encodes the CTD. Interestingly, the total number of CTD repeats was relatively static (24-26 repeats in all strains), suggesting a balancing act between repeat expansion and contraction. In an effort to explore the genetic plasticity within this region, we measured the rates of repeat expansion and contraction using novel reporters and a doxycycline-regulated expression system for RPB1. In efforts to determine the mechanisms leading to CTD repeat variability, we identified the presence of DNA secondary structures, specifically G-quadruplex-like DNA, within the CTD coding region. Furthermore, we demonstrated that mutating PIF1, a G-quadruplex-specific helicase, results in increased CTD repeat length polymorphisms. We also determined that RAD52 is necessary for CTD repeat expansion but not contraction, identifying a role for recombination in repeat expansion. Results from these DNA rearrangements may help explain the CTD copy number variation seen across eukaryotes, as well as support a model of CTD expansion and contraction to maintain CTD integrity and overall length. PMID:27026700

  13. DNA Instability Maintains the Repeat Length of the Yeast RNA Polymerase II C-terminal Domain

    PubMed Central

    Morrill, Summer A.; Exner, Alexandra E.; Babokhov, Michael; Reinfeld, Bradley I.; Fuchs, Stephen M.

    2016-01-01

    The C-terminal domain (CTD) of RNA polymerase II in eukaryotes is comprised of tandemly repeating units of a conserved seven-amino acid sequence. The number of repeats is, however, quite variable across different organisms. Furthermore, previous studies have identified evidence of rearrangements within the CTD coding region, suggesting that DNA instability may play a role in regulating or maintaining CTD repeat number. The work described here establishes a clear connection between DNA instability and CTD repeat number in Saccharomyces cerevisiae. First, analysis of 36 diverse S. cerevisiae isolates revealed evidence of numerous past rearrangements within the DNA sequence that encodes the CTD. Interestingly, the total number of CTD repeats was relatively static (24–26 repeats in all strains), suggesting a balancing act between repeat expansion and contraction. In an effort to explore the genetic plasticity within this region, we measured the rates of repeat expansion and contraction using novel reporters and a doxycycline-regulated expression system for RPB1. In efforts to determine the mechanisms leading to CTD repeat variability, we identified the presence of DNA secondary structures, specifically G-quadruplex-like DNA, within the CTD coding region. Furthermore, we demonstrated that mutating PIF1, a G-quadruplex-specific helicase, results in increased CTD repeat length polymorphisms. We also determined that RAD52 is necessary for CTD repeat expansion but not contraction, identifying a role for recombination in repeat expansion. Results from these DNA rearrangements may help explain the CTD copy number variation seen across eukaryotes, as well as support a model of CTD expansion and contraction to maintain CTD integrity and overall length. PMID:27026700

  14. RNAcentral: an international database of ncRNA sequences

    PubMed Central

    2015-01-01

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species. PMID:25352543

  15. RNAcentral: an international database of ncRNA sequences.

    PubMed

    2015-01-01

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species. PMID:25352543

  16. RNAcentral: an international database of ncRNA sequences

    DOE PAGESBeta

    Williams, Kelly Porter

    2014-10-28

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.

  17. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  18. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  19. Complete sequence of RNA1 and subgenomic RNA3 of Atlantic halibut nodavirus (AHNV).

    PubMed

    Sommerset, Ingunn; Nerland, Audun H

    2004-03-10

    The Nodaviridae are divided into the alphanodavirus genus, which infects insects, and the betanodavirus genus, which infects fishes. Betanodaviruses are the causative agent of viral encephalopathy and retinopathy (VER) in a number of cultivated marine fish species. The Nodaviridae are small non-enveloped RNA viruses that contain a genome consisting of 2 single-stranded positive-sense RNA segments: RNA1 (3.1 kb), which encodes the viral part of the RNA-dependent RNA polymerase (RdRp); and RNA2 (1.4 kb), which encodes the capsid protein. In addition to RNA1 and RNA2, a subgenomic transcript of RNA1, RNA3, is present in infected cells. We have cloned and sequenced RNA1 from the Atlantic halibut Hippoglossus hippoglossus nodavirus (AHNV), and for the first time, the sequence of a betanodaviral subgenomic RNA3 has been determined. AHNV RNA1 was 3100 nucleotides in length and contained a main open reading frame encoding a polypeptide of 981 amino acids. Conservative motifs for RdRp were found in the deduced amino acid sequence. RNA3 was 371 nucleotides in length, and contained an open reading frame encoding a peptide of 75 amino acids corresponding to a hypothetical B2 protein, although sequence alignments with the alphanodavirus B2 proteins showed only marginal similarities. AHNV RNA replication in the fish cell-line SSN-1 (derived from striped snakehead) was analysed by Northern blot analysis, which indicated that RNA3 was synthesised in large amounts (compared to RNA1) at an early point in time post-infection. PMID:15109133
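
    The relationship between the reported peptide lengths and the lengths of the RNAs can be checked with a simple open reading frame scan. The sketch below is a generic forward-strand ORF finder, not the analysis used in the study; it assumes an ATG start codon and the standard stop codons, and the test sequence is made up.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, n_codons) of the longest ATG..stop ORF on this strand."""
    seq = seq.upper().replace("U", "T")
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3  # sense codons, stop excluded
                if n_codons > best[2]:
                    best = (start, i + 3, n_codons)
                start = None
    return best


def orf_nucleotides(n_amino_acids):
    """Nucleotides needed for a peptide of this length, including the stop codon."""
    return 3 * n_amino_acids + 3


print(longest_orf("CCATGAAATTTGGGTAACC"))        # made-up test sequence
print(orf_nucleotides(981), orf_nucleotides(75))  # ORF sizes implied by the abstract
```

    With these assumptions, a 981-residue polypeptide needs 2946 nucleotides, which fits within the 3100-nucleotide RNA1, and a 75-residue peptide needs 228 nucleotides, which fits within the 371-nucleotide RNA3.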

  20. Small RNA Deep Sequencing Reveals Role for Arabidopsis thaliana RNA-Dependent RNA Polymerases in Viral siRNA Biogenesis

    PubMed Central

    Qi, Xiaopeng; Bao, Forrest Sheng; Xie, Zhixin

    2009-01-01

    RNA silencing functions as an important antiviral defense mechanism in a broad range of eukaryotes. In plants, biogenesis of several classes of endogenous small interfering RNAs (siRNAs) requires RNA-dependent RNA Polymerase (RDR) activities. Members of the RDR family proteins, including RDR1 and RDR6, have also been implicated in antiviral defense, although a direct role for RDRs in viral siRNA biogenesis has yet to be demonstrated. Using a crucifer-infecting strain of Tobacco Mosaic Virus (TMV-Cg) and Arabidopsis thaliana as a model system, we analyzed the viral small RNA profile in wild-type plants as well as rdr mutants by applying small RNA deep sequencing technology. Over 100,000 TMV-Cg-specific small RNA reads, mostly of 21- (78.4%) and 22-nucleotide (12.9%) in size and originating predominately (79.9%) from the genomic sense RNA strand, were captured at an early infection stage, yielding the first high-resolution small RNA map for a plant virus. The TMV-Cg genome harbored multiple, highly reproducible small RNA-generating hot spots that corresponded to regions with no apparent local hairpin-forming capacity. Significantly, both the rdr1 and rdr6 mutants exhibited globally reduced levels of viral small RNA production as well as reduced strand bias in viral small RNA population, revealing an important role for these host RDRs in viral siRNA biogenesis. In addition, an informatics analysis showed that a large set of host genes could be potentially targeted by TMV-Cg-derived siRNAs for posttranscriptional silencing. Two of such predicted host targets, which encode a cleavage and polyadenylation specificity factor (CPSF30) and an unknown protein similar to translocon-associated protein alpha (TRAP α), respectively, yielded a positive result in cleavage validation by 5′RACE assays. Our data raised the interesting possibility for viral siRNA-mediated virus-host interactions that may contribute to viral pathogenicity and host specificity. PMID:19308254
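
    The two summary statistics quoted above, the read-length distribution and the strand bias, are straightforward to compute once reads have been mapped. The sketch below assumes a hypothetical input format of (sequence, strand) pairs with "+" marking the genomic sense strand; the reads shown are toy values, not data from the study.

```python
from collections import Counter


def size_profile(reads):
    """Fraction of reads at each length; reads are (sequence, strand) pairs."""
    lengths = Counter(len(seq) for seq, _ in reads)
    total = sum(lengths.values())
    return {n: c / total for n, c in sorted(lengths.items())}


def sense_fraction(reads):
    """Fraction of reads mapping to the genomic sense ('+') strand."""
    return sum(1 for _, strand in reads if strand == "+") / len(reads)


# Toy reads for illustration only.
toy_reads = [("A" * 21, "+"), ("C" * 21, "+"), ("G" * 22, "-"), ("U" * 21, "+")]
print(size_profile(toy_reads), sense_fraction(toy_reads))
```

    Comparing these two numbers between wild-type and mutant libraries is one simple way to express the "reduced strand bias" reported for the rdr1 and rdr6 mutants.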

  1. FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance.

    PubMed

    Urayama, Syun-Ichi; Takaki, Yoshihiro; Nunoura, Takuro

    2016-03-26

    Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest. PMID:26877136

  2. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision

    PubMed Central

    Denise, Hubert; Moschos, Sterghios A.; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-01-01

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034–encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5′ RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC). PMID:24496437

  3. FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance

    PubMed Central

    Urayama, Syun-ichi; Takaki, Yoshihiro; Nunoura, Takuro

    2016-01-01

    Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest. PMID:26877136

  4. Complete Genome Sequence of the WHO International Standard for HIV-2 RNA Determined by Deep Sequencing

    PubMed Central

    Ham, Claire; Morris, Clare

    2016-01-01

    The World Health Organization (WHO) International Standard for HIV-2 RNA nucleic acid assays was characterized by complete genome deep sequencing. The entire coding sequence and flanking long terminal repeats (LTRs), including minority species, were assigned subtype A. This information will aid design, development, and evaluation of HIV-2 RNA amplification assays. PMID:26847885

  5. Sequence-non-specific effects of RNA interference triggers and microRNA regulators

    PubMed Central

    Olejniczak, Marta; Galka, Paulina; Krzyzosiak, Wlodzimierz J.

    2010-01-01

    RNA reagents of diverse lengths and structures, unmodified or containing various chemical modifications are powerful tools of RNA interference and microRNA technologies. These reagents which are either delivered to cells using appropriate carriers or are expressed in cells from suitable vectors often cause unintended sequence-non-specific immune responses besides triggering intended sequence-specific silencing effects. This article reviews the present state of knowledge regarding the cellular sensors of foreign RNA, the signaling pathways these sensors mobilize and shows which specific features of the RNA reagents set the responsive systems on alert. The representative examples of toxic effects caused in the investigated cell lines and tissues by the RNAs of specific types and structures are collected and may be instructive for further studies of sequence-non-specific responses to foreign RNA in human cells. PMID:19843612

  6. The chemical structure of DNA sequence signals for RNA transcription

    NASA Technical Reports Server (NTRS)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  7. Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish

    Technology Transfer Automated Retrieval System (TEKTRAN)

    RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...

  8. Spliced synthetic genes as internal controls in RNA sequencing experiments.

    PubMed

    Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

    2016-09-01

    RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome. PMID:27502218
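
    The idea of deriving a per-sample scaling factor from spike-ins of known concentration can be illustrated generically. The sketch below is not the sequins software; it assumes hypothetical spike-in identifiers, known input amounts, and observed abundances, and uses the geometric mean of the known-to-observed ratios as the factor.

```python
import math


def scaling_factor(known, observed):
    """Geometric-mean ratio of known spike-in input to observed abundance."""
    logs = [
        math.log(known[name] / observed[name])
        for name in known
        if observed.get(name, 0) > 0  # skip spike-ins that were not detected
    ]
    return math.exp(sum(logs) / len(logs))


# Hypothetical spike-in names and values, for illustration only.
known_input = {"spike_1": 1.0, "spike_2": 10.0, "spike_3": 100.0}
sample_a = {"spike_1": 2.0, "spike_2": 20.0, "spike_3": 200.0}
sample_b = {"spike_1": 0.5, "spike_2": 5.0, "spike_3": 50.0}

factor_a = scaling_factor(known_input, sample_a)
factor_b = scaling_factor(known_input, sample_b)
print(factor_a, factor_b)
```

    Multiplying each sample's endogenous abundances by its factor would put the two samples on a common scale, under the assumption that the spike-ins were added in equal amounts to both.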

  9. The nucleotide sequence of cowpea mosaic virus B RNA

    PubMed Central

    Lomonossoff, G.P.; Shanks, M.

    1983-01-01

    The complete sequence of the bottom component RNA (B RNA) of cowpea mosaic virus (CPMV) has been determined. Restriction enzyme fragments of double-stranded cDNA were cloned in M13 and the sequence of the inserts was determined by a combination of enzymatic and chemical sequencing techniques. Additional sequence information was obtained by primed synthesis on first strand cDNA. The complete sequence deduced is 5889 nucleotides long excluding the 3' poly(A), and contains an open reading frame sufficient to code for a polypeptide of mol. wt. 207 760. The coding region is flanked by a 5' leader sequence of 206 nucleotides and a 3' non-coding region of 82 residues which does not contain a polyadenylation signal. PMID:16453487
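
    The figures in this abstract can be cross-checked with a little arithmetic. The sketch below assumes that the coding region is everything between the 5' leader and the 3' non-coding region and that it ends with a single stop codon; those assumptions are ours, not statements from the abstract.

```python
genome_nt = 5889    # total length, excluding the 3' poly(A)
leader_nt = 206     # 5' leader
trailer_nt = 82     # 3' non-coding region
mol_wt = 207760     # reported polypeptide molecular weight

coding_nt = genome_nt - leader_nt - trailer_nt
codons, remainder = divmod(coding_nt, 3)
residues = codons - 1  # assumption: the final codon is a stop codon

print(coding_nt, codons, remainder, residues)
print(round(mol_wt / residues, 1))  # average mass per residue
```

    Under these assumptions the coding region is a whole number of codons, and the implied average residue mass is in the range expected for a protein, so the reported lengths and molecular weight are mutually consistent.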

  10. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  11. Identifying novel sequence variants of RNA 3D motifs.

    PubMed

    Zirbel, Craig L; Roll, James; Sweeney, Blake A; Petrov, Anton I; Pirrung, Meg; Leontis, Neocles B

    2015-09-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723
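
    The use of randomly generated sequences to set an acceptance threshold can be sketched independently of the SCFG/MRF models themselves. In the snippet below, `toy_score` is a hypothetical stand-in for a motif model's scoring function (it is not JAR3D), and the cutoff is taken as an empirical quantile of the random-sequence scores so that roughly a chosen fraction of random sequences would be accepted.

```python
import random


def acceptance_threshold(score_fn, length, n_random=1000, fpr=0.05, seed=0):
    """Score cutoff that only about `fpr` of random sequences would exceed."""
    rng = random.Random(seed)
    scores = sorted(
        score_fn("".join(rng.choice("ACGU") for _ in range(length)))
        for _ in range(n_random)
    )
    # Index of the (1 - fpr) quantile among the random-sequence scores.
    cutoff_index = min(n_random - 1, int((1 - fpr) * n_random))
    return scores[cutoff_index]


def toy_score(seq):
    """Hypothetical stand-in for a motif model score: the G+C count."""
    return seq.count("G") + seq.count("C")


threshold = acceptance_threshold(toy_score, length=10)
print(threshold)
```

    A candidate loop sequence would then be accepted for the motif group only if its score exceeds the threshold; lowering `fpr` makes the cutoff stricter.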

  12. Discovering common stem–loop motifs in unaligned RNA sequences

    PubMed Central

    Gorodkin, Jan; Stricklin, Shawn L.; Stormo, Gary D.

    2001-01-01

    Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem–loop RNA motifs in unaligned sequences, such as UTRs from post-transcriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem–Loop Align SearcH (SLASH), which will perform stem–loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/. PMID:11353083

  13. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  14. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  15. Evaluation of Commercially Available RNA Amplification Kits for RNA Sequencing Using Very Low Input Amounts of Total RNA

    PubMed Central

    Shanker, Savita; Paulson, Ariel; Edenberg, Howard J.; Peak, Allison; Perera, Anoja; Alekseyev, Yuriy O.; Beckloff, Nicholas; Bivens, Nathan J.; Donnelly, Robert; Gillaspy, Allison F.; Grove, Deborah; Gu, Weikuan; Jafari, Nadereh; Kerley-Hamilton, Joanna S.; Lyons, Robert H.; Tepper, Clifford

    2015-01-01

    This article includes supplemental data. Please visit http://www.fasebj.org to obtain this information. Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA. PMID:25649271

  16. repRNA: a web server for generating various feature vectors of RNA sequences.

    PubMed

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2016-02-01

    With the rapid growth of RNA sequences generated in the postgenomic age, it is highly desirable to develop a flexible method that can generate various kinds of vectors to represent these sequences by focusing on their different features. This is because nearly all the existing machine-learning methods, such as SVM (support vector machine) and KNN (k-nearest neighbor), can only handle vectors but not sequences. To meet the increasing demands and speed up the genome analyses, we have developed a new web server, called "representations of RNA sequences" (repRNA). Compared with the existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors for users to choose according to their investigation purposes; (2) it allows users to select the features from 22 built-in physicochemical properties and even those defined by the users themselves; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/ . PMID:26085220
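
    As an illustration of turning a sequence into a fixed-length vector that SVM- or KNN-style methods can consume, the sketch below computes normalized k-mer frequencies. This is a generic encoding, not repRNA's implementation; whether it corresponds exactly to one of the server's 11 modes is an assumption, and the function name `kmer_vector` is hypothetical.

```python
from itertools import product

def kmer_vector(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed lexicographic order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    # Every sequence maps to a vector of length len(alphabet) ** k.
    return [counts[km] / total if total else 0.0 for km in kmers]

vector = kmer_vector("GGAUCC", k=2)
```

    Because the k-mer order is fixed, vectors from sequences of different lengths are directly comparable, which is the property the vector-only learners mentioned above require.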

  17. RNAcentral: A vision for an international database of RNA sequences.

    PubMed

    Bateman, Alex; Agrawal, Shipra; Birney, Ewan; Bruford, Elspeth A; Bujnicki, Janusz M; Cochrane, Guy; Cole, James R; Dinger, Marcel E; Enright, Anton J; Gardner, Paul P; Gautheret, Daniel; Griffiths-Jones, Sam; Harrow, Jen; Herrero, Javier; Holmes, Ian H; Huang, Hsien-Da; Kelly, Krystyna A; Kersey, Paul; Kozomara, Ana; Lowe, Todd M; Marz, Manja; Moxon, Simon; Pruitt, Kim D; Samuelsson, Tore; Stadler, Peter F; Vilella, Albert J; Vogel, Jan-Hinnerk; Williams, Kelly P; Wright, Mathew W; Zwieb, Christian

    2011-11-01

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor. PMID:21940779

  18. RNAcentral: A vision for an international database of RNA sequences

    PubMed Central

    Bateman, Alex; Agrawal, Shipra; Birney, Ewan; Bruford, Elspeth A.; Bujnicki, Janusz M.; Cochrane, Guy; Cole, James R.; Dinger, Marcel E.; Enright, Anton J.; Gardner, Paul P.; Gautheret, Daniel; Griffiths-Jones, Sam; Harrow, Jen; Herrero, Javier; Holmes, Ian H.; Huang, Hsien-Da; Kelly, Krystyna A.; Kersey, Paul; Kozomara, Ana; Lowe, Todd M.; Marz, Manja; Moxon, Simon; Pruitt, Kim D.; Samuelsson, Tore; Stadler, Peter F.; Vilella, Albert J.; Vogel, Jan-Hinnerk; Williams, Kelly P.; Wright, Mathew W.; Zwieb, Christian

    2011-01-01

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor. PMID:21940779

  19. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  20. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  1. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  2. Reentrant Melting of RNA with Quenched Sequence Randomness

    NASA Astrophysics Data System (ADS)

    Hayrapetyan, G. N.; Iannelli, F.; Lekscha, J.; Morozov, V. F.; Netz, R. R.; Mamasakhlisov, Y. Sh.

    2014-08-01

    The effect of quenched sequence disorder on the thermodynamics of RNA secondary structure formation is investigated for two- and four-letter alphabet models using the constrained annealing approach, from which the temperature behavior of the free energy, specific heat, and helicity is analytically obtained. For competing base pairing energies, the calculations reveal reentrant melting at low temperatures, in excellent agreement with numerical results. Our results suggest an additional mechanism for the experimental phenomenon of RNA cold denaturation.

  3. Sequence determinants of improved CRISPR sgRNA design

    PubMed Central

    Xu, Han; Xiao, Tengfei; Chen, Chen-Hao; Li, Wei; Meyer, Clifford A.; Wu, Qiu; Wu, Di; Cong, Le; Zhang, Feng; Liu, Jun S.; Brown, Myles; Liu, X. Shirley

    2015-01-01

    The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies. PMID:26063738

  4. Quantifying sequence and structural features of protein-RNA interactions.

    PubMed

    Li, Songling; Yamashita, Kazuo; Amada, Karlou Mar; Standley, Daron M

    2014-09-01

    Increasing awareness of the importance of protein-RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely structure-based predictors tend to have high specificity, but lower sensitivity. Here we quantified the contribution of both sequence- and structure-based features as indicators of RNA-binding propensity using a machine-learning approach. In order to capture structural information for proteins without a known structure, we used homology modeling to extract the relevant structural features. Several novel and modified features enhanced the accuracy of residue-level RNA-binding propensity beyond what has been reported previously, including by meta-prediction servers. These features include: hidden Markov model-based evolutionary conservation, surface deformations based on the Laplacian norm formalism, and relative solvent accessibility partitioned into backbone and side chain contributions. We constructed a web server called aaRNA that implements the proposed method and demonstrate its use in identifying putative RNA binding sites. PMID:25063293

  5. Phylogenetic relationships of Cryptosporidium determined by ribosomal RNA sequence comparison.

    PubMed

    Johnson, A M; Fielke, R; Lumb, R; Baverstock, P R

    1990-04-01

    Reverse transcription of total cellular RNA was used to obtain a partial sequence of the small subunit ribosomal RNA of Cryptosporidium, a protist currently placed in the phylum Apicomplexa. The semi-conserved regions were aligned with homologous sequences in a range of other eukaryotes, and the evolutionary relationships of Cryptosporidium were determined by two different methods of phylogenetic analysis. The prokaryotes Escherichia coli and Halobacterium cuti were included as outgroups. The results do not show an especially close relationship of Cryptosporidium to other members of the phylum Apicomplexa. PMID:2332273

  6. Probing dimensionality beyond the linear sequence of mRNA.

    PubMed

    Del Campo, Cristian; Ignatova, Zoya

    2016-05-01

    mRNA is a nexus entity between DNA and translating ribosomes. Recent developments in deep sequencing technologies coupled with structural probing have revealed new insights beyond the classic role of mRNA and place it more centrally as a direct effector of a variety of processes, including translation, cellular localization, and mRNA degradation. Here, we highlight emerging approaches to probe mRNA secondary structure on a global transcriptome-wide level and compare their potential and resolution. Combined approaches deliver a richer and more complex picture. While our understanding on the effect of secondary structure for various cellular processes is quite advanced, the next challenge is to unravel more complex mRNA architectures and tertiary interactions. PMID:26650615

  7. Statistical mechanics of secondary structures formed by random RNA sequences

    NASA Astrophysics Data System (ADS)

    Bundschuh, Ralf

    2003-03-01

    In addition to its importance for the biological function of RNA molecules, RNA secondary structure formation is an interesting system from the statistical physics point of view. The ensemble of secondary structures of random RNA sequences shows a rich phase diagram with distinct native, denatured, molten, and glassy phases separated by thermodynamical phase transitions. These phase transitions are driven by the competition between thermal fluctuations, the disorder frozen into the specific sequence of a given RNA molecule, and the evolutionary bias towards the formation of some biologically relevant structure. Yet, in contrast to the protein folding problem, which is driven by very similar principles and shows a similar phase diagram, RNA secondary structure formation can be represented by a simple diagrammatic language which allows the application of various analytical and numerical methods. This makes RNA secondary structure formation an ideal model system for heteropolymer folding. In the talk, I will characterize and explain the complex behaviour of RNA folding using several simple models and discuss possible implications for biological processes.

  8. High-throughput RNA interference screening using pooled shRNA libraries and next generation sequencing

    PubMed Central

    2011-01-01

    RNA interference (RNAi) screening is a state-of-the-art technology that enables the dissection of biological processes and disease-related phenotypes. The commercial availability of genome-wide, short hairpin RNA (shRNA) libraries has fueled interest in this area but the generation and analysis of these complex data remain a challenge. Here, we describe complete experimental protocols and novel open source computational methodologies, shALIGN and shRNAseq, that allow RNAi screens to be rapidly deconvoluted using next generation sequencing. Our computational pipeline offers efficient screen analysis and the flexibility and scalability to quickly incorporate future developments in shRNA library technology. PMID:22018332
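
    The deconvolution idea (assigning sequencing reads to hairpins in a pooled library and comparing their abundance between time points) can be sketched as follows. This is not the shALIGN or shRNAseq code: the exact-match strategy, the function names, the pseudocount, and the tiny example library are all assumptions for illustration, and normalization for sequencing depth is deliberately omitted.

```python
import math
from collections import Counter

def count_hairpins(reads, library, start=0):
    """Count reads whose subsequence at `start` exactly matches a hairpin.

    `library` maps hairpin sequence -> hairpin identifier; all hairpin
    sequences are assumed to have the same length.
    """
    length = len(next(iter(library)))
    counts = Counter()
    for read in reads:
        hairpin_id = library.get(read[start:start + length])
        if hairpin_id is not None:  # unmatched reads are ignored
            counts[hairpin_id] += 1
    return counts

def log2_fold_change(before, after, hairpin_id, pseudocount=1):
    """Log2 ratio of counts after vs. before, with a pseudocount to avoid
    division by zero for hairpins that drop out of the pool."""
    return math.log2((after[hairpin_id] + pseudocount) /
                     (before[hairpin_id] + pseudocount))

# Hypothetical toy library and reads, for illustration only.
library = {"ACGTAC": "sh1", "TTGCAA": "sh2"}
reads_t0 = ["ACGTACGG", "ACGTACTT", "TTGCAAGG", "GGGGGGGG"]
reads_t1 = ["ACGTACGG", "TTGCAAGG", "TTGCAACC", "TTGCAAAA"]
counts_t0 = count_hairpins(reads_t0, library)
counts_t1 = count_hairpins(reads_t1, library)
```

    A positive log2 fold change indicates a hairpin enriched at the later time point; a negative value indicates depletion.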

  9. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc. PMID:27322404

  10. Transcriptome Profiling of Developing Murine Lens Through RNA Sequencing

    PubMed Central

    Khan, Shahid Y.; Hackett, Sean F.; Lee, Mei-Chong W.; Pourmand, Nader; Talbot, C. Conover; Riazuddin, S. Amer

    2015-01-01

    Purpose: The transcriptome is the entire repertoire of transcripts present in a cell at any particular time. We undertook a next-generation whole transcriptome sequencing approach to gain insight into the transcriptional landscape of the developing mouse lens. Methods: We ascertained mouse lenses at six developmental time points including two embryonic (E15 and E18) and four postnatal stages (P0, P3, P6, and P9). The ocular tissue at each time point was maintained as two distinct pools serving as biological replicates for each developmental stage. The mRNA and small RNA libraries were paired-end sequenced on Illumina HiSeq 2000 and subsequently analyzed using bioinformatics tools. Results: Mapping of mRNA and small RNA libraries generated 187.56 and 154.22 million paired-end reads, respectively. We detected a total of 14,465 genes in the mouse ocular lens at the above-mentioned six developmental stages. Of these, 46 genes exhibited a 40-fold differential (higher or lower) expression at one of the five developmental stages (E18, P0, P3, P6, and P9) compared with their expression level at E15. Likewise, small RNA profiling identified 379 microRNAs (miRNAs) expressed in mouse lens at six developmental time points. Of these, 49 miRNAs manifested an 8-fold differential (higher or lower) expression at one of the five developmental stages mentioned above, compared with their expression level at E15. Conclusions: We report a comprehensive profile of the developing murine lens transcriptome, including both mRNA and miRNA, through next-generation RNA sequencing. A complete repository of the lens transcriptome of six developmental time points will be monumental in elucidating processes essential for the development of the ocular lens and maintenance of its transparency. PMID:26225632

  11. MicroRNA Expression Profile in Penile Cancer Revealed by Next-Generation Small RNA Sequencing

    PubMed Central

    Zhang, Yuanwei; Xu, Bo; Zhou, Jun; Fan, Song; Hao, Zongyao; Shi, Haoqiang; Zhang, Xiansheng; Kong, Rui; Xu, Lingfan; Gao, Jingjing; Zou, Duohong; Liang, Chaozhao

    2015-01-01

    Penile cancer (PeCa) is a relatively rare tumor entity but possesses higher morbidity and mortality rates, especially in developing countries. To date, the concrete pathogenic signaling pathways and core machineries involved in tumorigenesis and progression of PeCa remain to be elucidated. Several studies suggested that miRNAs, which modulate gene expression at the posttranscriptional level, were frequently mis-regulated and aberrantly expressed in human cancers. However, the miRNA profile in human PeCa has not been reported before. In the present study, the miRNA profile was obtained from 10 fresh penile cancerous tissues and matched adjacent non-cancerous tissues via next-generation sequencing. As a result, a total of 751 and 806 annotated miRNAs were identified in normal and cancerous penile tissues, respectively. Among these, 56 miRNAs with significantly different expression levels between paired tissues were identified. Subsequently, several annotated miRNAs were selected randomly and validated using quantitative real-time PCR. Compared with previous publications regarding altered miRNA expression in various cancers, especially genitourinary (prostate, bladder, kidney, testis) cancers, the vast majority of deregulated miRNAs showed a similar expression pattern in penile cancer. Moreover, the bioinformatics analyses suggested that the putative target genes of differentially expressed miRNAs between cancerous and matched normal penile tissues were tightly associated with cell junction, proliferation, growth as well as genomic instability and so on, by modulating Wnt, MAPK, p53, PI3K-Akt, Notch and TGF-β signaling pathways, which were all well-established to participate in cancer initiation and progression. Our work presents a global view of the differentially expressed miRNAs and potentially regulatory networks of their target genes for clarifying the pathogenic transformation of normal penis to PeCa, which research resource also provides new insights

  12. Using RNA Sequencing to Classify Organisms into Three Primary Kingdoms.

    ERIC Educational Resources Information Center

    Evans, Robert H.

    1983-01-01

    Using the biochemical record to class archaebacteria, eukaryotes, and eubacteria involves abstractions difficult for the concrete learner. Therefore, a method is provided in which students discover some basic tenets of biochemical classification and apply them in a "hands-on" classification problem. The method involves use of RNA sequencing. (JN)

  13. RNA sequencing of the nephron transcriptome: a technical note

    PubMed Central

    Lee, Jae Wook

    2015-01-01

    To understand the functions of the kidney, the transcriptome of each part of the nephron needs to be profiled using a highly sensitive and unbiased tool. RNA sequencing (RNA-seq) has revolutionized transcriptomic research, enabling researchers to define transcription activity and functions of genomic elements with unprecedented sensitivity and precision. Recently, RNA-seq for polyadenylated messenger RNAs [poly(A)′-mRNAs] and classical microdissection were successfully combined to investigate the transcriptome of glomeruli and 14 different renal tubule segments. A rat kidney is perfused with and incubated in collagenase solution, and the digested kidney is manually dissected under a stereomicroscope. Individual glomeruli and renal tubule segments are identified by their anatomical and morphological characteristics and collected in phosphate-buffered saline. Poly(A)′-tailed mRNAs are released from cell lysate, captured by oligo-dT primers, and made into complementary DNAs (cDNAs) using a highly sensitive reverse transcription method. These cDNAs are sheared by sonication and prepared into adapter-ligated cDNA libraries for Illumina sequencing. Nucleotide sequences reported from the sequencing reaction are mapped to the rat reference genome for gene expression analysis. These RNA-seq transcriptomic data were highly consistent with prior knowledge of gene expression along the nephron. The gene expression data obtained in this work are available as a public Web page (https://helixweb.nih.gov/ESBL/Database/NephronRNAseq/) and can be used to explore the transcriptomic landscape of the nephron. PMID:26779425

  14. Sequence specificity of mRNA N6-adenosine methyltransferase.

    PubMed

    Csepany, T; Lin, A; Baldick, C J; Beemon, K

    1990-11-25

    The sequence specificity of chicken mRNA N6-adenosine methyltransferase has been investigated in vivo. Localization of six new N6-methyladenosine sites on Rous sarcoma virus (RSV) virion RNA has confirmed our extended consensus sequence for methylation: RGACU, where R is usually a G (7/12). We have also observed A (2/12) and U (3/12) at the -2 position (relative to m6A at +1) but never a C. At the +3 position, the U was observed 10/12 times; an A and a C were observed once each in weakly methylated sequences. The extent of methylation varied between the different sites up to a maximum of about 90%. To test the significance of this consensus sequence, it was altered by site-specific mutagenesis, and methylation was assayed after transfection of mutated RSV DNA into chicken embryo fibroblasts. We found that changing the G at -1 or the U at +3 to any other residue inhibited methylation. However, inhibition of methylation at all four of the major sites in the RSV src gene did not detectably alter the steady-state levels of the three viral RNA species or viral infectivity. Additional mutants that inactivated the src protein kinase activity produced less virus and exhibited relatively less src mRNA in infected cells. PMID:2173695
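
    The consensus described above lends itself to a simple pattern scan. The sketch below is an illustration only, not a method from the study: the strict pattern reads R as a purine (G or A) followed by GACU, while the relaxed pattern additionally allows U at the -2 position and A or C at +3, as reported for the observed sites. The function and pattern names are hypothetical, and a match indicates only a candidate site, not actual methylation.

```python
import re

# Lookahead with a capture group so that overlapping candidates are found.
STRICT = re.compile(r"(?=([GA]GACU))")        # R-G-A-C-U, R = G or A
RELAXED = re.compile(r"(?=([GAU]GAC[UAC]))")  # adds U at -2; A or C at +3

def find_sites(rna, pattern=STRICT):
    """Return 0-based positions of the candidate methylated A (+1 position)."""
    rna = rna.upper().replace("T", "U")
    # The A sits two bases downstream of the start of each 5-mer match.
    return [m.start() + 2 for m in pattern.finditer(rna)]

example = "CCGGACUAAUGACAGG"
strict_sites = find_sites(example)            # → [4]
relaxed_sites = find_sites(example, RELAXED)  # → [4, 11]
```

    The second hit in the relaxed scan (UGACA) carries both of the less common variants, which the abstract associates with weaker methylation.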

  15. SRP-RNA sequence alignment and secondary structure.

    PubMed Central

    Larsen, N; Zwieb, C

    1991-01-01

    The secondary structures of the RNAs from the signal recognition particle, termed SRP-RNA, were derived by comparative analyses of an alignment of 39 sequences. The models are minimal in that only base pairs are included for which there is comparative evidence. The structures represent refinements of earlier versions and include a new short helix. PMID:1707519

  16. siRNA release from pri-miRNA scaffolds is controlled by the sequence and structure of RNA.

    PubMed

    Galka-Marciniak, Paulina; Olejniczak, Marta; Starega-Roslan, Julia; Szczesniak, Michal W; Makalowska, Izabela; Krzyzosiak, Wlodzimierz J

    2016-04-01

    shmiRs are pri-miRNA-based RNA interference triggers from which exogenous siRNAs are expressed in cells to silence target genes. These reagents are very promising tools in RNAi in vivo applications due to their good activity profile and lower toxicity than observed for other vector-based reagents such as shRNAs. In this study, using high-resolution northern blotting and small RNA sequencing, we investigated the precision with which RNases Drosha and Dicer process shmiRs. The fidelity of siRNA release from the commonly used pri-miRNA shuttles was found to depend on both the siRNA insert and the pri-miR scaffold. Then, we searched for specific factors that may affect the precision of siRNA release and found that both the structural features of shmiR hairpins and the nucleotide sequence at Drosha and Dicer processing sites contribute to cleavage site selection and cleavage precision. An analysis of multiple shRNA intermediates generated from several reagents revealed the complexity of shmiR processing by Drosha and demonstrated that Dicer selects substrates for further processing. Aside from providing new basic knowledge regarding the specificity of nucleases involved in miRNA biogenesis, our results facilitate the rational design of more efficient genetic reagents for RNAi technology. PMID:26921501

  17. Toward Rare Blood Cell Preservation for RNA Sequencing.

    PubMed

    Vickovic, Sanja; Ahmadian, Afshin; Lewensohn, Rolf; Lundeberg, Joakim

    2015-07-01

    Cancer is driven by various events leading to cell differentiation and disease progression. Molecular tools are powerful approaches for describing how and why these events occur. With the growing field of next-generation DNA sequencing, there is an increasing need for high-quality nucleic acids derived from human cells and tissues, a prerequisite for successful cell profiling. Although advances in RNA preservation have been made, some of the largest biobanks still do not employ RNA blood preservation as standard because of limitations in low blood-input volume and RNA stability over the whole gene body. Therefore, we have developed a robust protocol for blood preservation and long-term storage while maintaining RNA integrity. Furthermore, we explored the possibility of using the protocol for preserving rare cell samples, such as circulating tumor cells. The results of our study confirmed that gene expression was not impacted by the preservation procedure (r² > 0.88) or by long-term storage (r² = 0.95), with RNA integrity number values averaging over 8. Similarly, cell surface antigens were still available for antibody selection (r² = 0.95). Lastly, data mining for fusion events showed that it was possible to detect rare tumor cells among a background of other cells present in blood irrespective of fixation. Thus, the developed protocol would be suitable for rare blood cell preservation followed by RNA sequencing analysis. PMID:25989392

  18. Learning to Predict miRNA-mRNA Interactions from AGO CLIP Sequencing and CLASH Data.

    PubMed

    Lu, Yuheng; Leslie, Christina S

    2016-07-01

    Recent technologies like AGO CLIP sequencing and CLASH enable direct transcriptome-wide identification of AGO binding and miRNA target sites, but the most widely used miRNA target prediction algorithms do not exploit these data. Here we use discriminative learning on AGO CLIP and CLASH interactions to train a novel miRNA target prediction model. Our method combines two SVM classifiers, one to predict miRNA-mRNA duplexes and a second to learn a binding model of AGO's local UTR sequence preferences and positional bias in 3'UTR isoforms. The duplex SVM model enables the prediction of non-canonical target sites and more accurately resolves miRNA interactions from AGO CLIP data than previous methods. The binding model is trained using a multi-task strategy to learn context-specific and common AGO sequence preferences. The duplex and common AGO binding models together outperform existing miRNA target prediction algorithms on held-out binding data. Open source code is available at https://bitbucket.org/leslielab/chimiric. PMID:27438777

  19. Learning to Predict miRNA-mRNA Interactions from AGO CLIP Sequencing and CLASH Data

    PubMed Central

    Lu, Yuheng; Leslie, Christina S.

    2016-01-01

    Recent technologies like AGO CLIP sequencing and CLASH enable direct transcriptome-wide identification of AGO binding and miRNA target sites, but the most widely used miRNA target prediction algorithms do not exploit these data. Here we use discriminative learning on AGO CLIP and CLASH interactions to train a novel miRNA target prediction model. Our method combines two SVM classifiers, one to predict miRNA-mRNA duplexes and a second to learn a binding model of AGO’s local UTR sequence preferences and positional bias in 3’UTR isoforms. The duplex SVM model enables the prediction of non-canonical target sites and more accurately resolves miRNA interactions from AGO CLIP data than previous methods. The binding model is trained using a multi-task strategy to learn context-specific and common AGO sequence preferences. The duplex and common AGO binding models together outperform existing miRNA target prediction algorithms on held-out binding data. Open source code is available at https://bitbucket.org/leslielab/chimiric. PMID:27438777

  20. Structurally complex and highly active RNA ligases derived from random RNA sequences

    NASA Technical Reports Server (NTRS)

    Ekland, E. H.; Szostak, J. W.; Bartel, D. P.

    1995-01-01

    Seven families of RNA ligases, previously isolated from random RNA sequences, fall into three classes on the basis of secondary structure and regiospecificity of ligation. Two of the three classes of ribozymes have been engineered to act as true enzymes, catalyzing the multiple-turnover transformation of substrates into products. The most complex of these ribozymes has a minimal catalytic domain of 93 nucleotides. An optimized version of this ribozyme has a kcat exceeding one per second, a value far greater than that of most natural RNA catalysts and approaching that of comparable protein enzymes. The fact that such a large and complex ligase emerged from a very limited sampling of sequence space implies the existence of a large number of distinct RNA structures of equivalent complexity and activity.

  1. Integrated microRNA-mRNA analyses reveal OPLL specific microRNA regulatory network using high-throughput sequencing.

    PubMed

    Xu, Chen; Chen, Yu; Zhang, Hao; Chen, Yuanyuan; Shen, Xiaolong; Shi, Changgui; Liu, Yang; Yuan, Wen

    2016-01-01

    Ossification of the posterior longitudinal ligament (OPLL) is a genetic disorder which involves pathological heterotopic ossification of the spinal ligaments. Although studies have identified several genes that correlated with OPLL, the underlying regulation network is far from clear. Through small RNA sequencing, we compared the microRNA expressions of primary posterior longitudinal ligament cells from OPLL patients with normal patients (PLL) and identified 218 dysregulated miRNAs (FDR < 0.01). Furthermore, assessing the miRNA profiling data of multiple cell types, we found these dysregulated miRNAs were mostly OPLL specific. In order to decipher the regulation network of these OPLL specific miRNAs, we integrated mRNA expression profiling data with miRNA sequencing data. Through computational approaches, we showed the pivotal roles of these OPLL specific miRNAs in heterotopic ossification of longitudinal ligament by discovering highly correlated miRNA/mRNA pairs that associated with skeletal system development, collagen fibril organization, and extracellular matrix organization. These results provide strong evidence that the miRNA regulatory networks we established may indeed play vital roles in OPLL onset and progression. To date, this is the first systematic analysis of the micronome in OPLL, and thus may provide valuable resources in finding novel treatment and diagnostic targets of OPLL. PMID:26868491

  2. Using Small RNA Deep Sequencing Data to Detect Human Viruses

    PubMed Central

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F.; Fei, ZhangJun; Zhu, Xiao

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans. PMID:27066498

  3. Using Small RNA Deep Sequencing Data to Detect Human Viruses.

    PubMed

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F; Fei, ZhangJun; Zhu, Xiao; Gao, Shan

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans. PMID:27066498

  4. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments

    PubMed Central

    2011-01-01

    Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations. PMID:21356093

  5. HLA typing from RNA-Seq sequence reads.

    PubMed

    Boegel, Sebastian; Löwer, Martin; Schäfer, Michael; Bukur, Thomas; de Graaf, Jos; Boisguérin, Valesca; Türeci, Ozlem; Diken, Mustafa; Castle, John C; Sahin, Ugur

    2012-01-01

    We present a method, seq2HLA, for obtaining an individual's human leukocyte antigen (HLA) class I and II type and expression using standard next generation sequencing RNA-Seq data. RNA-Seq reads are mapped against a reference database of HLA alleles, and HLA type, confidence score and locus-specific expression level are determined. We successfully applied seq2HLA to 50 individuals included in the HapMap project, yielding 100% specificity and 94% sensitivity at a P-value of 0.1 for two-digit HLA types. We determined HLA type and expression for previously un-typed Illumina Body Map tissues and a cohort of Korean patients with lung cancer. Because the algorithm uses standard RNA-Seq reads and requires no change to laboratory protocols, it can be used for both existing datasets and future studies, thus adding a new dimension for HLA typing and biomarker studies. PMID:23259685

  6. Using small RNA (sRNA) deep sequencing to understand global virus distribution in plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Small RNAs (sRNAs), a class of regulatory RNAs, have been used to serve as the specificity determinants of suppressing gene expression in plants and animals. Next generation sequencing (NGS) uncovered the sRNA landscape in most organisms including their associated microbes. In the current study, w...

  7. Dis3- and exosome subunit-responsive 3' mRNA instability elements.

    PubMed

    Kiss, Daniel L; Hou, Dezhi; Gross, Robert H; Andrulis, Erik D

    2012-07-01

    Eukaryotic RNA turnover is regulated in part by the exosome, a nuclear and cytoplasmic complex of ribonucleases (RNases) and RNA-binding proteins. The major RNase of the complex is thought to be Dis3, a multi-functional 3'-5' exoribonuclease and endoribonuclease. Although it is known that Dis3 and core exosome subunits are recruited to transcriptionally active genes and to messenger RNA (mRNA) substrates, this recruitment is thought to occur indirectly. We sought to discover cis-acting elements that recruit Dis3 or other exosome subunits. Using a bioinformatic tool called RNA SCOPE to screen the 3' untranslated regions of up-regulated transcripts from our published Dis3 depletion-derived transcriptomic data set, we identified several motifs as candidate instability elements. Secondary screening using a luciferase reporter system revealed that one cassette, harboring four elements, destabilized the reporter transcript. RNAi-based depletion of Dis3, Rrp6, Rrp4, Rrp40, or Rrp46 diminished the efficacy of cassette-mediated destabilization. Truncation analysis of the cassette showed that two exosome subunit-sensitive elements (ESSEs) destabilized the reporter. Point-directed mutagenesis of ESSE abrogated the destabilization effect. An examination of the transcriptomic data from exosome subunit depletion-based microarrays revealed that mRNAs with ESSEs are found in every up-regulated mRNA data set but are underrepresented or missing from the down-regulated data sets. Taken together, our findings imply a potentially novel mechanism of mRNA turnover that involves direct Dis3 and other exosome subunit recruitment to and/or regulation on mRNA substrates. PMID:22668878

  8. tRNA-Related Sequences Trigger Systemic mRNA Transport in Plants

    PubMed Central

    Zhang, Wenna; Kollwig, Gregor; Apelt, Federico; Walther, Dirk

    2016-01-01

    In plants, protein-coding mRNAs can move via the phloem vasculature to distant tissues, where they may act as non-cell-autonomous signals. Emerging work has identified many phloem-mobile mRNAs, but little is known regarding RNA motifs triggering mobility, the extent of mRNA transport, and the potential of transported mRNAs to be translated into functional proteins after transport. To address these aspects, we produced reporter transcripts harboring tRNA-like structures (TLSs) that were found to be enriched in the phloem stream and in mRNAs moving over chimeric graft junctions. Phenotypic and enzymatic assays on grafted plants indicated that mRNAs harboring a distinctive TLS can move from transgenic roots into wild-type leaves and from transgenic leaves into wild-type flowers or roots; these mRNAs can also be translated into proteins after transport. In addition, we provide evidence that dicistronic mRNA:tRNA transcripts are frequently produced in Arabidopsis thaliana and are enriched in the population of graft-mobile mRNAs. Our results suggest that tRNA-derived sequences with predicted stem-bulge-stem-loop structures are sufficient to mediate mRNA transport and seem to be necessary for the mobility of a large number of endogenous transcripts that can move through graft junctions. PMID:27268430

  9. tRNA-Related Sequences Trigger Systemic mRNA Transport in Plants.

    PubMed

    Zhang, Wenna; Thieme, Christoph J; Kollwig, Gregor; Apelt, Federico; Yang, Lei; Winter, Nikola; Andresen, Nadine; Walther, Dirk; Kragler, Friedrich

    2016-06-01

    In plants, protein-coding mRNAs can move via the phloem vasculature to distant tissues, where they may act as non-cell-autonomous signals. Emerging work has identified many phloem-mobile mRNAs, but little is known regarding RNA motifs triggering mobility, the extent of mRNA transport, and the potential of transported mRNAs to be translated into functional proteins after transport. To address these aspects, we produced reporter transcripts harboring tRNA-like structures (TLSs) that were found to be enriched in the phloem stream and in mRNAs moving over chimeric graft junctions. Phenotypic and enzymatic assays on grafted plants indicated that mRNAs harboring a distinctive TLS can move from transgenic roots into wild-type leaves and from transgenic leaves into wild-type flowers or roots; these mRNAs can also be translated into proteins after transport. In addition, we provide evidence that dicistronic mRNA:tRNA transcripts are frequently produced in Arabidopsis thaliana and are enriched in the population of graft-mobile mRNAs. Our results suggest that tRNA-derived sequences with predicted stem-bulge-stem-loop structures are sufficient to mediate mRNA transport and seem to be necessary for the mobility of a large number of endogenous transcripts that can move through graft junctions. PMID:27268430

  10. Divergent RNA editing frequencies in hornwort mitochondrial nad5 sequences.

    PubMed

    Duff, R Joel

    2006-02-01

    Hornwort mitochondrial genomes have some of the highest rates of RNA editing among plants. Comparison of eleven partial mitochondrial nad5 genomic and cDNA sequences from diverse taxa of hornworts reveals 125 edited sites in only 1107 nt. No single sample exhibits more than half of these sites. Ten of the 11 hornwort taxa have between 35 and 54 edited sites each, whereas the eleventh taxon, Leiosporoceros, which represents a potential sister taxon to all other hornworts, has only eight sites. Comparison of multiple cDNA sequences from several individuals reveals the presence of many immature transcripts showing the heterogeneous nature of the progression of editing. Phylogenetic analyses of hornwort genomic and cDNA sequences reveal that 65 of the 94 phylogenetically informative sites within the hornwort clade are edited positions. PMID:16376027

  11. SimFuse: A Novel Fusion Simulator for RNA Sequencing (RNA-Seq) Data.

    PubMed

    Tan, Yuxiang; Tambouret, Yann; Monti, Stefano

    2015-01-01

    The performance evaluation of fusion detection algorithms from high-throughput sequencing data crucially relies on the availability of data with known positive and negative cases of gene rearrangements. The use of simulated data circumvents some shortcomings of real data by generation of an unlimited number of true and false positive events, and the consequent robust estimation of accuracy measures, such as precision and recall. Although a few simulated fusion datasets from RNA Sequencing (RNA-Seq) are available, they are of limited sample size. This makes it difficult to systematically evaluate the performance of RNA-Seq based fusion-detection algorithms. Here, we present SimFuse to address this problem. SimFuse utilizes real sequencing data as the fusions' background to closely approximate the distribution of reads from a real sequencing library and uses a reference genome as the template from which to simulate fusions' supporting reads. To assess the supporting read-specific performance, SimFuse generates multiple datasets with various numbers of fusion supporting reads. Compared to an extant simulated dataset, SimFuse gives users control over the supporting read features and the sample size of the simulated library, based on which the performance metrics needed for the validation and comparison of alternative fusion-detection algorithms can be rigorously estimated. PMID:26839886
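
    Precision and recall, the accuracy measures named above, can be computed directly once the simulated (true) fusions are known. The sketch below is a generic illustration, not part of SimFuse: the function name and the placeholder gene pairs are hypothetical, and a fusion is assumed to be identified by an ordered pair of gene names.

```python
def precision_recall(called, truth):
    """Compare fusion calls against the known simulated fusions.

    precision = true positives / all calls made
    recall    = true positives / all fusions that were simulated
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Placeholder gene pairs, for illustration only.
simulated = {("A", "B"), ("C", "D"), ("G", "H"), ("I", "J")}
detected = {("A", "B"), ("C", "D"), ("E", "F")}

precision, recall = precision_recall(detected, simulated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    Repeating this calculation on datasets that differ only in the number of supporting reads would show how each measure changes with read support.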

  12. Sequence and expression of ferredoxin mRNA in barley

    SciTech Connect

    Zielinski, R.; Funder, P.M.; Ling, V.

    1990-05-01

    We have isolated and structurally characterized a full-length cDNA clone encoding ferredoxin from a λgt10 cDNA library prepared from barley leaf mRNA. The ferredoxin clone (pBFD-1) was fused head-to-head with a partial-length cDNA clone encoding calmodulin, and was fortuitously isolated by screening the library with a calmodulin-specific oligonucleotide probe. The mRNA sequence from which pBFD-1 was derived is expressed exclusively in the leaf tissues of 7-d old barley seedlings. Barley pre-ferredoxin has a predicted size of 15.3 kDal, of which 4.6 kDal are accounted for by the transit peptide. The polypeptide encoded by pBFD-1 is identical to wheat ferredoxin, and shares slightly more amino acid sequence similarity with spinach ferredoxin I than with ferredoxin II. Ferredoxin mRNA levels are rapidly increased 10-fold by white light in etiolated barley leaves.

  13. Chaining sequence/structure seeds for computing RNA similarity.

    PubMed

    Bourgeade, Laetitia; Chauve, Cédric; Allali, Julien

    2015-03-01

    We describe a new method to compare a query RNA with a static set of target RNAs. Our method is based on (i) a static indexing of the sequence/structure seeds of the target RNAs; (ii) searching the target RNAs by detecting seeds of the query present in the target, chaining these seeds in promising candidate homologs; and then (iii) completing the alignment using an anchor-based exact alignment algorithm. We apply our method on the benchmark Bralibase2.1 and compare its accuracy and efficiency with the exact method LocARNA and its recent seeds-based speed-up ExpLoc-P. Our pipeline RNA-unchained greatly improves computation time of LocARNA and is comparable to the one of ExpLoc-P, while improving the overall accuracy of the final alignments. PMID:25768236
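
    The chaining step in (ii) can be illustrated with a textbook colinear-chaining dynamic program. This is a generic sketch and not the RNA-unchained algorithm: it ignores structure entirely, the function name is hypothetical, and it assumes each seed is a tuple (query_start, target_start, length) scored by its length.

```python
def best_chain(seeds):
    """Return the highest-scoring chain of non-overlapping, colinear seeds.

    Each seed is (query_start, target_start, length). Seed i may precede
    seed j only if it ends before j starts in both the query and the target.
    """
    seeds = sorted(seeds)
    if not seeds:
        return []
    score = [length for _, _, length in seeds]  # best chain score ending at each seed
    prev = [-1] * len(seeds)                    # back-pointers for traceback
    for j, (qj, tj, lj) in enumerate(seeds):
        for i in range(j):
            qi, ti, li = seeds[i]
            if qi + li <= qj and ti + li <= tj and score[i] + lj > score[j]:
                score[j] = score[i] + lj
                prev[j] = i
    # Trace back from the best-scoring end seed.
    j = max(range(len(seeds)), key=score.__getitem__)
    chain = []
    while j != -1:
        chain.append(seeds[j])
        j = prev[j]
    return chain[::-1]


# Toy seeds: (4, 10, 2) cannot be followed by (10, 10, 3) because they overlap in the target.
print(best_chain([(0, 0, 3), (4, 10, 2), (5, 5, 4), (10, 10, 3)]))
```

    The quadratic loop keeps the sketch short; it is meant only to show what "chaining seeds into candidate homologs" involves before an exact alignment is completed between the anchors.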

  14. [Nucleotide sequence determination of yeast mitochondrial phenylalanine-tRNA].

    PubMed

    Martin, R; Sibler, A P; Schneller, J M; Keith, G; Stahl, A J; Dirheimer, G

    1978-10-01

    The primary structure of mitochondrial tRNAPhe from Saccharomyces cerevisiae, purified by two-dimensional polyacrylamide gel electrophoresis, was determined using standard procedures on in vivo 32P-labeled tRNA, as well as the new 5'-end postlabeling techniques. We propose a cloverleaf model which allows for tertiary interaction between cytosine in position 46 and guanine in position 15 and maximizes base pairing in the psi C stem, thus excluding the uracil in position 50 from base pairing in the psi C stem. Comparison of the primary structure of this tRNA with all other known procaryotic, chloroplastic or cytoplasmic tRNAsPhe sequences does not lead to any conclusion about the endosymbiotic theory of mitochondria evolution. PMID:103657

  15. DNA slip-outs cause RNA polymerase II arrest in vitro: potential implications for genetic instability

    PubMed Central

    Salinas-Rios, Viviana; Belotserkovskii, Boris P.; Hanawalt, Philip C.

    2011-01-01

    The abnormal number of repeats found in triplet repeat diseases arises from ‘repeat instability’, in which the repetitive section of DNA is subject to a change in copy number. Recent studies implicate transcription in a mechanism for repeat instability proposed to involve RNA polymerase II (RNAPII) arrest caused by a CTG slip-out, triggering transcription-coupled repair (TCR), futile cycles of which may lead to repeat expansion or contraction. In the present study, we use defined DNA constructs to directly test whether the structures formed by CAG and CTG repeat slip-outs can cause transcription arrest in vitro. We found that a slip-out of (CAG)20 or (CTG)20 repeats on either strand causes RNAPII arrest in HeLa cell nuclear extracts. Perfect hairpins and loops on either strand also cause RNAPII arrest. These findings are consistent with a transcription-induced repeat instability model in which transcription arrest in mammalian cells may initiate a ‘gratuitous’ TCR event leading to a change in repeat copy number. An understanding of the underlying mechanism of repeat instability could lead to intervention to slow down expansion and delay the onset of many neurodegenerative diseases in which triplet repeat expansion is implicated. PMID:21666257

  16. Assessing long-distance RNA sequence connectivity via RNA-templated DNA–DNA ligation

    PubMed Central

    Roy, Christian K; Olson, Sara; Graveley, Brenton R; Zamore, Phillip D; Moore, Melissa J

    2015-01-01

    Many RNAs, including pre-mRNAs and long non-coding RNAs, can be thousands of nucleotides long and undergo complex post-transcriptional processing. Multiple sites of alternative splicing within a single gene exponentially increase the number of possible spliced isoforms, with most human genes currently estimated to express at least ten. To understand the mechanisms underlying these complex isoform expression patterns, methods are needed that faithfully maintain long-range exon connectivity information in individual RNA molecules. In this study, we describe SeqZip, a methodology that uses RNA-templated DNA–DNA ligation to retain and compress connectivity between distant sequences within single RNA molecules. Using this assay, we test proposed coordination between distant sites of alternative exon utilization in mouse Fn1, and we characterize the extraordinary exon diversity of Drosophila melanogaster Dscam1. DOI: http://dx.doi.org/10.7554/eLife.03700.001 PMID:25866926
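
    The exponential growth mentioned above follows from simple counting: if alternative splicing sites are chosen independently, the number of possible isoforms is the product of the number of options at each site. The sketch below is a worked illustration with a hypothetical function name. The Dscam1 cluster sizes (12, 48, 33 and 2 mutually exclusive exons) are the commonly cited figures and are not taken from this abstract; independence between sites is an assumption, and it is exactly what connectivity assays are designed to test.

```python
from math import prod


def count_isoforms(options_per_site):
    """Upper bound on isoforms when every splicing site is chosen independently."""
    return prod(options_per_site)


# Ten independent cassette exons (include or skip) already give 2**10 isoforms.
print(count_isoforms([2] * 10))

# Commonly cited Dscam1 exon clusters (assumed values, not from the abstract).
print(count_isoforms([12, 48, 33, 2]))
```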

  17. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  18. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected controls brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  19. Use of Unamplified RNA/cDNA-Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses.

    PubMed

    Kilianski, Andy; Roth, Pierce A; Liem, Alvin T; Hill, Jessica M; Willis, Kristen L; Rossmaier, Rebecca D; Marinich, Andrew V; Maughan, Michele N; Karavis, Mark A; Kuhn, Jens H; Honko, Anna N; Rosenzweig, C Nicole

    2016-08-01

    Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing of unamplified RNA/cDNA hybrids, we identified Venezuelan equine encephalitis virus and Ebola virus in 3 hours from sample receipt to data acquisition, demonstrating a fieldable technique for RNA virus characterization. PMID:27191483

  20. Use of Unamplified RNA/cDNA–Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses

    PubMed Central

    Kilianski, Andy; Roth, Pierce A.; Liem, Alvin T.; Hill, Jessica M.; Willis, Kristen L.; Rossmaier, Rebecca D.; Marinich, Andrew V.; Maughan, Michele N.; Karavis, Mark A.; Kuhn, Jens H.; Honko, Anna N.

    2016-01-01

    Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing of unamplified RNA/cDNA hybrids, we identified Venezuelan equine encephalitis virus and Ebola virus in 3 hours from sample receipt to data acquisition, demonstrating a fieldable technique for RNA virus characterization. PMID:27191483

  1. PlantMirnaT: miRNA and mRNA integrated analysis fully utilizing characteristics of plant sequencing data.

    PubMed

    Rhee, S; Chae, H; Kim, S

    2015-07-15

    miRNA is known to regulate up to several hundred coding genes, so the integrated analysis of miRNA and mRNA expression data is an important problem. Unfortunately, the integrated analysis is challenging because it must consider expression data of two different types, miRNA and mRNA, and the target relationship between miRNA and mRNA is not clear, especially when microarray data are used. Fortunately, owing to the low cost of sequencing, small RNA and RNA sequencing are routinely performed, and regulatory relationships between miRNAs and mRNAs may be inferred more accurately from sequencing data. However, no method has been developed specifically for sequencing data. We therefore developed PlantMirnaT, a new miRNA-mRNA integrated analysis system. To fully leverage the power of sequencing data, three major features were developed and implemented in PlantMirnaT. First, we implemented a plant-specific short read mapping tool based on recent discoveries on miRNA target relationships in plants. Second, we designed and implemented an algorithm that considers miRNA targets in the full intragenic region, not just the 3' UTR. Last but most importantly, our algorithm is designed to consider the quantity of miRNA expression and its distribution on target mRNAs. The new algorithm was used to characterize rice under drought conditions using our proprietary data. Our algorithm successfully discovered that two miRNAs, miRNA1425-5p and miRNA 398b, are involved in suppression of the glucose pathway in a naturally drought-resistant rice, Vandana. The system can be downloaded at https://sites.google.com/site/biohealthinformaticslab/resources. PMID:25863133

  2. Globin mRNA contains a sequence complementary to double-stranded region of nuclear pre-mRNA.

    PubMed Central

    Ryskov, A P; Tokarskaya, O V; Georgiev, G P; Coutelle, C; Thiele, B

    1976-01-01

    Melted ds RNA isolated from rabbit bone marrow pre-mRNA was hybridized with an excess of globin mRNA prepared from rabbit reticulocytes. 7-9% of the ds sequences became RNAase-stable, and about 30% of the sequences could be bound to poly(U)-Sepharose through the poly(A) of the mRNA. The size of the RNAase-stable hybrid is about 30 nucleotides, that is, one fourth of the length of one strand of the ds RNA. PMID:986644

  3. A method for clustering of miRNA sequences using fragmented programming.

    PubMed

    Ivashchenko, Anatoly; Pyrkova, Anna; Niyazova, Raigul

    2016-01-01

    Clustering of miRNA sequences is an important problem in molecular genetics and cellular biology. Thousands of such sequences are known today through advances in sophisticated molecular tools, sequencing techniques, computational resources and rule-based mathematical models. Analysis of such large-scale miRNA sequence sets to infer patterns and deduce cellular function is a great challenge in modern molecular biology. Therefore, it is of interest to develop mathematical models specific for miRNA sequences. The process is to group (cluster) such miRNA sequences using well-defined known features. We describe a method for clustering of miRNA sequences using fragmented programming. We then illustrate the utility of the model using a dendrogram (a tree diagram) of publicly known A. thaliana miRNA nucleotide sequences to infer observed conserved patterns. PMID:27212839

  4. Improved definition of the mouse transcriptome via targeted RNA sequencing

    PubMed Central

    Clark, Michael B.; Mercer, Tim R.; Crawford, Joanna; Malquori, Lorenzo; Notredame, Cedric; Dinger, Marcel E.; Mattick, John S.

    2016-01-01

    Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene expression, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources. PMID:27197243

  5. Sequence of the 16S ribosomal RNA from Halobacterium volcanii, an archaebacterium

    NASA Technical Reports Server (NTRS)

    Gupta, R.; Lanter, J. M.; Woese, C. R.

    1983-01-01

    The sequence of the 16S ribosomal RNA (rRNA) from the archaebacterium Halobacterium volcanii has been determined by DNA sequencing methods. The archaebacterial rRNA is similar to its eubacterial counterpart in secondary structure. Although it is closer in sequence to the eubacterial 16S rRNA than to the eukaryotic 16S-like rRNA, the H. volcanii sequence also shows certain points of specific similarity to its eukaryotic counterpart. Since the H. volcanii sequence is closer to both the eubacterial and the eukaryotic sequences than these two are to one another, it follows that the archaebacterial sequence resembles their common ancestral sequence more closely than does either of the other two versions.

  6. Comparative RNA sequencing reveals substantial genetic variation in endangered primates

    PubMed Central

    Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav

    2012-01-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615

  7. High-throughput illumina strand-specific RNA sequencing library preparation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Conventional Illumina RNA-Seq does not have the resolution to decode the complex eukaryote transcriptome due to the lack of RNA polarity information. Strand-specific RNA sequencing (ssRNA-Seq) can overcome these limitations and as such is better suited for genome annotation, de novo transcriptome as...

  8. Research Techniques Made Simple: Methodology and Clinical Applications of RNA Sequencing.

    PubMed

    Whitley, Sarah K; Horne, William T; Kolls, Jay K

    2016-08-01

    RNA sequencing is a method of transcriptome profiling that utilizes next-generation sequencing technology. It offers several distinct advantages over hybridization-based approaches, most notably superior sensitivity and the capacity for de novo transcript discovery. This article describes RNA sequencing methodology, summarizes important technological advances and challenges, and discusses applications for this technique in the field of dermatology. PMID:27450500

  9. Modulations of RNA sequences by cytokinin in pumpkin cotyledons

    SciTech Connect

    Chang, C.; Ertl, J.; Chen, C.

    1987-04-01

    Polyadenylated mRNAs from excised pumpkin cotyledons treated with or without 10⁻⁴ M benzyladenine (BA) for various time periods in suspension culture were assayed by in vitro translation in the presence of [³⁵S]methionine. The radioactive polypeptides were analyzed by one- and two-dimensional polyacrylamide gel electrophoresis. Specific sequences of mRNAs were enhanced, reduced, induced, or suppressed by the hormone within 60 min of the application of BA to the cotyledons. Four independent cDNA clones of cytokinin-modulated mRNAs have been selected and characterized. RNA blot hybridization using the four cDNA probes also indicates that the levels of specific mRNAs are modulated upward or downward by the hormone.

  10. The Microglial Sensome Revealed by Direct RNA Sequencing

    PubMed Central

    Hickman, Suzanne E.; Kingery, Nathan D.; Ohsumi, Toshiro; Borowsky, Mark; Wang, Li-chong; Means, Terry K.; Khoury, Joseph El

    2013-01-01

    Microglia, the principal neuroimmune sentinels of the brain, continuously sense changes in their environment and respond to invading pathogens, toxins and cellular debris. Microglia exhibit plasticity and can assume neurotoxic or neuroprotective priming states that determine their responses to danger. We used direct RNA sequencing, without amplification or cDNA synthesis, to determine the quantitative transcriptomes of microglia of healthy adult and aged mice. We validated our findings by fluorescent dual in-situ hybridization, unbiased proteomic analysis and quantitative PCR. We report here that microglia have a distinct transcriptomic signature and express a unique cluster of transcripts encoding proteins for sensing endogenous ligands and microbes that we term the “sensome”. With aging, sensome transcripts for endogenous ligand recognition are downregulated, whereas those involved in microbe recognition and host defense are upregulated. In addition, aging is associated with an overall increase in expression of microglial genes involved in neuroprotection. PMID:24162652

  11. Copying of RNA Sequences without Pre-Activation

    PubMed Central

    Jauker, Mario; Griesser, Helmut; Richert, Clemens

    2015-01-01

    Template-directed incorporation of nucleotides at the terminus of a growing complementary strand is the basis of replication. For RNA, this process can occur in the absence of enzymes, if the ribonucleotides are first converted to an active species with a leaving group. Thus far, the activation required a separate chemical step, complicating prebiotically plausible scenarios. Here we show that a combination of a carbodiimide and an organocatalyst induces near-quantitative incorporation of any of the four ribonucleotides. Upon in situ activation, adenosine monophosphate was found to also form oligomers in aqueous solution. So, both de novo strand formation and sequence-specific copying can occur without an artificial synthetic step. PMID:26435291

  12. FASTR: A novel data format for concomitant representation of RNA sequence and secondary structure information.

    PubMed

    Bose, Tungadri; Dutta, Anirban; Mh, Mohammed; Gandhi, Hemang; Mande, Sharmila S

    2015-09-01

    Given the importance of RNA secondary structures in defining their biological role, it would be convenient for researchers seeking RNA data if both sequence and structural information pertaining to RNA molecules are made available together. Current nucleotide data repositories archive only RNA sequence data. Furthermore, storage formats which can frugally represent RNA sequence as well as structure data in a single file, are currently unavailable. This article proposes a novel storage format, 'FASTR', for concomitant representation of RNA sequence and structure. The storage efficiency of the proposed FASTR format has been evaluated using RNA data from various microorganisms. Results indicate that the size of FASTR formatted files (containing both RNA sequence as well as structure information) are equivalent to that of FASTA-format files, which contain only RNA sequence information. RNA secondary structure is typically represented using a combination of a string of nucleotide characters along with the corresponding dot-bracket notation indicating structural attributes. 'FASTR' - the novel storage format proposed in the present study enables a frugal representation of both RNA sequence and structural information in the form of a single string. In spite of having a relatively smaller storage footprint, the resultant 'fastr' string(s) retain all sequence as well as secondary structural information that could be stored using a dot-bracket notation. An implementation of the 'FASTR' methodology is available for download at http://metagenomics.atc.tcs.com/compression/fastr. PMID:26333403
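
    The dot-bracket notation mentioned in the abstract above can be illustrated with a short sketch. This is not the FASTR encoding itself, whose details are not given in the abstract; it only shows how a sequence string and its dot-bracket string together determine the base pairs. The function name and the example hairpin are hypothetical.

```python
def parse_dot_bracket(sequence, structure):
    """Return sorted (i, j, base_i, base_j) tuples (0-based) for each base pair.

    '(' opens a pair, ')' closes the most recently opened pair,
    and '.' marks an unpaired nucleotide.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    open_positions = []  # stack of indices of unmatched '('
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {j}")
            i = open_positions.pop()
            pairs.append((i, j, sequence[i], sequence[j]))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Hypothetical 10-nt hairpin: a three-pair stem closing a four-base loop.
hairpin_pairs = parse_dot_bracket("GGGAAAUCCC", "(((....)))")
for i, j, a, b in hairpin_pairs:
    print(f"{a}{i + 1} pairs with {b}{j + 1}")
```

    Because the two strings have equal length, any single-string format that keeps both has to encode, per position, the nucleotide and one of three structural states.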

  13. New wheat microRNA using whole-genome sequence.

    PubMed

    Kurtoglu, Kuaybe Yucebilgili; Kantar, Melda; Budak, Hikmet

    2014-06-01

    MicroRNAs are post-transcriptional regulators of gene expression, taking roles in a variety of fundamental biological processes. Hence, their identification, annotation and characterization are of great significance, especially in bread wheat, one of the main food sources for humans. The recent availability of 5× coverage Triticum aestivum L. whole-genome sequence provided us with the opportunity to perform a systematic prediction of a complete catalogue of wheat microRNAs. Using an in silico homology-based approach, stem-loop coding regions were derived from two assemblies, constructed from wheat 454 reads. To avoid the presence of pseudo-microRNAs in the final data set, transposable element related stem-loops were eliminated by repeat analysis. Overall, 52 putative wheat microRNAs were predicted, including seven, which have not been previously published. Moreover, with distinct analysis of the two different assemblies, both variety and representation of putative microRNA-coding stem-loops were found to be predominant in the intergenic regions. By searching available expressed sequences and small RNA library databases, expression evidence for 39 (out of 52) putative wheat microRNAs was provided. Expression of three of the predicted microRNAs (miR166, miR396 and miR528) was also comparatively quantified with real-time quantitative reverse transcription PCR. This is the first report on in silico prediction of a whole repertoire of bread wheat microRNAs, supported by the wet-lab validation. PMID:24395439

  14. RNA Sequencing Identifies Novel Translational Biomarkers of Kidney Fibrosis.

    PubMed

    Craciun, Florin L; Bijol, Vanesa; Ajay, Amrendra K; Rao, Poornima; Kumar, Ramya K; Hutchinson, John; Hofmann, Oliver; Joshi, Nikita; Luyendyk, James P; Kusebauch, Ulrike; Moss, Christopher L; Srivastava, Anand; Himmelfarb, Jonathan; Waikar, Sushrut S; Moritz, Robert L; Vaidya, Vishal S

    2016-06-01

    CKD is the gradual, asymptomatic loss of kidney function, but current tests only identify CKD when significant loss has already happened. Several potential biomarkers of CKD have been reported, but none have been approved for preclinical or clinical use. Using RNA sequencing in a mouse model of folic acid-induced nephropathy, we identified ten genes that track kidney fibrosis development, the common pathologic finding in patients with CKD. The gene expression of all ten candidates was confirmed to be significantly higher (approximately ten- to 150-fold) in three well established, mechanistically distinct mouse models of kidney fibrosis than in models of nonfibrotic AKI. Protein expression of these genes was also high in the folic acid model and in patients with biopsy-proven kidney fibrosis. mRNA expression of the ten genes increased with increasing severity of kidney fibrosis, decreased in response to therapeutic intervention, and increased only modestly (approximately two- to five-fold) with liver fibrosis in mice and humans, demonstrating specificity for kidney fibrosis. Using targeted selected reaction monitoring mass spectrometry, we detected three of the ten candidates in human urine: cadherin 11 (CDH11), macrophage mannose receptor C1 (MRC1), and phospholipid transfer protein (PLTP). Furthermore, urinary levels of each of these three proteins distinguished patients with CKD (n=53) from healthy individuals (n=53; P<0.05). In summary, we report the identification of urinary CDH11, MRC1, and PLTP as novel noninvasive biomarkers of CKD. PMID:26449608

  15. Targeted RNA Sequencing Assay to Characterize Gene Expression and Genomic Alterations.

    PubMed

    Martin, Dorrelyn P; Miya, Jharna; Reeser, Julie W; Roychowdhury, Sameek

    2016-01-01

    RNA sequencing (RNAseq) is a versatile method that can be utilized to detect and characterize gene expression, mutations, gene fusions, and noncoding RNAs. Standard RNAseq requires 30-100 million sequencing reads and can include multiple RNA products such as mRNA and noncoding RNAs. We demonstrate how targeted RNAseq (capture) permits a focused study on selected RNA products using a desktop sequencer. RNAseq capture can characterize unannotated, low, or transiently expressed transcripts that may otherwise be missed using traditional RNAseq methods. Here we describe the extraction of RNA from cell lines, ribosomal RNA depletion, cDNA synthesis, preparation of barcoded libraries, hybridization and capture of targeted transcripts and multiplex sequencing on a desktop sequencer. We also outline the computational analysis pipeline, which includes quality control assessment, alignment, fusion detection, gene expression quantification and identification of single nucleotide variants. This assay allows for targeted transcript sequencing to characterize gene expression, gene fusions, and mutations. PMID:27585245

  16. Complete sequence and gene organization of the Nosema spodopterae rRNA gene.

    PubMed

    Tsai, Shu-Jen; Huang, Wei-Fone; Wang, Chung-Hsiung

    2005-01-01

    By sequencing the entire ribosomal RNA (rRNA) gene of Nosema spodopterae, we show here that its gene organization follows a pattern similar to the Nosema type species, Nosema bombycis, i.e. 5'-large subunit rRNA (2,497 bp)-internal transcribed spacer (185 bp)-small subunit rRNA (1,232 bp)-intergenic spacer (277 bp)-5S rRNA (114 bp)-3'. Gene sequences and the secondary structures of large subunit rRNA, small subunit rRNA, and 5S rRNA are compared with the known corresponding sequences and structures of closely related microsporidia. The results suggest that the Nosema genus may be heterogeneous and that the rRNA gene organization may be a useful characteristic for determining which species are closely related to the type species. PMID:15702980
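
    The segment lengths quoted above can be laid end to end to obtain coordinates for each element. The sketch below assumes, as an illustration only, that the five segments are contiguous with no additional flanking bases; the function name is hypothetical and the total is simple arithmetic on the quoted lengths, not a figure from the abstract.

```python
# Segment order and lengths (bp) as listed 5' to 3' in the abstract.
SEGMENTS = [
    ("large subunit rRNA", 2497),
    ("internal transcribed spacer", 185),
    ("small subunit rRNA", 1232),
    ("intergenic spacer", 277),
    ("5S rRNA", 114),
]


def layout(segments):
    """Return ([(name, start, end)], total) using 1-based inclusive coordinates."""
    coords = []
    position = 1
    for name, length in segments:
        coords.append((name, position, position + length - 1))
        position += length
    return coords, position - 1


coords, total = layout(SEGMENTS)
for name, start, end in coords:
    print(f"{name:30s} {start:5d}-{end:5d}")
print("total length (bp):", total)
```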

  17. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    PubMed

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
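
    To make the idea of a variant copy ratio concrete, the following is a minimal maximum-likelihood sketch. It is not the authors' method, which the abstract does not describe in detail. It assumes a simple binomial read model at one variable position, a uniform per-base error rate, and a known gene copy number; the function name, the error rate and the example counts are all hypothetical.

```python
import math


def ml_variant_copies(variant_reads, total_reads, n_copies, error_rate=0.01):
    """Return the number of gene copies (0..n_copies) most likely to carry the variant.

    Assumed model: each read samples a gene copy uniformly at random, and a
    base is misread as one specific other base with probability error_rate / 3.
    """
    if not 0 < error_rate < 1:
        raise ValueError("error_rate must be strictly between 0 and 1")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must be between 0 and total_reads")
    best_k, best_loglik = 0, -math.inf
    for k in range(n_copies + 1):
        fraction = k / n_copies
        # Probability that a read shows the variant base under the assumed model.
        p = fraction * (1 - error_rate) + (1 - fraction) * error_rate / 3
        loglik = (variant_reads * math.log(p)
                  + (total_reads - variant_reads) * math.log(1 - p))
        if loglik > best_loglik:
            best_k, best_loglik = k, loglik
    return best_k


# Hypothetical counts: 140 of 980 reads show the variant; 7 copies assumed.
k = ml_variant_copies(140, 980, n_copies=7)
print(f"most likely variant copy ratio: {k}:{7 - k}")
```

    Under this model the estimate is constrained to whole numbers of copies, which is one reason shallow or error-prone data can give estimates that do not reproduce between laboratories.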

  18. Sequence analysis of a cluster of twenty-one tRNA genes in Bacillus subtilis.

    PubMed Central

    Green, C J; Vold, B S

    1983-01-01

    The DNA sequence of a cluster of twenty-one tRNA genes distal to a rRNA gene set in B. subtilis was determined. None of the tRNA genes are repeated in the sequence. The only classes of tRNAs that are not represented are those for cysteine, glutamine, tryptophan, and tyrosine. Three of the tRNA genes in this cluster do not have the 3'-CCA sequence encoded in the gene. There is no RNA polymerase terminator sequence in the region between the 5S gene and the first tRNA gene or within the tRNA gene cluster. A terminator sequence was found directly after the last tRNA gene. This rRNA and tRNA gene cluster probably represents one transcriptional unit. However, there may be an RNA polymerase promoter site within this sequence, which raises some interesting questions concerning the regulation of transcription for these tRNA genes. PMID:6310512

  19. Sequences far downstream from the classical tRNA promoter elements bind RNA polymerase III transcription factors.

    PubMed Central

    Young, L S; Rivier, D H; Sprague, K U

    1991-01-01

    We have examined the interaction of transcription factors TFIIIC and TFIIID with a silkworm alanine tRNA gene. Previous functional analysis showed that the promoter for this gene is unusually large compared with the classical tRNA promoter elements (the A and B boxes) and includes sequences downstream from the transcription termination site. The goal of the experiments reported here was to determine which sequences within the full promoter make stable contacts with transcription factors. We show that when TFIIIC and TFIIID are combined, a complex is formed with the tRNA(Ala)C gene. Neither factor alone can form this complex. DNase I digestion of gene-factor complexes reveals that most of the tRNA(Ala)C promoter is in contact with factors. The protected region extends from -1 to at least +136 and includes both the A and B boxes and the previously identified downstream promoter sequences. Analysis of mutant promoters shows that sequence-specific contacts throughout the protected region are required for binding. The role of 3'-flanking sequences in transcription factor binding explains the contribution of these sequences to the tRNA(Ala)C promoter. We discuss the possibility that such sequences affect promoter strength in other tRNA genes. PMID:1996100

  20. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability

    PubMed Central

    Hamperl, Stephan; Cimprich, Karlene A.

    2014-01-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. PMID:24746923

  1. Optimal terminal sequences for continuous or serial isothermal amplification of dsRNA with norovirus RNA replicase.

    PubMed

    Arai, Hidenao; Nishigaki, Koichi; Nemoto, Naoto; Suzuki, Miho; Husimi, Yuzuru

    2014-01-01

    The norovirus RNA replicase (NV3Dpol, 56 kDa, single chain monomeric protein) can amplify double-stranded (ds) RNA isothermally. For in vitro evolution it offers an alternative to the traditional Qβ RNA replicase, which cannot amplify dsRNA and consists of four subunits, three of which are borrowed from the host E. coli. In order to identify the optimal 3'-terminal sequence of the RNA template for NV3Dpol, an in vitro selection using serial transfer was performed on a random library having the 3'-terminal sequence ---UUUUUUNNNN-3'. The population landscape on the 4-dimensional sequence space at the 17th round of transfer gave a main peak around ---CAAC-3'. In preceding studies of the batch amplification reaction starting from a single-stranded RNA, a template with a 3'-terminal C-stretch was amplified effectively. It was confirmed that in batch amplification ---CCC-3' was much more effective than ---CAAC-3', but under the serial transfer condition, in which ---CAAC-3' was sustained stably, ---CCC-3' was washed out. Based on these results we proposed the existence of a "shuttle mode" of dsRNA replication. We also proposed the optimal terminal sequences of RNA for in vitro evolution with NV3Dpol. PMID:27493494

  2. Optimal terminal sequences for continuous or serial isothermal amplification of dsRNA with norovirus RNA replicase

    PubMed Central

    Arai, Hidenao; Nishigaki, Koichi; Nemoto, Naoto; Suzuki, Miho; Husimi, Yuzuru

    2014-01-01

    The norovirus RNA replicase (NV3Dpol, 56 kDa, single chain monomeric protein) can amplify double-stranded (ds) RNA isothermally. For in vitro evolution it offers an alternative to the traditional Qβ RNA replicase, which cannot amplify dsRNA and consists of four subunits, three of which are borrowed from the host E. coli. In order to identify the optimal 3′-terminal sequence of the RNA template for NV3Dpol, an in vitro selection using serial transfer was performed on a random library having the 3′-terminal sequence ---UUUUUUNNNN-3′. The population landscape on the 4-dimensional sequence space at the 17th round of transfer gave a main peak around ---CAAC-3′. In preceding studies of the batch amplification reaction starting from a single-stranded RNA, a template with a 3′-terminal C-stretch was amplified effectively. It was confirmed that in batch amplification ---CCC-3′ was much more effective than ---CAAC-3′, but under the serial transfer condition, in which ---CAAC-3′ was sustained stably, ---CCC-3′ was washed out. Based on these results we proposed the existence of a “shuttle mode” of dsRNA replication. We also proposed the optimal terminal sequences of RNA for in vitro evolution with NV3Dpol. PMID:27493494
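
    The "4-dimensional sequence space" of the NNNN library can be enumerated directly. The sketch below lists every possible 3′-terminal tetramer and the single-substitution neighbours of the reported main peak, CAAC. It illustrates only the size and geometry of the space; it does not model the selection experiment, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGU"


def all_terminal_variants(length=4):
    """Enumerate every RNA sequence of the given length (4**length variants)."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]


def one_step_neighbours(sequence):
    """Return all sequences that differ from `sequence` at exactly one position."""
    neighbours = []
    for i, original in enumerate(sequence):
        for base in BASES:
            if base != original:
                neighbours.append(sequence[:i] + base + sequence[i + 1:])
    return neighbours


variants = all_terminal_variants()
print("library size:", len(variants))
print("neighbours of CAAC:", len(one_step_neighbours("CAAC")))
```

    With four randomized positions the library holds 256 variants, small enough for a population landscape to be tabulated over the whole space.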

  3. Equally parsimonious pathways through an RNA sequence space are not equally likely

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; DSouza, L. M.; Fox, G. E.

    1997-01-01

    An experimental system for determining the potential ability of sequences resembling 5S ribosomal RNA (rRNA) to perform as functional 5S rRNAs in vivo in the Escherichia coli cellular environment was devised previously. Presumably, the only 5S rRNA sequences that would have been fixed by ancestral populations are ones that were functionally valid, and hence the actual historical paths taken through RNA sequence space during 5S rRNA evolution would have most likely utilized valid sequences. Herein, we examine the potential validity of all sequence intermediates along alternative equally parsimonious trajectories through RNA sequence space which connect two pairs of sequences that had previously been shown to behave as valid 5S rRNAs in E. coli. The first trajectory requires a total of four changes. The 14 sequence intermediates provide 24 apparently equally parsimonious paths by which the transition could occur. The second trajectory involves three changes, six intermediate sequences, and six potentially equally parsimonious paths. In total, only eight of the 20 sequence intermediates were found to be clearly invalid. As a consequence of the position of these invalid intermediates in the sequence space, seven of the 30 possible paths consisted of exclusively valid sequences. In several cases, the apparent validity/invalidity of the intermediate sequences could not be anticipated on the basis of current knowledge of the 5S rRNA structure. This suggests that the interdependencies in RNA sequence space may be more complex than currently appreciated. If ancestral sequences predicted by parsimony are to be regarded as actual historical sequences, then the present results would suggest that they should also satisfy a validity requirement and that, in at least limited cases, this conjecture can be tested experimentally.
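
    The path and intermediate counts quoted above follow from simple combinatorics: two sequences that differ at d positions are connected by d! orderings of the changes, passing through 2^d - 2 distinct intermediates. The sketch below enumerates them, assuming each differing position changes directly to its final base in a single step. The placeholder sequences and the invalid set in the last lines are hypothetical, not the 5S rRNA variants from the study.

```python
from itertools import permutations


def parsimonious_paths(start, end):
    """Return every shortest path from start to end as a list of intermediates.

    Each path applies the required substitutions in a different order;
    the start and end sequences themselves are not included in the path.
    """
    if len(start) != len(end):
        raise ValueError("sequences must have the same length")
    differing = [i for i, (a, b) in enumerate(zip(start, end)) if a != b]
    paths = []
    for order in permutations(differing):
        current = list(start)
        intermediates = []
        for position in order[:-1]:  # the final change produces `end` itself
            current[position] = end[position]
            intermediates.append("".join(current))
        paths.append(intermediates)
    return paths


def count_valid_paths(paths, invalid):
    """Count paths that avoid every sequence in the `invalid` set."""
    return sum(all(seq not in invalid for seq in path) for path in paths)


for start, end in [("AAAA", "CCCC"), ("AAA", "CCC")]:  # placeholder sequences
    paths = parsimonious_paths(start, end)
    intermediates = {seq for path in paths for seq in path}
    print(len(end), "changes:", len(intermediates), "intermediates,", len(paths), "paths")

# One invalid intermediate removes every path that passes through it.
print(count_valid_paths(parsimonious_paths("AAA", "CCC"), {"CAA"}))
```

    This reproduces the 14 intermediates and 24 paths for four changes and the six intermediates and six paths for three changes, giving 20 intermediates and 30 paths in total.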

  4. High-throughput-sequencing-based identification of a grapevine fanleaf virus satellite RNA in Vitis vinifera.

    PubMed

    Chiumenti, Michela; Mohorianu, Irina; Roseti, Vincenzo; Saldarelli, Pasquale; Dalmay, Tamas; Minafra, Angelantonio

    2016-05-01

    A new satellite RNA (satRNA) of grapevine fanleaf virus (GFLV) was identified by high-throughput sequencing of high-definition (HD) adapter libraries from grapevine plants of the cultivar Panse precoce (PPE) affected by enation disease. The complete nucleotide sequence was obtained by automatic sequencing using primers designed based on next-generation sequencing (NGS) data. The full-length sequence, named satGFLV-PPE, consisted of 1119 nucleotides with a single open reading frame from position 15 to 1034. This satRNA showed maximum nucleotide sequence identity of 87 % to satArMV-86 and satGFLV-R6. Symptomatic grapevines were surveyed for the presence of the satRNA, and no correlation was found between detection of the satRNA and enation symptom expression. PMID:26873812
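
    The coordinates quoted above imply the sizes of the open reading frame and its flanking regions. The sketch below works them out under two labeled assumptions: that positions are 1-based and inclusive, and that the quoted ORF span includes the stop codon. Neither is stated in the abstract, and the function name is hypothetical.

```python
def orf_summary(genome_length, orf_start, orf_end):
    """Summarize an ORF given 1-based inclusive coordinates (assumed convention)."""
    orf_length = orf_end - orf_start + 1
    if orf_length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_length // 3
    return {
        "five_prime_flank_nt": orf_start - 1,
        "orf_length_nt": orf_length,
        "codons": codons,
        # Assumes the last codon of the quoted span is the stop codon.
        "amino_acids_if_stop_included": codons - 1,
        "three_prime_flank_nt": genome_length - orf_end,
    }


summary = orf_summary(genome_length=1119, orf_start=15, orf_end=1034)
for key, value in summary.items():
    print(f"{key}: {value}")
```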

  5. Nucleotide sequence of 5S ribosomal RNA from Aspergillus nidulans and Neurospora crassa.

    PubMed Central

    Piechulla, B; Hahn, U; McLaughlin, L W; Küntzel, H

    1981-01-01

    The nucleotide sequences of 5S rRNA molecules isolated from the cytosol and the mitochondria of the ascomycetes A. nidulans and N. crassa were determined by partial chemical cleavage of 3'-terminally labelled RNA. The sequence identity of the cytosolic and mitochondrial RNA preparations confirms the absence of mitochondrion-specific 5S rRNA in these fungi. The sequences of the two organisms differ in 35 positions, and each sequence differs from yeast 5S rRNA in 44 positions. Both molecules contain the sequence GCUC in place of GAAC or GAUY found in all other 5S rRNAs, indicating that this region is not universally involved in base-pairing to the invariant GTΨC sequence of tRNAs. PMID:6453331

  6. Detection of mRNA sequences in nuclear 30S ribonucleoprotein subcomplexes.

    PubMed Central

    Kinniburgh, A J; Martin, T E

    1976-01-01

    RNA from nuclear 30S ribonucleoprotein (RNP) complexes of mouse ascites cells has been shown to contain sequences homologous to poly(A)+ mRNA by its ability to hybridize with complementary DNA prepared from poly(A)+ mRNA template. Analysis of the hybridization kinetics of poly(A)+ mRNA with its own complementary DNA revealed several abundancy classes. The total complexity of poly(A)+ mRNA from ascites cells was estimated to be approximately 30,000 sequences of average molecular weight (6 × 10^5). When the hybridization reaction of 30S RNP-RNA with mRNA-specific cDNA was compared to the homologous reaction, the majority, and most probably all, of the poly(A)+ mRNA sequences were found to be present in the RNA. The kinetics of hybridization suggest that 10-15% of the RNA in this RNP complex is homologous to poly(A)+ mRNA. The 30S RNP subcomplexes therefore contain nuclear poly(A)+ mRNA sequences as well as the bulk of heterogeneous RNA. PMID:1066686

  7. 5S rRNA sequences from four marine invertebrates and implications for base pairing models of metazoan sequences.

    PubMed

    Walker, W F; Doolittle, W F

    1983-08-11

    The nucleotide sequences of 5S rRNAs from the starfish Asterias vulgaris, the squid Illex illecebrosus, the sipunculid Phascolopsis gouldii and the jellyfish Aurelia aurita were determined. The sequence from Asterias lends support for one of two previous base pairing models for helix E in metazoan sequences. The Aurelia sequence differs by five nucleotides from that previously reported and does not violate the consensus secondary structure model for eukaryotic 5S rRNA. PMID:6136024

  8. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    PubMed

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA. PMID:27117661

  9. Distinct tmRNA sequence elements facilitate RNase R engagement on rescued ribosomes for selective nonstop mRNA decay

    PubMed Central

    Venkataraman, Krithika; Zafar, Hina; Karzai, A. Wali

    2014-01-01

    trans-Translation, orchestrated by SmpB and tmRNA, is the principal eubacterial pathway for resolving stalled translation complexes. RNase R, the leading nonstop mRNA surveillance factor, is recruited to stalled ribosomes in a trans-translation dependent process. To elucidate the contributions of SmpB and tmRNA to RNase R recruitment, we evaluated Escherichia coli–Francisella tularensis chimeric variants of tmRNA and SmpB. This evaluation showed that while the hybrid tmRNA supported nascent polypeptide tagging and ribosome rescue, it suffered defects in facilitating RNase R recruitment to stalled ribosomes. To gain further insights, we used established tmRNA and SmpB variants that impact distinct stages of the trans-translation process. Analysis of select tmRNA variants revealed that the sequence composition and positioning of the ultimate and penultimate codons of the tmRNA ORF play a crucial role in recruiting RNase R to rescued ribosomes. Evaluation of defined SmpB C-terminal tail variants highlighted the importance of establishing the tmRNA reading frame, and provided valuable clues into the timing of RNase R recruitment to rescued ribosomes. Taken together, these studies demonstrate that productive RNase R-ribosome engagement requires active trans-translation, and suggest that RNase R captures the emerging nonstop mRNA at an early stage after establishment of the tmRNA ORF as the surrogate mRNA template. PMID:25200086

  10. Characterising the Canine Oral Microbiome by Direct Sequencing of Reverse-Transcribed rRNA Molecules.

    PubMed

    McDonald, James E; Larsen, Niels; Pennington, Andrea; Connolly, John; Wallis, Corrin; Rooks, David J; Hall, Neil; McCarthy, Alan J; Allison, Heather E

    2016-01-01

    PCR amplification and sequencing of phylogenetic markers, primarily Small Sub-Unit ribosomal RNA (SSU rRNA) genes, has been the paradigm for defining the taxonomic composition of microbiomes. However, 'universal' SSU rRNA gene PCR primer sets are likely to miss much of the diversity therein. We sequenced a library comprising purified and reverse-transcribed SSU rRNA (RT-SSU rRNA) molecules from the canine oral microbiome and compared it to a general bacterial 16S rRNA gene PCR amplicon library generated from the same biological sample. In addition, we have developed BIONmeta, a novel, open-source, computer package for the processing and taxonomic classification of the randomly fragmented RT-SSU rRNA reads produced. Direct RT-SSU rRNA sequencing revealed that 16S rRNA molecules belonging to the bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria and Spirochaetes, were most abundant in the canine oral microbiome (92.5% of total bacterial SSU rRNA). The direct rRNA sequencing approach detected greater taxonomic diversity (1 additional phylum, 2 classes, 1 order, 10 families and 61 genera) when compared with general bacterial 16S rRNA amplicons from the same sample, simultaneously provided SSU rRNA gene inventories of Bacteria, Archaea and Eukarya, and detected significant numbers of sequences not recognised by 'universal' primer sets. Proteobacteria and Spirochaetes were found to be under-represented by PCR-based analysis of the microbiome, and this was due to primer mismatches and taxon-specific variations in amplification efficiency, validated by qPCR analysis of 16S rRNA amplicons from a mock community. This demonstrated the veracity of direct RT-SSU rRNA sequencing for molecular microbial ecology. PMID:27276347

  11. Characterising the Canine Oral Microbiome by Direct Sequencing of Reverse-Transcribed rRNA Molecules

    PubMed Central

    McDonald, James E.; Larsen, Niels; Pennington, Andrea; Connolly, John; Wallis, Corrin; Rooks, David J.; Hall, Neil; McCarthy, Alan J.; Allison, Heather E.

    2016-01-01

    PCR amplification and sequencing of phylogenetic markers, primarily Small Sub-Unit ribosomal RNA (SSU rRNA) genes, has been the paradigm for defining the taxonomic composition of microbiomes. However, ‘universal’ SSU rRNA gene PCR primer sets are likely to miss much of the diversity therein. We sequenced a library comprising purified and reverse-transcribed SSU rRNA (RT-SSU rRNA) molecules from the canine oral microbiome and compared it to a general bacterial 16S rRNA gene PCR amplicon library generated from the same biological sample. In addition, we have developed BIONmeta, a novel, open-source, computer package for the processing and taxonomic classification of the randomly fragmented RT-SSU rRNA reads produced. Direct RT-SSU rRNA sequencing revealed that 16S rRNA molecules belonging to the bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria and Spirochaetes, were most abundant in the canine oral microbiome (92.5% of total bacterial SSU rRNA). The direct rRNA sequencing approach detected greater taxonomic diversity (1 additional phylum, 2 classes, 1 order, 10 families and 61 genera) when compared with general bacterial 16S rRNA amplicons from the same sample, simultaneously provided SSU rRNA gene inventories of Bacteria, Archaea and Eukarya, and detected significant numbers of sequences not recognised by ‘universal’ primer sets. Proteobacteria and Spirochaetes were found to be under-represented by PCR-based analysis of the microbiome, and this was due to primer mismatches and taxon-specific variations in amplification efficiency, validated by qPCR analysis of 16S rRNA amplicons from a mock community. This demonstrated the veracity of direct RT-SSU rRNA sequencing for molecular microbial ecology. PMID:27276347

  12. Tetrathiobacter kashmirensis Strain CA-1 16S rRNA gene complete sequence.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study used 1326 base pair 16S rRNA gene sequence methods to confirm the identification of a bacterium as Tetrathiobacter kashmirensis. Morphological, biochemical characteristics, and fatty acid profiles are consistent with the 16S rRNA gene sequence identification of the bacterium. The isolate...

  13. In silico detection of tRNA sequence features characteristic to aminoacyl-tRNA synthetase class membership

    PubMed Central

    Jakó, Éena; Ittzés, Péter; Szenes, Áron; Kun, Ádám; Szathmáry, Eörs; Pál, Gábor

    2007-01-01

    Aminoacyl tRNA synthetases (aaRS) are grouped into Class I and II based on primary and tertiary structure and enzyme properties suggesting two independent phylogenetic lineages. Analogously, tRNA molecules can also form two respective classes, based on the class membership of their corresponding aaRS. Although some aaRS–tRNA interactions are not extremely specific and require editing mechanisms to avoid misaminoacylation, most aaRS–tRNA interactions are rather stereospecific. Thus, class-specific aaRS features could be mirrored by class-specific tRNA features. However, previous investigations failed to detect conserved class-specific nucleotides. Here we introduce a discrete mathematical approach that evaluates not only class-specific ‘strictly present’, but also ‘strictly absent’ nucleotides. The disjoint subsets of these elements compose a unique partition, named extended consensus partition (ECP). By analyzing the ECP for both Class I and II tDNA sets from 50 (13 archaeal, 30 bacterial and 7 eukaryotic) species, we could demonstrate that class-specific tRNA sequence features do exist, although not in terms of strictly conserved nucleotides as it had previously been anticipated. This finding demonstrates that important information was hidden in tRNA sequences inaccessible for traditional statistical methods. The ECP analysis might contribute to the understanding of tRNA evolution and could enrich the sequence analysis tool repertoire. PMID:17704131

  14. HIGH SEQUENCE DIVERSITY IN THE RNA SYNTHESIZED AT THE LAMPBRUSH STAGE OF OÖGENESIS*

    PubMed Central

    Davidson, Eric H.; Hough, Barbara R.

    1969-01-01

    Many diverse RNA's are synthesized in the lampbrush stage oöcyte of Xenopus, as shown by the presence of different nucleotide sequences in the RNA population. This fact has been established by hybridizing lampbrush stage oöcyte RNA with an isolated nonrepetitive fraction of Xenopus DNA. PMID:5257126

  15. High sequence diversity in the RNA synthesized at the lampbrush stage of oögenesis.

    PubMed

    Davidson, E H; Hough, B R

    1969-06-01

    Many diverse RNA's are synthesized in the lampbrush stage oöcyte of Xenopus, as shown by the presence of different nucleotide sequences in the RNA population. This fact has been established by hybridizing lampbrush stage oöcyte RNA with an isolated nonrepetitive fraction of Xenopus DNA. PMID:5257126

  16. Profiling miRNA Expression in Bovine Tissues by Deep Sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    miRNA are short RNA sequences (~21 nt long) that have been recently identified and were found to play an important role in gene regulation and controlling major cellular processes. Several miRNA are found to be evolutionarily conserved among the mammalian species. Some miRNAs are even conserved be...

  17. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  18. Strand-specific libraries for high throughput RNA sequencing (RNA-Seq) prepared without poly(A) selection

    PubMed Central

    2012-01-01

    Background: High throughput DNA sequencing technology has enabled quantification of all the RNAs in a cell or tissue, a method widely known as RNA sequencing (RNA-Seq). However, non-coding RNAs such as rRNA are highly abundant and can consume >70% of sequencing reads. A common approach is to extract only polyadenylated mRNA; however, such approaches are blind to RNAs with short or no poly(A) tails, leading to an incomplete view of the transcriptome. Another challenge of preparing RNA-Seq libraries is to preserve the strand information of the RNAs. Design: Here, we describe a procedure for preparing RNA-Seq libraries from 1 to 4 μg total RNA without poly(A) selection. Our method combines the deoxyuridine triphosphate (dUTP)/uracil-DNA glycosylase (UDG) strategy to achieve strand specificity with AMPure XP magnetic beads to perform size selection. Together, these steps eliminate gel purification, allowing a library to be made in less than two days. We barcode each library during the final PCR amplification step, allowing several samples to be sequenced in a single lane without sacrificing read length. Libraries prepared using this protocol are compatible with Illumina GAII, GAIIx and HiSeq 2000 platforms. Discussion: The RNA-Seq protocol described here yields strand-specific transcriptome libraries without poly(A) selection, which provide approximately 90% mappable sequences. Typically, more than 85% of mapped reads correspond to protein-coding genes and only 6% derive from non-coding RNAs. The protocol has been used to measure RNA transcript identity and abundance in tissues from flies, mice, rats, chickens, and frogs, demonstrating its general applicability. PMID:23273270

  19. Changes in nuclear and polysomal polyadenylated RNA sequences during rat-liver regeneration.

    PubMed Central

    Wilkes, P R; Birnie, G D; Paul, J

    1979-01-01

    Nuclear and polysomal polyadenylated RNA populations of normal and 16 hour regenerating rat liver have been compared by mRNA-cDNA hybridisations and by unique DNA saturation experiments. It was found that nuclear polyadenylated RNA hybridises to 6.8% of unique DNA in both normal and 16 hour regenerating rat liver. However, cross-hybridisation experiments using cDNA have shown that 10-15% by weight of nuclear polyadenylated RNA sequences are specific to 16 hour regenerating rat liver. Since both unique DNA and cDNA hybridisation have shown that normal and 16 hour regenerating rat-liver polysomal polyadenylated RNA populations are qualitatively very similar, the sequences specific to 16 hour regenerating rat-liver nuclear polyadenylated RNA must be confined to the nucleus. Polysomal RNA sequences which were abundant in normal rat liver have become less abundant in regenerating rat liver. PMID:461186

  20. JAR3D Webserver: Scoring and aligning RNA loop sequences to known 3D motifs.

    PubMed

    Roll, James; Zirbel, Craig L; Sweeney, Blake; Petrov, Anton I; Leontis, Neocles

    2016-07-01

    Many non-coding RNAs have been identified and may function by forming 2D and 3D structures. RNA hairpin and internal loops are often represented as unstructured on secondary structure diagrams, but RNA 3D structures show that most such loops are structured by non-Watson-Crick basepairs and base stacking. Moreover, different RNA sequences can form the same RNA 3D motif. JAR3D finds possible 3D geometries for hairpin and internal loops by matching loop sequences to motif groups from the RNA 3D Motif Atlas, by exact sequence match when possible, and by probabilistic scoring and edit distance for novel sequences. The scoring gauges the ability of the sequences to form the same pattern of interactions observed in 3D structures of the motif. The JAR3D webserver at http://rna.bgsu.edu/jar3d/ takes one or many sequences of a single loop as input, or else one or many sequences of longer RNAs with multiple loops. Each sequence is scored against all current motif groups. The output shows the ten best-matching motif groups. Users can align input sequences to each of the motif groups found by JAR3D. JAR3D will be updated with every release of the RNA 3D Motif Atlas, and so its performance is expected to improve over time. PMID:27235417
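
    The matching strategy described above (exact sequence match when possible, with edit distance as one fallback for novel sequences) can be illustrated with a toy sketch. This is not JAR3D's implementation and it omits the probabilistic scoring entirely; the group names and loop sequences below are hypothetical placeholders rather than entries from the RNA 3D Motif Atlas.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (unit-cost edits)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def best_group(loop, groups):
    """Return (group_name, distance): exact match first, else nearest by edit distance."""
    for name, seqs in groups.items():
        if loop in seqs:
            return name, 0
    scored = (
        (name, min(edit_distance(loop, s) for s in seqs))
        for name, seqs in groups.items()
    )
    return min(scored, key=lambda t: t[1])


# Hypothetical motif groups with made-up member sequences.
groups = {
    "group_A": ["GAAA", "GCAA"],
    "group_B": ["UUCG", "UACG"],
}
print(best_group("GAAA", groups))  # exact match in group_A
print(best_group("GGAA", groups))  # no exact match; nearest group by edit distance
```

    A real scoring scheme would also need to weigh how well a sequence supports the interactions seen in the 3D motif, which plain edit distance does not capture.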

  1. Replicating satellite RNA induces sequence-specific DNA methylation and truncated transcripts in plants.

    PubMed Central

    Wang, M B; Wesley, S V; Finnegan, E J; Smith, N A; Waterhouse, P M

    2001-01-01

    Tobacco plants were transformed with a chimeric transgene comprising sequences encoding beta-glucuronidase (GUS) and the satellite RNA (satRNA) of cereal yellow dwarf luteovirus. When transgenic plants were infected with potato leafroll luteovirus (PLRV), which replicated the transgene-derived satRNA to a high level, the satellite sequence of the GUS:Sat transgene became densely methylated. Within the satellite region, all 86 cytosines in the upper strand and 73 of the 75 cytosines in the lower strand were either partially or fully methylated. In contrast, very low levels of DNA methylation were detected in the satellite sequence of the transgene in uninfected plants and in the flanking nonsatellite sequences in both infected and uninfected plants. Substantial amounts of truncated GUS:Sat RNA accumulated in the satRNA-replicating plants, and most of the molecules terminated at nucleotides within the first 60 bp of the satellite sequence. Whereas this RNA truncation was associated with high levels of satRNA replication, it appeared to be independent of the levels of DNA methylation in the satellite sequence, suggesting that it is not caused by methylation. All the sequenced GUS:Sat DNA molecules were hypermethylated in plants with replicating satRNA despite the phloem restriction of the helper PLRV. Also, small, sense and antisense approximately 22 nt RNAs, derived from the satRNA, were associated with the replicating satellite. These results suggest that the sequence-specific DNA methylation spread into cells in which no satRNA replication occurred and that this was mediated by the spread of unamplified satRNA and/or its associated 22 nt RNA molecules. PMID:11214177

  2. Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    PubMed Central

    Krzyzanowski, Paul M.; Price, Feodor D.; Muro, Enrique M.; Rudnicki, Michael A.; Andrade-Navarro, Miguel A.

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking ESTs data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  3. Integration of expressed sequence tag data flanking predicted RNA secondary structures facilitates novel non-coding RNA discovery.

    PubMed

    Krzyzanowski, Paul M; Price, Feodor D; Muro, Enrique M; Rudnicki, Michael A; Andrade-Navarro, Miguel A

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking ESTs data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  4. The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing.

    PubMed

    Cieslik, Marcin; Chugh, Rashmi; Wu, Yi-Mi; Wu, Ming; Brennan, Christine; Lonigro, Robert; Su, Fengyun; Wang, Rui; Siddiqui, Javed; Mehra, Rohit; Cao, Xuhong; Lucas, David; Chinnaiyan, Arul M; Robinson, Dan

    2015-09-01

    RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and clinical samples, especially when specimens are formalin-fixed. To facilitate the use of RNA sequencing beyond cell lines and in the clinical setting, we developed an exome-capture transcriptome protocol with greatly improved performance on degraded RNA. Capture transcriptome libraries enable measuring absolute and differential gene expression, calling genetic variants, and detecting gene fusions. Through validation against gold-standard poly(A) and Ribo-Zero libraries from intact RNA, we show that capture RNA-seq provides accurate and unbiased estimates of RNA abundance, uniform transcript coverage, and broad dynamic range. Unlike poly(A) selection and Ribo-Zero depletion, capture libraries retain these qualities regardless of RNA quality and provide excellent data from clinical specimens including formalin-fixed paraffin-embedded (FFPE) blocks. Systematic improvements across key applications of RNA-seq are shown on a cohort of prostate cancer patients and a set of clinical FFPE samples. Further, we demonstrate the utility of capture RNA-seq libraries in a patient with a highly malignant solitary fibrous tumor (SFT) enrolled in our clinical sequencing program called MI-ONCOSEQ. Capture transcriptome profiling from FFPE revealed two oncogenic fusions: the pathognomonic NAB2-STAT6 inversion and a therapeutically actionable BRAF fusion, which may drive this specific cancer's aggressive phenotype. PMID:26253700

  5. AB053. MicroRNA expression profile in penile cancer revealed by next-generation small RNA sequencing

    PubMed Central

    Zhang, Li; Wei, Pengfei

    2016-01-01

    Objective: Penile cancer (PeCa) is a relatively rare tumor entity but has higher morbidity and mortality rates, especially in developing countries. To date, the pathogenic signaling pathways and core machineries involved in tumorigenesis and progression of PeCa remain to be elucidated. Several studies have suggested that miRNAs, which modulate gene expression at the posttranscriptional level, are frequently mis-regulated and aberrantly expressed in human cancers. However, the miRNA profile in human PeCa has not been reported before. Methods: In the present study, the miRNA profile was obtained from 10 fresh penile cancerous tissues and matched adjacent non-cancerous tissues via next-generation sequencing. Results: A total of 751 and 806 annotated miRNAs were identified in normal and cancerous penile tissues, respectively. Among these, 56 miRNAs with significantly different expression levels between paired tissues were identified. Subsequently, several annotated miRNAs were randomly selected and validated using quantitative real-time PCR. Compared with previous publications on altered miRNA expression in various cancers, especially genitourinary (prostate, bladder, kidney, testis) cancers, the vast majority of deregulated miRNAs showed a similar expression pattern in penile cancer. Moreover, the bioinformatics analyses suggested that the putative target genes of miRNAs differentially expressed between cancerous and matched normal penile tissues were tightly associated with cell junction, proliferation, growth and genomic instability, by modulating Wnt, MAPK, p53, PI3K-Akt, Notch and TGF-β signaling pathways, all of which are well established to participate in cancer initiation and progression.
Conclusions: Our work presents a global view of the differentially expressed miRNAs and potentially regulatory networks of their target genes for clarifying the pathogenic transformation of normal penis to PeCa, which research resource

  6. Globin mRNA reduction for whole-blood transcriptome sequencing.

    PubMed

    Krjutškov, Kaarel; Koel, Mariann; Roost, Anne Mari; Katayama, Shintaro; Einarsdottir, Elisabet; Jouhilahti, Eeva-Mari; Söderhäll, Cilla; Jaakma, Ülle; Plaas, Mario; Vesterlund, Liselotte; Lohi, Hannes; Salumets, Andres; Kere, Juha

    2016-01-01

    The transcriptome analysis of whole-blood RNA by sequencing holds promise for the identification and tracking of biomarkers; however, the high globin mRNA (gmRNA) content of erythrocytes hampers whole-blood and buffy coat analyses. We introduce a novel gmRNA locking assay (GlobinLock, GL) as a robust and simple gmRNA reduction tool to preserve RNA quality, save time and cost. GL consists of a pair of gmRNA-specific oligonucleotides in RNA initial denaturation buffer that is effective immediately after RNA denaturation and adds only ten minutes of incubation to the whole cDNA synthesis procedure when compared to non-blood RNA analysis. We show that GL is fully effective not only for human samples but also for mouse and rat, and so far incompletely studied cow, dog and zebrafish. PMID:27515369

  7. Globin mRNA reduction for whole-blood transcriptome sequencing

    PubMed Central

    Krjutškov, Kaarel; Koel, Mariann; Roost, Anne Mari; Katayama, Shintaro; Einarsdottir, Elisabet; Jouhilahti, Eeva-Mari; Söderhäll, Cilla; Jaakma, Ülle; Plaas, Mario; Vesterlund, Liselotte; Lohi, Hannes; Salumets, Andres; Kere, Juha

    2016-01-01

    The transcriptome analysis of whole-blood RNA by sequencing holds promise for the identification and tracking of biomarkers; however, the high globin mRNA (gmRNA) content of erythrocytes hampers whole-blood and buffy coat analyses. We introduce a novel gmRNA locking assay (GlobinLock, GL) as a robust and simple gmRNA reduction tool to preserve RNA quality, save time and cost. GL consists of a pair of gmRNA-specific oligonucleotides in RNA initial denaturation buffer that is effective immediately after RNA denaturation and adds only ten minutes of incubation to the whole cDNA synthesis procedure when compared to non-blood RNA analysis. We show that GL is fully effective not only for human samples but also for mouse and rat, and so far incompletely studied cow, dog and zebrafish. PMID:27515369

  8. Sequence-specific cleavage of dsRNA by Mini-III RNase

    PubMed Central

    Głów, Dawid; Pianka, Dariusz; Sulej, Agata A.; Kozłowski, Łukasz P.; Czarnecka, Justyna; Chojnowski, Grzegorz; Skowronek, Krzysztof J.; Bujnicki, Janusz M.

    2015-01-01

    Ribonucleases (RNases) play a critical role in RNA processing and degradation by hydrolyzing phosphodiester bonds (exo- or endonucleolytically). Many RNases that cut RNA internally exhibit substrate specificity, but their target sites are usually limited to one or a few specific nucleotides in single-stranded RNA and often in a context of a particular three-dimensional structure of the substrate. Thus far, no RNase counterparts of restriction enzymes have been identified which could cleave double-stranded RNA (dsRNA) in a sequence-specific manner. Here, we present evidence for a sequence-dependent cleavage of long dsRNA by RNase Mini-III from Bacillus subtilis (BsMiniIII). Analysis of the sites cleaved by this enzyme in limited digest of bacteriophage Φ6 dsRNA led to the identification of a consensus target sequence. We defined nucleotide residues within the preferred cleavage site that affected the efficiency of the cleavage and were essential for the discrimination of cleavable versus non-cleavable dsRNA sequences. We have also determined that the loop α5b-α6, a distinctive structural element in Mini-III RNases, is crucial for the specific cleavage, but not for dsRNA binding. Our results suggest that BsMiniIII may serve as a prototype of a sequence-specific dsRNase that could possibly be used for targeted cleavage of dsRNA. PMID:25634891

  9. Sequence of instability processes triggered by heavy rainfall in the northern Italy

    NASA Astrophysics Data System (ADS)

    Luino, Fabio

    2005-03-01

    Northern Italy is a geomorphologically heterogeneous region: high mountains, wide valleys, gentle hills and a large plain form a very varied landscape and influence the temperate climate of the area. The Alps region has harsh winters and moderately warm summers with abundant rainfall. The Po Plain has harsh winters with long periods of subfreezing temperatures and warm sultry summers, with rainfall more common in winter. Geomorphic instability processes are very common. Almost every year, landslides, mud flows and debris flows in the Alpine areas and flooding in the Po flood plain cause severe damage to structures and infrastructure and often claim human lives. Analyses of major events that have struck northern Italy over the last 35 years have provided numerous useful data for the recognition of various rainfall-triggering processes and their sequence of development in relation to the intensity and duration of rainfall. Findings acquired during and after these events emphasise that the quantity and typology of instability processes triggered by rainfall are related not only to an area's morphological and geological characteristics but also to intense rainfall distribution during meteorological disturbances. Moreover, critical rainfall thresholds can vary from place to place in relation to the climatic and geomorphological conditions of the area. Once the threshold has been exceeded, which is about 10% of the local mean annual rainfall (MAR), the instability processes on the slopes and along the hydrographic networks follow a sequence that can be reconstructed in three different phases. In the first phase, the initial instability processes that can usually be observed are soil slips on steep slopes, mud-debris flows in small basins of less than 20 km² in area, while discharge increases substantially in larger stream basins of up to 500 km². In continuous precipitation, in the second phase, first mud-debris flows can be triggered also in basins larger than 20 km²

  10. In vitro selection of an RNA sequence that interacts with high affinity with thymidylate synthase

    PubMed Central

    Lin, Xiukun; Mizunuma, Nobuyuki; Chen, Tian-men; Copur, Sitki M.; Maley, Gladys F.; Liu, Jun; Maley, Frank; Chu, Edward

    2000-01-01

    Previous studies have shown that the repressive effect of thymidylate synthase (TS) mRNA translation is mediated by direct binding of TS itself to two cis-acting elements on its cognate mRNA. To identify the optimal RNA nucleotides that interact with TS, we in vitro synthesized a completely degenerate, linear RNA pool of 25 nt and employed in vitro selection to isolate high affinity RNA ligands that bind human TS protein. After 10 rounds of selection and amplification, a single RNA molecule was selected that bound TS protein with nearly 20-fold greater affinity than native, wild-type TS RNA sequences. Secondary structure analysis of this RNA sequence predicted it to possess a stem–loop structure. Deletion and/or modification of the UGU loop element within the RNA sequence decreased binding to TS by up to 1000-fold. In vivo transfection experiments revealed that the presence of the selected RNA sequence resulted in a significant increase in the expression of a heterologous luciferase reporter construct in human colon cancer H630 and TS-overexpressing HCT-C:His-TS+ cells, but not in HCT-C18 cells expressing a functionally inactive TS. In addition, the presence of this element in H630 cells leads to induced expression of TS protein. An immunoprecipitation method using RT–PCR confirmed a direct interaction between human TS protein and the selected RNA sequence in transfected human cancer H630 cells. This study identified a novel RNA sequence from a degenerate RNA library that specifically interacts with TS. PMID:11058126
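
    As a rough illustration of the library size implied by a completely degenerate 25-nt pool, the sketch below counts the distinct sequences, assuming four possible nucleotides at every position. The function name is hypothetical, and the abstract does not state how much of this theoretical space was actually sampled.

```python
def degenerate_pool_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences in a fully degenerate pool.

    Assumes every position can independently hold any of
    `alphabet_size` nucleotides (A, C, G, U by default).
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# A 25-nt pool has 4**25 = 2**50 theoretical members.
print(degenerate_pool_size(25))  # 1125899906842624
```

    The count (about 1.1 x 10^15) is an upper bound on diversity, not a statement about the physical pool used in the study.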

  11. In vitro selection of an RNA sequence that interacts with high affinity with thymidylate synthase.

    PubMed

    Lin, X; Mizunuma, N; Chen, T; Copur, S M; Maley, G F; Liu, J; Maley, F; Chu, E

    2000-11-01

    Previous studies have shown that the repressive effect of thymidylate synthase (TS) mRNA translation is mediated by direct binding of TS itself to two cis-acting elements on its cognate mRNA. To identify the optimal RNA nucleotides that interact with TS, we in vitro synthesized a completely degenerate, linear RNA pool of 25 nt and employed in vitro selection to isolate high affinity RNA ligands that bind human TS protein. After 10 rounds of selection and amplification, a single RNA molecule was selected that bound TS protein with nearly 20-fold greater affinity than native, wild-type TS RNA sequences. Secondary structure analysis of this RNA sequence predicted it to possess a stem-loop structure. Deletion and/or modification of the UGU loop element within the RNA sequence decreased binding to TS by up to 1000-fold. In vivo transfection experiments revealed that the presence of the selected RNA sequence resulted in a significant increase in the expression of a heterologous luciferase reporter construct in human colon cancer H630 and TS-overexpressing HCT-C:His-TS+ cells, but not in HCT-C18 cells expressing a functionally inactive TS. In addition, the presence of this element in H630 cells leads to induced expression of TS protein. An immunoprecipitation method using RT-PCR confirmed a direct interaction between human TS protein and the selected RNA sequence in transfected human cancer H630 cells. This study identified a novel RNA sequence from a degenerate RNA library that specifically interacts with TS. PMID:11058126

  12. Nucleotide sequence of 3' untranslated portion of human alpha globin mRNA.

    PubMed Central

    Wilson, J T; deRiel, J K; Forget, B G; Marotta, C A; Weissman, S M

    1977-01-01

    We have determined the nucleotide sequence of 75 nucleotides of the 3'-untranslated portion of normal human alpha globin mRNA which corresponds to the elongated amino acid sequence of the chain termination mutant Hb Constant Spring. This was accomplished by sequence analysis of cDNA fragments obtained by restriction endonuclease or T4 endonuclease IV cleavage of human globin cDNA synthesized from globin mRNA by use of viral reverse transcriptase. Analysis of cRNA synthesized from cDNA by use of RNA polymerase provided additional confirmatory sequence information. Possible polymorphism has been identified at one site of the sequence. Our sequence overlaps with, and extends, the sequence of 43 nucleotides determined by Proudfoot and coworkers for the very 3'-terminal portion of human alpha globin mRNA. The complete 3'-untranslated sequence of human alpha globin mRNA (112 nucleotides including termination codon) shows little homology to that of the human or rabbit beta globin mRNAs except for the presence of the hexanucleotide sequence AAUAAA which is found in most eukaryotic mRNAs near the 3'-terminal poly(A). PMID:909779
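
    The AAUAAA hexanucleotide mentioned above can be located in any RNA string with a simple scan. The sketch below is a minimal illustration; the function name and the example sequence are hypothetical and are not taken from the alpha globin mRNA.

```python
from typing import List


def find_polya_signals(rna: str, motif: str = "AAUAAA") -> List[int]:
    """Return 0-based start positions of every occurrence of `motif`.

    DNA-style input is tolerated by converting T to U first.
    Overlapping occurrences are reported as well.
    """
    rna = rna.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


# Made-up 22-nt sequence used only to exercise the function.
example = "CCGGAUUCAAUAAAGGCUUCGA"
hits = find_polya_signals(example)
print(hits)  # [8]
# Distance from the end of the motif to the 3' end of the sequence.
print([len(example) - (p + 6) for p in hits])  # [8]
```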

  13. Interspersion of sequences in avian myeloblastosis virus RNA that rapidly hybridize with leukemic chicken cell DNA.

    PubMed Central

    Drohan, W N; Shoyab, M; Wall, R; Baluda, M A

    1975-01-01

    Liquid hybridization of progressively smaller fragments (35S, 27S, 15.5S, 12.5S, and 8S) of poly(A)-selected avian myeloblastosis virus RNA with excess DNA from leukemic chicken myeloblasts revealed that all sizes of RNA contained sequences complementary to both slowly and rapidly hybridizing cellular DNA sequences. Apparently, the RNA sequences which hybridize rapidly with excesses of cellular DNA are not restricted to any one region of the avian myeloblastosis virus 35S RNA. Instead, they appear to be randomly distributed over the entire 35S avian myeloblastosis virus RNA molecule with some positioned within 200 nucleotides of the poly(A) tract at the 3' end of the RNA. PMID:163372

  14. Molecular basis of sequence-specific recognition of pre-ribosomal RNA by nucleolin

    PubMed Central

    Allain, Frédéric H.-T.; Bouvet, Philippe; Dieckmann, Thorsten; Feigon, Juli

    2000-01-01

    The structure of the 28 kDa complex of the first two RNA binding domains (RBDs) of nucleolin (RBD12) with an RNA stem–loop that includes the nucleolin recognition element UCCCGA in the loop was determined by NMR spectroscopy. The structure of nucleolin RBD12 with the nucleolin recognition element (NRE) reveals that the two RBDs bind on opposite sides of the RNA loop, forming a molecular clamp that brings the 5′ and 3′ ends of the recognition sequence close together and stabilizing the stem–loop. The specific interactions observed in the structure explain the sequence specificity for the NRE sequence. Binding studies of mutant proteins and analysis of conserved residues support the proposed interactions. The mode of interaction of the protein with the RNA and the location of the putative NRE sites suggest that nucleolin may function as an RNA chaperone to prevent improper folding of the nascent pre-rRNA. PMID:11118222

  15. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin.

    PubMed

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-01-01

    Constant efforts have been made to develop new methods for sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction proceeds through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to give an unnatural RNA terminus, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation. PMID:26461456

  16. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    NASA Astrophysics Data System (ADS)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods for sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction proceeds through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to give an unnatural RNA terminus, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  17. Plant RNA virus sequences identified in kimchi by microbial metatranscriptome analysis.

    PubMed

    Kim, Dong Seon; Jung, Ji Young; Wang, Yao; Oh, Hye Ji; Choi, Dongjin; Jeon, Che Ok; Hahn, Yoonsoo

    2014-07-01

    Plant pathogenic RNA viruses are present in a variety of plant-based foods. When ingested by humans, these viruses can survive the passage through the digestive tract, and are frequently detected in human feces. Kimchi is a traditional fermented Korean food made from cabbage or vegetables, with a variety of other plant-based ingredients, including ground red pepper and garlic paste. We analyzed microbial metatranscriptome data from kimchi at five fermentation stages to identify plant RNA virus-derived sequences. We successfully identified a substantial amount of plant RNA virus sequences, especially during the early stages of fermentation: 23.47% and 16.45% of total clean reads on days 7 and 13, respectively. The most abundant plant RNA virus sequences were from pepper mild mottle virus, a major pathogen of red peppers; this constituted 95% of the total RNA virus sequences identified throughout the fermentation period. We observed distinct sequencing read-depth distributions for plant RNA virus genomes, possibly implying intrinsic and/or technical biases during the metatranscriptome generation procedure. We also identified RNA virus sequences in publicly available microbial metatranscriptome data sets. We propose that metatranscriptome data may serve as a valuable resource for RNA virus detection, and a systematic screening of the ingredients may help prevent the use of virus-infected low-quality materials for food production. PMID:24836186

  18. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 
The

  19. Preparation of cDNA libraries for high-throughput RNA sequencing analysis of RNA 5′ ends

    PubMed Central

    Vvedenskaya, Irina O.; Goldman, Seth R.; Nickels, Bryce E.

    2015-01-01

    We provide a detailed protocol for preparing cDNA libraries suitable for high throughput sequencing that are derived specifically from the 5′ ends of RNA (5′ specific RNA-seq). The protocol describes how cDNA libraries for 5′ specific RNA-seq can be tailored to analyze specific classes of RNAs based upon the phosphorylation status of the 5′ end. Thus, the analysis of cDNA libraries generated by these methods provides information regarding both the sequence and phosphorylation status of the 5′ ends of RNAs. 5′ specific RNA-seq can be used to analyze transcription initiation and post-transcriptional processing of RNAs with single base pair resolution on a genome-wide level. PMID:25665566

  20. 5S RNA sequence from the Philosamia silkworm: evidence for variable evolutionary rates in insect 5S RNA.

    PubMed Central

    Xian-Rong, G; Nicoghosian, K; Cedergren, R J

    1982-01-01

    The primary structure of 5S RNA isolated from the posterior silkgland of Philosamia cynthia ricini was determined using three in vitro labelling techniques. The derived sequence consists of 119 nucleotides and can be folded into the secondary structure model proposed for eukaryotic 5S RNAs. This 5S RNA differs from the Bombyx mori molecule in 9 positions and from the Drosophila melanogaster sequence in 14 positions. The comparison of evolutionary rates in insect 5S RNA with inferred rates in other eukaryotic phyla leads to the conclusion that 5S RNA evolution is not constant in different eukaryotic branches, a condition which must be taken into account in phylogenetic tree constructions. PMID:7145713

  1. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    PubMed Central

    Jenior, Matthew L.; Koumpouras, Charles C.; Westcott, Sarah L.; Highlander, Sarah K.

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806
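
    The reported drop in observed error rate for the V1–V9 region (0.69% to 0.027%) can be put in fold terms with the arithmetic below. The helper names and the mismatch counts in the usage lines are hypothetical; only the two percentages come from the abstract.

```python
def error_rate_percent(mismatches: int, aligned_bases: int) -> float:
    """Observed error rate, in percent, against a known reference."""
    if aligned_bases <= 0:
        raise ValueError("aligned_bases must be positive")
    return 100.0 * mismatches / aligned_bases


def fold_reduction(before_pct: float, after_pct: float) -> float:
    """How many times smaller the error rate is after curation."""
    if after_pct <= 0:
        raise ValueError("after_pct must be positive")
    return before_pct / after_pct


# Percentages quoted in the abstract for the V1-V9 region.
print(round(fold_reduction(0.69, 0.027), 1))  # 25.6

# Hypothetical counts, only to show how a rate would be computed.
print(error_rate_percent(69, 10_000))  # 0.69
```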

  2. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    PubMed

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806

  3. A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

    NASA Astrophysics Data System (ADS)

    Regoli, Massimo

    2009-02-01

    The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for this new algorithm comes from the observation of nature, in particular from the observation of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, derived from the term "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are under extensive research by biologists but, in our case, we will use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key to code the messages. In the RNA-Crypto System algorithm the introns are sections of the ciphered message with non-coding information, as in the precursor mRNA.
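
    The abstract does not give the details of RCS, so the sketch below is not that algorithm. It is only a toy illustration of the intron analogy: filler bytes carrying no information are interleaved with a payload at positions derived from a shared key, and a receiver holding the same key splices them out. All names are hypothetical, the key is assumed to be an integer, and the scheme offers no real security.

```python
import os
import random


def _intron_positions(key: int, total_len: int, n_introns: int) -> set:
    """Derive the filler positions deterministically from the shared key."""
    rng = random.Random(key)
    return set(rng.sample(range(total_len), n_introns))


def add_introns(payload: bytes, key: int, n_introns: int) -> bytes:
    """Interleave `n_introns` random filler bytes into the payload."""
    total = len(payload) + n_introns
    positions = _intron_positions(key, total, n_introns)
    filler = iter(os.urandom(n_introns))  # non-coding bytes
    coding = iter(payload)
    out = bytearray()
    for i in range(total):
        out.append(next(filler) if i in positions else next(coding))
    return bytes(out)


def strip_introns(data: bytes, key: int, n_introns: int) -> bytes:
    """Splice the filler bytes back out using the same key."""
    positions = _intron_positions(key, len(data), n_introns)
    return bytes(b for i, b in enumerate(data) if i not in positions)


mixed = add_introns(b"hello", key=42, n_introns=3)
print(len(mixed))                                 # 8
print(strip_introns(mixed, key=42, n_introns=3))  # b'hello'
```

    In a real design the payload would itself be encrypted first; here it is left in the clear so that only the splicing step is shown.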

  4. Deep Sequencing of RNA from Ancient Maize Kernels

    PubMed Central

    Rasmussen, Morten; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Alquezar-Planas, David E.; Penfield, Steven; Brown, Terence A.; Vielle-Calzada, Jean-Philippe; Montiel, Rafael; Jørgensen, Tina; Odegaard, Nancy; Jacobs, Michael; Arriaza, Bernardo; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Willerslev, Eske; Gilbert, M. Thomas P.

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited – perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication. PMID:23326310

  5. Enzymatic aminoacylation of sequence-specific RNA minihelices and hybrid duplexes with methionine.

    PubMed Central

    Martinis, S A; Schimmel, P

    1992-01-01

    RNA hairpin helices whose sequences are based on the acceptor stems of alanine and histidine tRNAs are specifically aminoacylated with their cognate amino acids. In these examples, major determinants for the identities of the respective tRNAs reside in the acceptor stem; the anticodon and other parts of the tRNA are dispensable for aminoacylation. In contrast, the anticodon is a major determinant for the identity of a methionine tRNA. RNA hairpin helices and hybrid duplexes that reconstruct the acceptor-TψC stem and the acceptor stem, respectively, of methionine tRNA were investigated here for aminoacylation with methionine. Direct visualization of the aminoacylated RNA product on an acidic polyacrylamide gel by phosphor imaging demonstrated specific aminoacylation with substrates that contained as few as 7 base pairs. No aminoacylation with methionine was detected with several analogous RNA substrates whose sequences were based on noncognate tRNAs. While the efficiency of aminoacylation is reduced by orders of magnitude relative to methionine tRNA, the results establish that specific aminoacylation with methionine of small duplex substrates can be achieved without the anticodon or other domains of the tRNA. The results, combined with earlier studies, suggest a highly specific adaptation of the structures of aminoacyl-tRNA synthetases to the acceptor stems of their cognate tRNAs, resulting in a relationship between the nucleotide sequences/structures of small RNA duplexes and specific amino acids. PMID:1729719

  6. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses.

    PubMed

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-03-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as "DECS-C," is a powerful method for detecting novel plant viruses. PMID:27072419

  7. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses

    PubMed Central

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-01-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as “DECS-C,” is a powerful method for detecting novel plant viruses. PMID:27072419

  8. Sequence and functional characterization of RNase P RNA from the chl a/b containing cyanobacterium Prochlorothrix hollandica.

    PubMed

    Fingerhut, C; Schön, A

    1998-05-29

    Only a few complete sequences and very limited functional data are available for the catalytic RNA component of cyanobacterial RNase P. The RNase P RNA from the chl a/b containing cyanobacterium Prochlorothrix hollandica belongs to a rarely found structural subtype with an extended P15/16 domain. We have established conditions for optimal in vitro ribozyme activity, and determined the kinetic parameters for cleavage of pre-tRNA(Tyr). Analysis of pre-tRNA mutants revealed that the T-stem sequence only plays a modulating role, whereas the CCA end is essential for efficient product formation. PMID:9654127

  9. Predicting RNA secondary structures from sequence and probing data.

    PubMed

    Lorenz, Ronny; Wolfinger, Michael T; Tanzer, Andrea; Hofacker, Ivo L

    2016-07-01

    RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information. PMID:27064083
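
    To make the idea of predicting a secondary structure from sequence concrete, the sketch below implements the classic Nussinov-style base-pair maximization. It is a deliberately simplified stand-in: it is not an energy model, it does not consider the structure ensemble, and it does not use probing data, all of which the abstract refers to. The minimum hairpin loop of 3 nt and the allowed pairs (A-U, G-C, G-U) are assumptions of this example.

```python
def nussinov(seq: str, min_loop: int = 3):
    """Return (max_pairs, dot_bracket) by base-pair maximization."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # dp[i][j] = maximum number of pairs within seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: j is unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback to recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (dp[0][n - 1] if n else 0), "".join(structure)


print(nussinov("GGGAAAUCC"))  # (3, '(((...)))')
```

    The triple loop makes this O(n^3) in time and O(n^2) in memory, which is fine for short sequences but is one reason production tools rely on more refined recursions and energy parameters.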

  10. RNA sequencing of Sleeping Beauty transposon-induced tumors detects transposon-RNA fusions in forward genetic cancer screens.

    PubMed

    Temiz, Nuri A; Moriarity, Branden S; Wolf, Natalie K; Riordan, Jesse D; Dupuy, Adam J; Largaespada, David A; Sarver, Aaron L

    2016-01-01

    Forward genetic screens using Sleeping Beauty (SB)-mobilized T2/Onc transposons have been used to identify common insertion sites (CISs) associated with tumor formation. Recurrent sites of transposon insertion are commonly identified using ligation-mediated PCR (LM-PCR). Here, we use RNA sequencing (RNA-seq) data to directly identify transcriptional events mediated by T2/Onc. Surprisingly, the majority (∼80%) of LM-PCR identified junction fragments do not lead to observable changes in RNA transcripts. However, in CIS regions, direct transcriptional effects of transposon insertions are observed. We developed an automated method to systematically identify T2/Onc-genome RNA fusion sequences in RNA-seq data. RNA fusion-based CISs were identified corresponding to both DNA-based CISs (Cdkn2a, Mycl1, Nf2, Pten, Sema6d, and Rere) and additional regions strongly associated with cancer that were not observed by LM-PCR (Myc, Akt1, Pth, Csf1r, Fgfr2, Wisp1, Map3k5, and Map4k3). In addition to calculating recurrent CISs, we also present complementary methods to identify potential driver events via determination of strongly supported fusions and fusions with large transcript level changes in the absence of multitumor recurrence. These methods independently identify CIS regions and also point to cancer-associated genes like Braf. We anticipate RNA-seq analyses of tumors from forward genetic screens will become an efficient tool to identify causal events. PMID:26553456

  11. RNA sequencing of Sleeping Beauty transposon-induced tumors detects transposon-RNA fusions in forward genetic cancer screens

    PubMed Central

    Temiz, Nuri A.; Moriarity, Branden S.; Wolf, Natalie K.; Riordan, Jesse D.; Dupuy, Adam J.; Largaespada, David A.; Sarver, Aaron L.

    2016-01-01

    Forward genetic screens using Sleeping Beauty (SB)-mobilized T2/Onc transposons have been used to identify common insertion sites (CISs) associated with tumor formation. Recurrent sites of transposon insertion are commonly identified using ligation-mediated PCR (LM-PCR). Here, we use RNA sequencing (RNA-seq) data to directly identify transcriptional events mediated by T2/Onc. Surprisingly, the majority (∼80%) of LM-PCR identified junction fragments do not lead to observable changes in RNA transcripts. However, in CIS regions, direct transcriptional effects of transposon insertions are observed. We developed an automated method to systematically identify T2/Onc-genome RNA fusion sequences in RNA-seq data. RNA fusion-based CISs were identified corresponding to both DNA-based CISs (Cdkn2a, Mycl1, Nf2, Pten, Sema6d, and Rere) and additional regions strongly associated with cancer that were not observed by LM-PCR (Myc, Akt1, Pth, Csf1r, Fgfr2, Wisp1, Map3k5, and Map4k3). In addition to calculating recurrent CISs, we also present complementary methods to identify potential driver events via determination of strongly supported fusions and fusions with large transcript level changes in the absence of multitumor recurrence. These methods independently identify CIS regions and also point to cancer-associated genes like Braf. We anticipate RNA-seq analyses of tumors from forward genetic screens will become an efficient tool to identify causal events. PMID:26553456

  12. The nucleotide sequence of the large ribosomal RNA gene and the adjacent tRNA genes from rat mitochondria.

    PubMed Central

    Saccone, C; Cantatore, P; Gadaleta, G; Gallerani, R; Lanave, C; Pepe, G; Kroon, A M

    1981-01-01

    We have sequenced the Eco R(1) fragment D from rat mitochondrial DNA. It contains one third of the tRNA(Val) gene (the remaining part has been sequenced from the 3' end of the Eco R(1) fragment A), the complete gene for the large mt 16S rRNA, the tRNA(Leu) gene and the 5' end of an unidentified reading frame. The mt gene for the large rRNA from rat has been aligned with the homologous genes from mouse and human using graphic computer programs. Hypervariable regions at the center of the molecule and highly conserved regions toward the 3' end have been detected. The mt gene for tRNA(Leu) is of the conventional type and its primary structure is highly conserved among mammals. The mt gene for tRNA(Val) shows characteristics similar to those of other mt tRNA genes but the degree of homology is lower. Comparative studies confirm that AGA and AGG are read as stop codons in mammalian mitochondria. PMID:6913863
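The closing observation of this abstract, that AGA and AGG act as stop codons in mammalian mitochondria, can be illustrated with a short sketch. The code below is not from the paper: it assumes the widely used NCBI-style codon table layout (bases ordered T, C, A, G) for the standard and vertebrate mitochondrial codes, and the helper name `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Codons enumerated in T, C, A, G order (assumed NCBI-style table layout).
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# One-letter amino acids per codon; '*' marks a stop codon.
STANDARD = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))
# Vertebrate mitochondrial code: AGA/AGG = stop, ATA = Met, TGA = Trp.
VERT_MITO = dict(zip(
    CODONS,
    "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
))


def translate(dna, table):
    """Translate a DNA coding sequence in frame 0 until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    seq = "ATGATAAGATGA"  # hypothetical example sequence
    print(translate(seq, STANDARD))   # AGA is read as arginine
    print(translate(seq, VERT_MITO))  # AGA terminates translation
```

Under these assumed tables, the same reading frame is cut short in the mitochondrial code because AGA is read as a stop, whereas the standard code reads through it as arginine.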

  13. miRBase: integrating microRNA annotation and deep-sequencing data.

    PubMed

    Kozomara, Ana; Griffiths-Jones, Sam

    2011-01-01

    miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences. Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery. We have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings. The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression. These data can be used as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow us to revisit previous annotations. miRBase is available online at: http://www.mirbase.org/. PMID:21037258

  14. Identification of two proteins that bind to a pyrimidine-rich sequence in the 3'-untranslated region of GAP-43 mRNA.

    PubMed Central

    Irwin, N; Baekelandt, V; Goritchenko, L; Benowitz, L I

    1997-01-01

    GAP-43 is a membrane phosphoprotein that is important for the development and plasticity of neural connections. In undifferentiated PC12 pheochromocytoma cells, GAP-43 mRNA degrades rapidly (t1/2 = 5 h), but becomes stable when cells are treated with nerve growth factor. To identify trans-acting factors that may influence mRNA stability, we combined column chromatography and gel mobility shift assays to isolate GAP-43 mRNA binding proteins from neonatal bovine brain tissue. This resulted in the isolation of two proteins that bind specifically and competitively to a pyrimidine-rich sequence in the 3'-untranslated region of GAP-43 mRNA. Partial amino acid sequencing revealed that one of the RNA binding proteins coincides with FBP (far upstream element binding protein), previously characterized as a protein that resembles hnRNP K and which binds to a single-stranded, pyrimidine-rich DNA sequence upstream of the c-myc gene to activate its expression. The other binding protein shares sequence homology with PTB, a polypyrimidine tract binding protein implicated in RNA splicing and regulation of translation initiation. The two proteins bind to a 26 nt pyrimidine-rich sequence lying 300 nt downstream of the end of the coding region, in an area shown by others to confer instability on a reporter mRNA in transient transfection assays. We therefore propose that FBP and the PTB-like protein may compete for binding at the same site to influence the stability of GAP-43 mRNA. PMID:9092640

  15. Use of S1 nuclease in deep sequencing for detection of double-stranded RNA viruses.

    PubMed

    Shimada, Saya; Nagai, Makoto; Moriyama, Hiromitsu; Fukuhara, Toshiyuki; Koyama, Satoshi; Omatsu, Tsutomu; Furuya, Tetsuya; Shirai, Junsuke; Mizutani, Tetsuya

    2015-09-01

    The metagenomic approach using next-generation DNA sequencing has facilitated the detection of many pathogenic viruses from fecal samples. However, in many cases, the majority of the detected sequences originate from the host genome and bacterial flora in the gut. Here, to improve the efficiency of detecting double-stranded (ds) RNA viruses in samples, we evaluated the applicability of S1 nuclease to deep sequencing. Treating total RNA with S1 nuclease resulted in 1.5-28.4- and 10.1-208.9-fold increases in sequence reads of group A rotavirus in fecal and viral culture samples, respectively. Moreover, increasing coverage of mapping to reference sequences allowed for sufficient genotyping using analytical software. These results suggest that library construction using S1 nuclease is useful for deep sequencing in the detection of dsRNA viruses. PMID:25843154

  16. Deletion analysis of the 5' untranslated leader sequence of tobacco mosaic virus RNA.

    PubMed

    Takamatsu, N; Watanabe, Y; Iwasaki, T; Shiba, T; Meshi, T; Okada, Y

    1991-03-01

    To determine the sequences essential for viral multiplication in the 5' untranslated leader sequence of tobacco mosaic virus RNA, mutant TMV-L (a tomato strain) RNAs which carry several deletions in this 71-nucleotide sequence were constructed by an in vitro transcription system and their multiplication was analyzed by introducing mutant RNA into tobacco protoplasts by electroporation. Large deletions of the sequence from nucleotides 9 to 47 or 25 to 71 abolished viral multiplication; when about 10-nucleotide deletions were introduced throughout this 5' leader sequence, only deletion of the sequence from nucleotides 2 to 8 abolished detectable viral multiplication. This mutant RNA, however, directed the synthesis of the 130,000-molecular-weight protein in a rabbit reticulocyte lysate in vitro translation system, and consequently this 5'-proximal portion appears likely to be essential for replication. PMID:1995954

  17. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.

    PubMed

    Ferreira, Pedro G; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R; Rivas, Manuel A; Esteve-Codina, Anna; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  18. RNA editing generates cellular subsets with diverse sequence within populations.

    PubMed

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J; Rayon-Estrada, Violeta; Papavasiliou, F Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  19. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  20. Analysis of a marine picoplankton community by 16S rRNA gene cloning and sequencing.

    PubMed Central

    Schmidt, T M; DeLong, E F; Pace, N R

    1991-01-01

    The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2 × 10^4 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the gamma subdivision of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the alpha subdivision and the other distinct from known proteobacterial subdivisions. The rRNA sequences of the alpha-related group are nearly identical to those of some Sargasso Sea picoplankton, suggesting a global distribution of these organisms. PMID:2066334

  1. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud.

    PubMed

    Griffith, Malachi; Walker, Jason R; Spies, Nicholas C; Ainscough, Benjamin J; Griffith, Obi L

    2015-08-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053

  2. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

    PubMed Central

    Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.

    2015-01-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053

  3. Phylogenetic analysis of oryx species using partial sequences of mitochondrial rRNA genes.

    PubMed

    Khan, H A; Arif, I A; Al Farhan, A H; Al Homaidan, A A

    2008-01-01

    We conducted a comparative evaluation of 12S rRNA and 16S rRNA genes of the mitochondrial genome for molecular differentiation among three oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) with respect to two closely related outgroups, addax and roan. Our findings showed the failure of 12S rRNA gene to differentiate between the genus Oryx and addax, whereas a 342-bp partial sequence of 16S rRNA accurately grouped all five taxa studied, suggesting the utility of 16S rRNA segment for molecular phylogeny of oryx at the genus and possibly species levels. PMID:19048493

  4. Chromosomal localization and sequence variation of 5S rRNA gene in five Capsicum species.

    PubMed

    Park, Y K; Park, K C; Park, C H; Kim, N S

    2000-02-29

    Chromosomal localization and sequence analysis of the 5S rRNA gene were carried out in five Capsicum species. Fluorescence in situ hybridization revealed that the chromosomal location of the 5S rRNA gene was conserved in a single locus on a chromosome that was assigned to chromosome 1 by its synteny relationship with tomato. In sequence analysis, the repeating units of the 5S rRNA genes in the Capsicum species varied in size from 278 bp to 300 bp. When our sequences were compared with published results for other Solanaceae plants, the coding region was highly conserved, but the spacer regions varied in size and sequence. T stretch regions, just after the end of the coding sequences, were more prominent in the Capsicum species than in two other plants. Highly GC-rich regions, which might have functions similar to those of the GC islands in genes transcribed by RNA PolII, were observed after the T stretch region. Although we could not observe TATA-like sequences, an AT-rich segment at -27 to -18 was detected in the 5S rRNA genes of the Capsicum species. Species relationships among the Capsicum species were also studied by sequence comparison of the 5S rRNA genes. While C. chinense, C. frutescens, and C. annuum formed one lineage, C. baccatum was revealed to be an intermediate species between the former three species and C. pubescens. PMID:10774742

  5. New perspectives on the diversification of the RNA interference system: insights from comparative genomics and small RNA sequencing

    PubMed Central

    Burroughs, Alexander Maxwell; Ando, Yoshinari; Aravind, L

    2014-01-01

    Our understanding of the pervasive involvement of small RNAs in regulating diverse biological processes has been greatly augmented by recent application of deep-sequencing technologies to small RNA across diverse eukaryotes. We review the currently-known small RNA classes and place them in context of the reconstructed evolutionary history of the RNAi protein machinery. This synthesis indicates the earliest versions of eukaryotic RNAi systems likely utilized small RNA processed from three types of precursors: 1) sense-antisense transcriptional products, 2) genome-encoded, imperfectly-complementary hairpin sequences, and 3) larger non-coding RNA precursor sequences. Structural dissection of PIWI proteins along with recent discovery of novel families (including Med13 of the Mediator complex) suggest that emergence of a distinct architecture with the N-terminal domains (also occurring separately fused to endoDNases in prokaryotes) formed via duplication of an ancestral unit was key to their recruitment as primary RNAi effectors and use of small RNAs of certain preferred lengths. Prokaryotic PIWI proteins are typically components of several RNA-directed DNA restriction or CRISPR/Cas systems. However, eukaryotic versions appear to have emerged from a subset that evolved RNA-directed RNA interference. They were recruited alongside RNaseIII domains and RdRP domains, also from prokaryotic systems, to form the core eukaryotic RNAi system. Like certain regulatory systems, RNAi diversified into two distinct but linked arms concomitant with eukaryotic nucleo-cytoplasmic compartmentalization. Subsequent elaboration of RNAi proceeded via diversification of the core protein machinery through lineage-specific expansions and recruitment of new components from prokaryotes (nucleases and small RNA-modifying enzymes), allowing for diversification of associating small RNAs. PMID:24311560

  6. Sequence analysis of the 3' non-coding region of mouse immunoglobulin light chain messenger RNA.

    PubMed Central

    Hamlyn, P H; Gillam, S; Smith, M; Milstein, C

    1977-01-01

    Using an oligonucleotide d(pT10-C-A) as primer, cDNA has been transcribed from the 3' non-coding region of mouse immunoglobulin light chain mRNA and sequenced by a modification (1) of the 'plus-minus' gel method (2). The sequence obtained has partially corrected and extended a previously obtained sequence (3). The new data contain an unusual sequence in which a trinucleotide is repeated seven times. PMID:405661

  7. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

    PubMed Central

    Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

    2014-01-01

    RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209

  8. Empirical analysis of RNA robustness and evolution using high-throughput sequencing of ribozyme reactions.

    PubMed

    Hayden, Eric J

    2016-08-15

    RNA molecules provide a realistic but tractable model of a genotype to phenotype relationship. This relationship has been extensively investigated computationally using secondary structure prediction algorithms. Enzymatic RNA molecules, or ribozymes, offer access to genotypic and phenotypic information in the laboratory. Advancements in high-throughput sequencing technologies have enabled the analysis of sequences in the lab that now rivals what can be accomplished computationally. This has motivated a resurgence of in vitro selection experiments and opened new doors for the analysis of the distribution of RNA functions in genotype space. A body of computational experiments has investigated the persistence of specific RNA structures despite changes in the primary sequence, and how this mutational robustness can promote adaptations. This article summarizes recent approaches that were designed to investigate the role of mutational robustness during the evolution of RNA molecules in the laboratory, and presents theoretical motivations, experimental methods and approaches to data analysis. PMID:27215494

  9. RNA sequencing using fluorescent-labeled dideoxynucleotides and automated fluorescence detection.

    PubMed Central

    Bauer, G J

    1990-01-01

    Although dideoxy-terminated sequencing of RNA, using reverse transcriptase and oligodeoxynucleotide primers, is now a well established method, its accuracy is limited by sequence ambiguities due to unspecific chain termination events. A protocol is described which circumvents these ambiguities by using fluorescence labels tagged to dideoxynucleotides. Only chain terminations caused by dideoxynucleotides are detected, while prematurely terminated cDNAs remain undetectable. In addition, the remaining multiple signals at nucleotide positions can be assigned to sequence heterogeneities within the RNA sequence to be determined. PMID:1690393

  10. RNA sequence and transcriptional properties of the 3' end of the Newcastle disease virus genome

    SciTech Connect

    Kurilla, M.G.; Stone, H.O.; Keene, J.D.

    1985-09-01

    The 3' end of the genomic RNA of Newcastle disease virus (NDV) has been sequenced and the leader RNA defined. Using hybridization to a 3'-end-labeled genome, leader RNA species from in vitro transcription reactions and from infected cell extracts were found to be 47 and 53 nucleotides long. In addition, the start site of the 3'-proximal mRNA was determined by sequence analysis of in vitro (beta-32P)GTP-labeled transcription products. The genomic sequence extending beyond the leader region demonstrated an open reading frame for at least 42 amino acids and probably represents the amino terminus of the nucleocapsid protein (NP). The terminal 8 nucleotides of the NDV genome were identical to those of measles virus and Sendai virus while the sequence of the distal half of the leader region was more similar to that of vesicular stomatitis virus. These data argue for strong evolutionary relatedness between the paramyxovirus and rhabdovirus groups.

  11. Large-scale sequencing and the natural history of model human RNA viruses

    PubMed Central

    Dugan, Vivien G; Saira, Kazima; Ghedin, Elodie

    2012-01-01

    RNA virus exploration within the field of medical virology has greatly benefited from technological developments in genomics, deepening our understanding of viral dynamics and emergence. Large-scale first-generation technology sequencing projects have expedited molecular epidemiology studies at an unprecedented scale for two pathogenic RNA viruses chosen as models: influenza A virus and dengue. Next-generation sequencing approaches are now leading to a more in-depth analysis of virus genetic diversity, which is greater for RNA than DNA viruses because of high replication rates and the absence of proofreading activity of the RNA-dependent RNA polymerase. In the field of virus discovery, technological advancements and metagenomic approaches are expanding the catalogs of novel viruses by facilitating our probing into the RNA virus world. PMID:23682295

  12. Small RNA and RNA-IP Sequencing Identifies and Validates Novel MicroRNAs in Human Mesenchymal Stem Cells.

    PubMed

    Tsai, Chin-Han; Liao, Ko-Hsun; Shih, Chuan-Chi; Chan, Chia-Hao; Hsieh, Jui-Yu; Tsai, Cheng-Fong; Wang, Hsei-Wei; Chang, Shing-Jyh

    2016-03-01

    Organ regeneration therapies using multipotent mesenchymal stem cells (MSCs) are currently being investigated for a variety of common complex diseases. Understanding the molecular regulation of MSC biology will benefit regenerative medicine. MicroRNAs (miRNAs) act as regulators in MSC stemness. There are approximately 2500 currently known human miRNAs that have been recorded in the miRBase v21 database. In the present study, we identified novel microRNAs involved in MSC stemness and differentiation by obtaining the global microRNA expression profiles (miRNomes) of MSCs from two anatomical locations, bone marrow (BM-MSCs) and umbilical cord Wharton's jelly (WJ-MSCs), and from osteogenically and adipogenically differentiated progenies of BM-MSCs. Small RNA sequencing (smRNA-seq) and bioinformatics analyses predicted that 49 uncharacterized miRNA candidates had high cellular expression values in MSCs. Another independent batch of Ago1/2-based RNA immunoprecipitation (RNA-IP) sequencing datasets validated the existence of 40 unreported miRNAs in cells and their associations with the RNA-induced silencing complex (RISC). Nine of these 40 new miRNAs were universally overexpressed in both MSC types; nine others were overexpressed in differentiated cells. A novel miRNA (UNI-118-3p) was specifically expressed in BM-MSCs, as verified using RT-qPCR. Taken together, this report offers comprehensive miRNome profiles for two MSC types, as well as cells differentiated from BM-MSCs. MSC transplantation has the potential to ameliorate degenerative disorders and repair damaged tissues. Interventions involving the above 40 new microRNA members in transplanted MSCs may potentially guide future clinical applications. PMID:26910904

  13. The nucleotide sequence at the 5' end of foot and mouth disease virus RNA.

    PubMed Central

    Harris, T J

    1979-01-01

    Foot and mouth disease virus RNA has been treated with RNase H in the presence of oligo (dG) specifically to digest the poly(C) tract which lies near the 5' end of the molecule (10). The short (S) fragment containing the 5' end of the RNA was separated from the remainder of the RNA (L fragment) by gel electrophoresis. RNA ligase mediated labelling of the 3' end of S fragment showed that the RNase H digestion gave rise to molecules that differed only in the number of cytidylic acid residues remaining at their 3' ends and did not leave the unique 3' end necessary for fast sequence analysis. As the 5' end of S fragment prepared from virus RNA is blocked by VPg, S fragment was prepared from virus specific messenger RNA which does not contain this protein. This RNA was labelled at the 5' end using polynucleotide kinase and the sequence of 70 nucleotides at the 5' end determined by partial enzyme digestion sequencing on polyacrylamide gels. Some of this sequence was confirmed from an analysis of the oligonucleotides derived by RNase T1 digestion of S fragment. The sequence obtained indicates that there is a stable hairpin loop at the 5' terminus of the RNA before an initiation codon 33 nucleotides from the 5' end. In addition, the RNase T1 analysis suggests that there are short repeated sequences in S fragment and that an eleven nucleotide inverted complementary repeat of a sequence near the 3' end of the RNA is present at the junction of S fragment and the poly(C) tract. PMID:231762

  14. Comprehensive analysis of human small RNA sequencing data provides insights into expression profiles and miRNA editing

    PubMed Central

    Gong, Jing; Wu, Yuliang; Zhang, Xiantong; Liao, Yifang; Sibanda, Vusumuzi Leroy; Liu, Wei; Guo, An-Yuan

    2014-01-01

    MicroRNAs (miRNAs) play key regulatory roles in various biological processes and diseases. A comprehensive analysis of large scale small RNA sequencing data (smRNA-seq) will be very helpful to explore tissue or disease specific miRNA markers and uncover miRNA variants. Here, we systematically analyzed 410 human smRNA-seq datasets, whose samples are from 24 tissue/disease/cell lines. We tested the mapping strategies and found that it was necessary to make multiple-round mappings with different mismatch parameters. miRNA expression profiles revealed that on average ∼70% of known miRNAs were expressed at low level or not expressed (RPM < 1) in a sample and only ∼9% of known miRNAs were relatively highly expressed (RPM > 100). About 30% of known miRNAs were not expressed in any of the samples we used. The miRNA expression profiles were compiled into an online database (HMED, http://bioinfo.life.hust.edu.cn/smallRNA/). Dozens of tissue/disease specific miRNAs, disease/control dysregulated miRNAs and miRNAs with arm switching events were discovered. Further, we identified some highly confident editing sites including 24 A-to-I sites and 23 C-to-U sites. About half of them were widespread miRNA editing sites in different tissues. We found that the 2 types of editing sites have different features with regard to location, editing level and frequency. Our analyses of expression profiles, specific miRNA markers, arm switching, and editing sites may provide valuable information for further studies of miRNA function and biomarker finding. PMID:25692236
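The RPM thresholds quoted in this abstract (RPM < 1 for low or absent expression, RPM > 100 for relatively high expression) can be made concrete with a small sketch. It is not the authors' pipeline: it assumes RPM means reads per million of the total reads counted in a sample, and the function names, class labels, and example counts are hypothetical.

```python
def reads_per_million(counts):
    """Scale raw read counts to reads per million (RPM) of the sample total.

    Assumption: the denominator is the sum of all counts passed in; the
    abstract does not state which read total the authors normalized by.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def expression_class(rpm, low=1.0, high=100.0):
    """Bin an RPM value using the two cut-offs mentioned in the abstract."""
    if rpm < low:
        return "low_or_absent"
    if rpm > high:
        return "high"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical read counts for one sample (2,000,000 reads in total).
    sample = {"miR-a": 1, "miR-b": 100, "miR-c": 300, "miR-d": 1_999_599}
    for name, rpm in reads_per_million(sample).items():
        print(name, rpm, expression_class(rpm))
```

Normalizing by the sample total makes the two cut-offs comparable across libraries of different sequencing depth, which is what a per-sample statement like "∼70% of known miRNAs at RPM < 1" relies on.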

  15. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    PubMed

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  16. Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

    PubMed Central

    Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

    2016-01-01

    Because RNA-protein interactions play a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240

  17. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.

    1995-04-11

    A method is described for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand. 4 figures.

  18. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.

    1995-01-01

    Method for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand.

  19. ARM-Seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments

    PubMed Central

    Cozen, Aaron E.; Quartley, Erin; Holmes, Andrew D.; Robinson, Eva H.; Phizicky, Eric M.; Lowe, Todd M.

    2015-01-01

    High throughput RNA sequencing has accelerated discovery of the complex regulatory roles of small RNAs, but RNAs containing modified nucleosides may escape detection when those modifications interfere with reverse transcription during RNA-seq library preparation. Here we describe AlkB-facilitated RNA Methylation sequencing (ARM-Seq) which uses pre-treatment with Escherichia coli AlkB to demethylate 1-methyladenosine, 3-methylcytidine, and 1-methylguanosine, all commonly found in transfer RNAs. Comparative methylation analysis using ARM-Seq provides the first detailed, transcriptome-scale map of these modifications, and reveals an abundance of previously undetected, methylated small RNAs derived from tRNAs. ARM-Seq demonstrates that tRNA-derived small RNAs accurately recapitulate the m1A modification state for well-characterized yeast tRNAs, and generates new predictions for a large number of human tRNAs, including tRNA precursors and mitochondrial tRNAs. Thus, ARM-Seq provides broad utility for identifying previously overlooked methyl-modified RNAs, can efficiently monitor methylation state, and may reveal new roles for tRNA-derived RNAs as biomarkers or signaling molecules. PMID:26237225

  20. tRNAfeature: An algorithm for tRNA features to identify tRNA genes in DNA sequences.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh

    2016-09-01

    The identification of transfer RNAs (tRNAs) is critical for a detailed understanding of the evolution of biological organisms and viruses. However, some tRNAs are difficult to recognize due to their unusual sub-structures and may result in the detection of the wrong anticodon. Therefore, the detection of unusual sub-structures of tRNA genes remains an important challenge. In this study, we propose a method to identify tRNA genes based on tRNA features. tRNAfeature attempts to refold the sequence with single-stranded regions longer than those found in the canonical and conventional structural models for tRNA. We predicted a set of 53926 archaeal, eubacterial and eukaryotic tRNA genes annotated in tRNADB-CE and scanned the tRNA genes in whole genome sequencing. The results indicate that tRNAfeature is more powerful than other existing methods for identifying tRNAs. PMID:27291467

  1. Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers

    PubMed Central

    Liu, Zongzhi; DeSantis, Todd Z.; Andersen, Gary L.; Knight, Rob

    2008-01-01

    The recent introduction of massively parallel pyrosequencers allows rapid, inexpensive analysis of microbial community composition using 16S ribosomal RNA (rRNA) sequences. However, a major challenge is to design a workflow so that taxonomic information can be accurately and rapidly assigned to each read, so that the composition of each community can be linked back to likely ecological roles played by members of each species, genus, family or phylum. Here, we use three large 16S rRNA datasets to test whether taxonomic information based on the full-length sequences can be recaptured by short reads that simulate the pyrosequencer outputs. We find that different taxonomic assignment methods vary radically in their ability to recapture the taxonomic information in full-length 16S rRNA sequences: most methods are sensitive to the region of the 16S rRNA gene that is targeted for sequencing, but many combinations of methods and rRNA regions produce consistent and accurate results. To process large datasets of partial 16S rRNA sequences obtained from surveys of various microbial communities, including those from human body habitats, we recommend the use of Greengenes or RDP classifier with fragments of at least 250 bases, starting from one of the primers R357, R534, R798, F343 or F517. PMID:18723574

  2. Species Identification and Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences

    PubMed Central

    Lay, Christophe; Ho, Eliza Xin Pei; Low, Louie; Hibberd, Martin Lloyd; Nagarajan, Niranjan

    2013-01-01

    The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to amplify more than 90% of sequences in the Greengenes database and with the ability to distinguish nearly twice as many species-level OTUs compared to existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the constituents of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization. PMID:23579286

  3. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-01

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. PMID:27545715
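
    The idea of reading projections from barcode counts can be sketched as a simple tally. The area names, barcode strings, and the `projection_matrix` helper are hypothetical illustrations; the published MAPseq analysis involves additional steps (for example error correction and normalization) that are not shown here.

```python
from collections import Counter, defaultdict


def projection_matrix(reads):
    """Tally (area, barcode) read pairs into {barcode: {area: count}}."""
    matrix = defaultdict(Counter)
    for area, barcode in reads:
        matrix[barcode][area] += 1
    return {barcode: dict(counts) for barcode, counts in matrix.items()}


def preferred_target(area_counts):
    """Return the dissected area with the most reads for one barcode."""
    return max(area_counts, key=area_counts.get)


# Hypothetical sequenced reads: each pair is (dissected area, barcode sequence).
reads = [
    ("cortex_1", "ACGTAC"), ("cortex_1", "ACGTAC"), ("cortex_2", "ACGTAC"),
    ("cortex_2", "TTGACA"), ("cortex_2", "TTGACA"), ("cortex_2", "TTGACA"),
]
matrix = projection_matrix(reads)
targets = {barcode: preferred_target(counts) for barcode, counts in matrix.items()}
```

    Each barcode stands in for one labeled neuron, so a row of the matrix is a rough per-neuron projection profile across the dissected areas.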

  4. Research Techniques Made Simple: Bacterial 16S Ribosomal RNA Gene Sequencing in Cutaneous Research.

    PubMed

    Jo, Jay-Hyun; Kennedy, Elizabeth A; Kong, Heidi H

    2016-03-01

    Skin serves as a protective barrier and also harbors numerous microorganisms collectively comprising the skin microbiome. As a result of recent advances in sequencing (next-generation sequencing), our understanding of microbial communities on skin has advanced substantially. In particular, the 16S ribosomal RNA gene sequencing technique has played an important role in efforts to identify the global communities of bacteria in healthy individuals and patients with various disorders in multiple topographical regions over the skin surface. Here, we describe basic principles, study design, and a workflow of 16S ribosomal RNA gene sequencing methodology, primarily for investigators who are not familiar with this approach. This article will also discuss some applications and challenges of 16S ribosomal RNA sequencing as well as directions for future development. PMID:26902128

  5. Sequence heterogeneity in the two 16S rRNA genes of Phormium yellow leaf phytoplasma.

    PubMed Central

    Liefting, L W; Andersen, M T; Beever, R E; Gardner, R C; Forster, R L

    1996-01-01

    Phormium yellow leaf (PYL) phytoplasma causes a lethal disease of the monocotyledon, New Zealand flax (Phormium tenax). The 16S rRNA genes of PYL phytoplasma were amplified from infected flax by PCR and cloned, and the nucleotide sequences were determined. DNA sequencing and Southern hybridization analysis of genomic DNA indicated the presence of two copies of the 16S rRNA gene. The two 16S rRNA genes exhibited sequence heterogeneity in 4 nucleotide positions and could be distinguished by the restriction enzymes BpmI and BsrI. This is the first record in which sequence heterogeneity in the 16S rRNA genes of a phytoplasma has been determined by sequence analysis. A phylogenetic tree based on 16S rRNA gene sequences showed that PYL phytoplasma is most closely related to the stolbur and German grapevine yellows phytoplasmas, which form the stolbur subgroup of the aster yellows group. This phylogenetic position of PYL phytoplasma was supported by 16S/23S spacer region sequence data. PMID:8795200
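
    Locating the positions at which two aligned gene copies differ can be sketched as follows. The two short sequences are hypothetical toy data, not the PYL phytoplasma 16S rRNA genes, and the sketch assumes the copies are already aligned and of equal length.

```python
def differing_positions(seq_a, seq_b):
    """Return 1-based positions where two equal-length, aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]


# Hypothetical aligned fragments standing in for two copies of a gene.
copy_one = "ACGTACGTAC"
copy_two = "ACGAACGTTC"
heterogeneous_sites = differing_positions(copy_one, copy_two)
```

    A difference that creates or removes a restriction site is what allows copies to be told apart by digestion, as described for BpmI and BsrI above.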

  6. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  7. A method for accurate determination of terminal sequences of viral genomic RNA.

    PubMed

    Weng, Z; Xiong, Z

    1995-09-01

    A combination of ligation-anchored PCR and anchored cDNA cloning techniques was used to clone the termini of the saguaro cactus virus (SCV) RNA genome. The terminal sequences of the viral genome were subsequently determined from the clones. The 5' terminus was cloned by ligation-anchored PCR, whereas the 3' terminus was obtained by a technique we term anchored cDNA cloning. In anchored cDNA cloning, an anchor oligonucleotide was prepared by phosphorylation at the 5' end, followed by addition of a dideoxynucleotide at the 3' end to block the free hydroxyl group. The 5' end of the anchor was subsequently ligated to the 3' end of SCV RNA. The anchor-ligated, chimeric viral RNA was then reverse-transcribed into cDNA using a primer complementary to the anchor. The cDNA containing the complete 3'-terminal sequence was converted into ds-cDNA, cloned, and sequenced. Two restriction sites, one within the viral sequence and one within the primer sequence, were used to facilitate cloning. The combination of these techniques proved to be an easy and accurate way to determine the terminal sequences of the SCV RNA genome and should be applicable to any other RNA molecules with unknown terminal sequences. PMID:9132274

  8. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  9. A 5'-proximal RNA sequence of murine coronavirus as a potential initiation site for genomic-length mRNA transcription.

    PubMed Central

    Zhang, X; Lai, M M

    1996-01-01

    Coronavirus transcription is a discontinuous process, involving interactions between a trans-acting leader and the intergenic transcription initiation sequences. A 9-nucleotide (nt) sequence (UUUAUAAAC), which is located immediately downstream of the leader at the 5' terminus of the mouse hepatitis virus (MHV) genomic RNA, contains a sequence resembling the consensus intergenic sequence (UCUAAAC). It has been shown previously that the presence of the 9-nt sequence facilitates leader RNA switching and may enhance subgenomic mRNA transcription. It is unclear how the 9-nt sequence exerts these functions. In this study, we inserted the 9-nt sequence into a defective interfering (DI) RNA reporter system and demonstrated that mRNA transcription could be initiated from the 9-nt sequence almost as efficiently as from the intergenic sequence between genes 6 and 7. Sequence analysis of the mRNAs showed that the 9-nt sequence served as a site of fusion between the leaders and mRNA. The transcription initiation function of the 9-nt sequence could not be substituted by other 5'-terminal sequences. When the entire 5'-terminal sequence, including four copies of the UCUAA sequence plus the 9-nt sequence, was present, transcription could be initiated from any of the UCUAA copies or the 9-nt sequence, resulting in different copy numbers of the UCUAA sequence and the deletion of the 9-nt sequence in some mRNAs. All of these heterogeneous RNA species were also detected from the 5'-terminal region of the viral genomic-length RNA in MHV-infected cells. These results thus suggest that the heterogeneity of the copy number of UCUAA sequences at the 5' end, the deletion of the 9-nt sequence in viral and DI RNAs, and the leader RNA switching are the results of transcriptional initiation from the 9-nt site. They also show that an mRNA species (mRNA 1) that lacks the 9-nt sequence can be synthesized during MHV infection. Therefore, MHV genomic RNA replication and mRNA 1 transcription may be
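
    The sequence elements named in this abstract (UCUAA repeats, the 9-nt sequence UUUAUAAAC, and the consensus UCUAAAC) lend themselves to a short motif-scanning sketch. The toy RNA below is a hypothetical construct built only from those elements plus arbitrary flanking bases; it is not the MHV 5' terminus.

```python
def find_motif(rna, motif):
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


def hamming(a, b):
    """Count mismatches between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_window(seq, consensus):
    """Return (offset, mismatches) for the window of seq closest to consensus."""
    k = len(consensus)
    scored = [(hamming(seq[i:i + k], consensus), i) for i in range(len(seq) - k + 1)]
    mismatches, offset = min(scored)
    return offset, mismatches


NINE_NT = "UUUAUAAAC"    # 9-nt sequence quoted in the abstract
CONSENSUS = "UCUAAAC"    # consensus intergenic sequence quoted in the abstract

# Hypothetical toy RNA: arbitrary flanks around four UCUAA copies and the 9-nt sequence.
toy_rna = "GG" + "UCUAA" * 4 + NINE_NT + "GA"
ucuaa_sites = find_motif(toy_rna, "UCUAA")
nine_nt_sites = find_motif(toy_rna, NINE_NT)
offset, mismatches = best_window(NINE_NT, CONSENSUS)
```

    In this sketch the closest window of the 9-nt sequence differs from the consensus at a single position, which is one simple way to quantify "resembling".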

  10. Adenovirus type 12-specific RNA sequences during productive infection of KB cells.

    PubMed Central

    Smiley, J R; Mak, S

    1976-01-01

    The complementary strands of adenovirus type 12 DNA were separated, and virus-specific RNA was analyzed by saturation hybridization in solution. Late during infection whole cell RNA hybridized to 75% of the light (l) strand and 15% of the heavy (h) strand, whereas cytoplasmic RNA hybridized to 65% of the l strand and 15% of the h strand. Late nuclear RNA hybridized to about 90% of the l strand and at least 36% of the h strand. Double-stranded RNA was isolated from infected cells late after infection, which annealed to greater than 30% of each of the two complementary DNA strands. Early whole cell RNA hybridized to 45 to 50% of the l strand and 15% of the h strand, whereas early cytoplasmic RNA hybridized to about 15% of each of the complementary strands. All early cytoplasmic sequences were present in the cytoplasm at late times. PMID:950688

  11. Complete sequence and gene organization of the Nosema heliothidis ribosomal RNA gene region.

    PubMed

    Dong, Shinan; Shen, Zhongyuan; Zhu, Feng; Tang, Xudong; Xu, Li

    2011-01-01

    By sequencing the entire ribosomal RNA (rRNA) gene region of Nosema heliothidis isolated from cotton bollworm (Helicoverpa armigera), we showed that its gene organization is similar to the type species, Nosema bombycis: the 5'-large subunit rRNA (2,490 bp)-internal transcribed spacer (192 bp)-small subunit rRNA (1,232 bp)-intergenic spacer (274 bp)-5S rRNA (115 bp)-3'. We constructed two phylogenetic trees, analyzed phylogenetic relationships, examined rRNA organization of microsporidia, and compared the secondary structure of small subunit rRNA with closely related microsporidia. The latter two features may provide important information for the classification and phylogenetic analysis of microsporidia. PMID:21895841
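
    The segment lengths listed above can be turned into coordinates with simple arithmetic. The sketch assumes the five segments are contiguous in the stated 5'-to-3' order with no extra flanking sequence, which the abstract does not state explicitly; the segment labels are shorthand.

```python
# Segment lengths (bp) in 5'-to-3' order, as listed in the abstract.
SEGMENTS = [
    ("LSU rRNA", 2490),
    ("ITS", 192),
    ("SSU rRNA", 1232),
    ("IGS", 274),
    ("5S rRNA", 115),
]


def layout(segments):
    """Assign 1-based inclusive (start, end) coordinates to contiguous segments."""
    coords = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1
    return coords


coords = layout(SEGMENTS)
total_length = coords[-1][2]  # end coordinate of the last segment
```

    Under that contiguity assumption the listed segments sum to 4,303 bp.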

  12. Excess of Yra1 RNA-Binding Factor Causes Transcription-Dependent Genome Instability, Replication Impairment and Telomere Shortening

    PubMed Central

    Gavaldá, Sandra; Santos-Pereira, José M.; García-Rubio, María L.; Luna, Rosa; Aguilera, Andrés

    2016-01-01

    Yra1 is an essential nuclear factor of the evolutionarily conserved family of hnRNP-like export factors that when overexpressed impairs mRNA export and cell growth. To investigate further the relevance of proper Yra1 stoichiometry in the cell, we overexpressed Yra1 by transforming yeast cells with YRA1 intron-less constructs and analyzed its effect on gene expression and genome integrity. We found that YRA1 overexpression induces DNA damage and leads to a transcription-associated hyperrecombination phenotype that is mediated by RNA:DNA hybrids. In addition, it confers a genome-wide replication retardation as seen by reduced BrdU incorporation and accumulation of the Rrm3 helicase. In addition, YRA1 overexpression causes a cell senescence-like phenotype and telomere shortening. ChIP-chip analysis shows that overexpressed Yra1 is loaded to transcribed chromatin along the genome and to Y’ telomeric regions, where Rrm3 is also accumulated, suggesting an impairment of telomere replication. Our work not only demonstrates that a proper stoichiometry of the Yra1 mRNA binding and export factor is required to maintain genome integrity and telomere homeostasis, but suggests that the cellular imbalance between transcribed RNA and specific RNA-binding factors may become a major cause of genome instability mediated by co-transcriptional replication impairment. PMID:27035147

  13. Characterization and phylogenetic relationships among microsporidia infecting silkworm, Bombyx mori, using inter simple sequence repeat (ISSR) and small subunit rRNA (SSU-rRNA) sequence analysis.

    PubMed

    Rao, S Nageswara; Nath, B Surendra; Saratchandra, B

    2005-06-01

    This study is the first report on the genetic characterization and relationships among different microsporidia infecting the silkworm, Bombyx mori, using inter simple sequence repeat PCR (ISSR-PCR) analysis. Six different microsporidians were distinguished through molecular DNA typing using ISSR-PCR. Thus, ISSR-PCR analysis can be a powerful tool to detect polymorphisms and identify microsporidians, which are difficult to study with microscopy because of their extremely small size. Of the 100 ISSR primers tested, only 28 primers had reproducibility and high polymorphism (93%). A total of 24 ISSR primers produced 55 unique genetic markers, which could be used to differentiate the microsporidians from each other. Among the 28 SSRs tested, the most abundant were (CA)n, (GA)n, and (GT)n repeats. The degree of band sharing was used to evaluate genetic similarity between different microsporidian isolates and to construct a phylogenetic tree using Jaccard's similarity coefficient. The results indicate that the DNA profiles based on ISSR markers can be used as diagnostic tools to identify different microsporidia with considerable accuracy. In addition, the small subunit ribosomal RNA (SSU-rRNA) sequence gene was amplified, cloned, and sequenced from each of the 6 microsporidian isolates. These sequences were compared with 20 other microsporidian SSU-rRNA sequences to develop a phylogenetic tree for the microsporidia isolated from the silkworms. This method was found to be useful in establishing the phylogenetic relationships among the different microsporidians isolated from silkworms. Of the 6 microsporidian isolates, NIK-1s revealed an SSU-rRNA gene sequence similar to Nosema bombycis, indicating that NIK-1s is similar to N. bombycis; the remaining 5 isolates, which differed from each other and from N. bombycis, were considered to be different variants belonging to the species N. bombycis. PMID:16121233
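
    Band sharing scored with Jaccard's similarity coefficient can be sketched as a set calculation: shared bands divided by all bands seen in either isolate. The band labels and the two isolates below are hypothetical, and the tree-building step mentioned in the abstract is not shown.

```python
def jaccard(bands_a, bands_b):
    """Jaccard similarity: shared bands divided by bands present in either profile."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        raise ValueError("both band profiles are empty")
    return len(a & b) / len(union)


# Hypothetical ISSR band profiles, recorded as the set of bands present in each isolate.
isolate_1 = {"b1", "b2", "b3", "b4"}
isolate_2 = {"b2", "b3", "b4", "b5", "b6"}
similarity = jaccard(isolate_1, isolate_2)  # 3 shared bands out of 6 in total
```

    Computing this for every pair of isolates gives a similarity matrix, and one minus each value is a distance that a clustering method could use.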

  14. Molecular Diagnosis of Actinomadura madurae Infection by 16S rRNA Deep Sequencing

    PubMed Central

    SenGupta, Dhruba J.; Hoogestraat, Daniel R.; Cummings, Lisa A.; Bryant, Bronwyn H.; Natividad, Catherine; Thielges, Stephanie; Monsaas, Peter W.; Chau, Mimosa; Barbee, Lindley A.; Rosenthal, Christopher; Cookson, Brad T.; Hoffman, Noah G.

    2013-01-01

    Next-generation DNA sequencing can be used to catalog individual organisms within complex, polymicrobial specimens. Here, we utilized deep sequencing of 16S rRNA to implicate Actinomadura madurae as the cause of mycetoma in a diabetic patient when culture and conventional molecular methods were overwhelmed by overgrowth of other organisms. PMID:24108607

  15. Genome Sequence of Saccharomyces cerevisiae Double-Stranded RNA Virus L-A-28

    PubMed Central

    Konovalovas, Aleksandras

    2016-01-01

    We cloned and sequenced the complete genome of the L-A-28 virus from the Saccharomyces cerevisiae K28 killer strain. This sequence completes the set of currently identified L-A helper viruses required for expression of double-stranded RNA-originated killer phenotypes in baking yeast. PMID:27313294

  16. Genome Sequence of Saccharomyces cerevisiae Double-Stranded RNA Virus L-A-28.

    PubMed

    Konovalovas, Aleksandras; Serviené, Elena; Serva, Saulius

    2016-01-01

    We cloned and sequenced the complete genome of the L-A-28 virus from the Saccharomyces cerevisiae K28 killer strain. This sequence completes the set of currently identified L-A helper viruses required for expression of double-stranded RNA-originated killer phenotypes in baking yeast. PMID:27313294

  17. Differential DNA and RNA sequence discrimination by PNA having charged side chains.

    PubMed

    De Costa, N Tilani S; Heemstra, Jennifer M

    2014-05-15

    PNA sequences modified with charged side chains were evaluated for base-pairing sequence selectivity under physiological conditions. PNA having negatively charged aspartic acid side chains shows higher selectivity with RNA, while PNA having positively charged lysine side chains shows higher selectivity with DNA. These observations provide insight into the binding selectivity of modified PNA in antisense and antigene applications. PMID:24731279

  18. Draft Genome Sequences of Leviviridae RNA Phages EC and MB Recovered from San Francisco Wastewater

    PubMed Central

    DeRisi, Joseph L.

    2015-01-01

    We report here the draft genome sequences of marine RNA phages EC and MB assembled from metagenomic sequencing of organisms in San Francisco wastewater. These phages showed moderate translated amino acid identity to other enterobacteria phages and appear to constitute novel members of the Leviviridae family. PMID:26112785

  19. Draft Genome Sequences of Leviviridae RNA Phages EC and MB Recovered from San Francisco Wastewater.

    PubMed

    Greninger, Alexander L; DeRisi, Joseph L

    2015-01-01

    We report here the draft genome sequences of marine RNA phages EC and MB assembled from metagenomic sequencing of organisms in San Francisco wastewater. These phages showed moderate translated amino acid identity to other enterobacteria phages and appear to constitute novel members of the Leviviridae family. PMID:26112785

  20. Taxonomic Assessment of Rumen Microbiota Using Total RNA and Targeted Amplicon Sequencing Approaches.

    PubMed

    Li, Fuyong; Henderson, Gemma; Sun, Xu; Cox, Faith; Janssen, Peter H; Guan, Le Luo

    2016-01-01

    Taxonomic characterization of active gastrointestinal microbiota is essential to detect shifts in microbial communities and functions under various conditions. This study aimed to identify and quantify potentially active rumen microbiota using total RNA sequencing and to compare the outcomes of this approach with the widely used targeted RNA/DNA amplicon sequencing technique. Total RNA isolated from rumen digesta samples from five beef steers was subjected to Illumina paired-end sequencing (RNA-seq), and bacterial and archaeal amplicons of partial 16S rRNA/rDNA were subjected to 454 pyrosequencing (RNA/DNA Amplicon-seq). Taxonomic assessments of the RNA-seq, RNA Amplicon-seq, and DNA Amplicon-seq datasets were performed using a pipeline developed in house. The detected major microbial phylotypes were common among the three datasets, with seven bacterial phyla, fifteen bacterial families, and five archaeal taxa commonly identified across all datasets. There were also unique microbial taxa detected in each dataset. Elusimicrobia and Verrucomicrobia phyla; Desulfovibrionaceae, Elusimicrobiaceae, and Sphaerochaetaceae families; and Methanobrevibacter woesei were only detected in the RNA-Seq and RNA Amplicon-seq datasets, whereas Streptococcaceae was only detected in the DNA Amplicon-seq dataset. In addition, the relative abundances of four bacterial phyla, eight bacterial families and one archaeal taxon were different among the three datasets. This is the first study to compare the outcomes of rumen microbiota profiling between RNA-seq and RNA/DNA Amplicon-seq datasets. Our results illustrate the differences between these methods in characterizing microbiota both qualitatively and quantitatively for the same sample, and so caution must be exercised when comparing data. PMID:27446027

  1. Taxonomic Assessment of Rumen Microbiota Using Total RNA and Targeted Amplicon Sequencing Approaches

    PubMed Central

    Li, Fuyong; Henderson, Gemma; Sun, Xu; Cox, Faith; Janssen, Peter H.; Guan, Le Luo

    2016-01-01

    Taxonomic characterization of active gastrointestinal microbiota is essential to detect shifts in microbial communities and functions under various conditions. This study aimed to identify and quantify potentially active rumen microbiota using total RNA sequencing and to compare the outcomes of this approach with the widely used targeted RNA/DNA amplicon sequencing technique. Total RNA isolated from rumen digesta samples from five beef steers was subjected to Illumina paired-end sequencing (RNA-seq), and bacterial and archaeal amplicons of partial 16S rRNA/rDNA were subjected to 454 pyrosequencing (RNA/DNA Amplicon-seq). Taxonomic assessments of the RNA-seq, RNA Amplicon-seq, and DNA Amplicon-seq datasets were performed using a pipeline developed in house. The detected major microbial phylotypes were common among the three datasets, with seven bacterial phyla, fifteen bacterial families, and five archaeal taxa commonly identified across all datasets. There were also unique microbial taxa detected in each dataset. Elusimicrobia and Verrucomicrobia phyla; Desulfovibrionaceae, Elusimicrobiaceae, and Sphaerochaetaceae families; and Methanobrevibacter woesei were only detected in the RNA-Seq and RNA Amplicon-seq datasets, whereas Streptococcaceae was only detected in the DNA Amplicon-seq dataset. In addition, the relative abundances of four bacterial phyla, eight bacterial families and one archaeal taxon were different among the three datasets. This is the first study to compare the outcomes of rumen microbiota profiling between RNA-seq and RNA/DNA Amplicon-seq datasets. Our results illustrate the differences between these methods in characterizing microbiota both qualitatively and quantitatively for the same sample, and so caution must be exercised when comparing data. PMID:27446027

  2. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed

    Schnare, M N; Gray, M W

    1982-03-25

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  3. Sequence characterization of 5S ribosomal RNA from eight gram positive procaryotes

    NASA Technical Reports Server (NTRS)

    Woese, C. R.; Luehrsen, K. R.; Pribula, C. D.; Fox, G. E.

    1976-01-01

    Complete nucleotide sequences are presented for 5S rRNA from Bacillus subtilis, B. firmus, B. pasteurii, B. brevis, Lactobacillus brevis, and Streptococcus faecalis, and 5S rRNA oligonucleotide catalogs and partial sequence data are given for B. cereus and Sporosarcina ureae. These data demonstrate a striking consistency of 5S rRNA primary and secondary structure within a given bacterial grouping. An exception is B. brevis, in which the 5S rRNA sequence varies significantly from that of other bacilli in the tuned helix and the procaryotic loop. The localization of these variations suggests that B. brevis occupies an ecological niche that selects such changes. It is noted that this organism produces antibiotics which affect ribosome function.

  4. Self-Assembly of Measles Virus Nucleocapsid-like Particles: Kinetics and RNA Sequence Dependence.

    PubMed

    Milles, Sigrid; Jensen, Malene Ringkjøbing; Communie, Guillaume; Maurin, Damien; Schoehn, Guy; Ruigrok, Rob W H; Blackledge, Martin

    2016-08-01

    Measles virus RNA genomes are packaged into helical nucleocapsids (NCs), comprising thousands of nucleoproteins (N) that bind the entire genome. N-RNA provides the template for replication and transcription by the viral polymerase and is a promising target for viral inhibition. Elucidation of mechanisms regulating this process has been severely hampered by the inability to controllably assemble NCs. Here, we demonstrate self-organization of N into NC-like particles in vitro upon addition of RNA, providing a simple and versatile tool for investigating assembly. Real-time NMR and fluorescence spectroscopy reveal biphasic assembly kinetics. Remarkably, assembly depends strongly on the RNA sequence, with the genomic 5' end and poly-Adenine sequences assembling efficiently, while sequences such as poly-Uracil are incompetent for NC formation. This observation has important consequences for understanding the assembly process. PMID:27270664

  5. A known expressed sequence tag, BM742401, is a potent lincRNA inhibiting cancer metastasis.

    PubMed

    Park, Seong-Min; Park, Sung-Joon; Kim, Hee-Jin; Kwon, Oh-Hyung; Kang, Tae-Wook; Sohn, Hyun-Ahm; Kim, Seon-Kyu; Moo Noh, Seung; Song, Kyu-Sang; Jang, Se-Jin; Sung Kim, Yong; Kim, Seon-Young

    2013-01-01

    Long intergenic non-coding RNAs (lincRNAs) have historically been ignored in cancer biology. However, thousands of lincRNAs have been identified in mammals using recently developed genomic tools, including microarray and high-throughput RNA sequencing (RNA-seq). Several of the lincRNAs identified have been well characterized for their functions in carcinogenesis. Here we performed RNA-seq experiments comparing gastric cancer with normal tissues to find differentially expressed transcripts in intergenic regions. By analyzing our own RNA-seq and public microarray data, we identified 31 transcripts, including a known expressed sequence tag, BM742401. BM742401 was downregulated in cancer, and its downregulation was associated with poor survival in gastric cancer patients. Ectopic overexpression of BM742401 inhibited metastasis-related phenotypes and decreased the concentration of extracellular MMP9. These results suggest that BM742401 is a potential lincRNA marker and therapeutic target. PMID:23846333

  6. Complete nucleotide sequence of the genomic RNA of tobacco mosaic virus strain Cg.

    PubMed

    Yamanaka, T; Komatani, H; Meshi, T; Naito, S; Ishikawa, M; Ohno, T

    1998-01-01

    Tobacco mosaic virus (TMV)-Cg is a crucifer-infecting tobamovirus that was isolated from field-grown garlic. We determined the complete nucleotide sequence of the genomic RNA of TMV-Cg. The genomic RNA of TMV-Cg consists of 6303 nucleotides and encodes four large open reading frames, organized basically in the same way as that of other tobamoviruses. The nucleotide and deduced amino acid sequences are very similar to those of the other crucifer-infecting tobamoviruses that have been sequenced so far. PMID:9608662

  7. The RNA sequence context defines the mechanistic routes by which yeast arginyl-tRNA synthetase charges tRNA.

    PubMed

    Sissler, M; Giegé, R; Florentz, C

    1998-06-01

    Arginylation of tRNA transcripts by yeast arginyl-tRNA synthetase can be triggered by two alternate recognition sets in anticodon loops: C35 and U36 or G36 in tRNA(Arg) and C36 and G37 in tRNA(Asp) (Sissler M, Giegé R, Florentz C, 1996, EMBO J 15:5069-5076). Kinetic studies on tRNA variants were done to explore the mechanisms by which these sets are expressed. Although the synthetase interacts in a similar manner with tRNA(Arg) and tRNA(Asp), the details of the interaction patterns are idiosyncratic, especially in anticodon loops (Sissler M, Eriani G, Martin F, Giegé R, Florentz C, 1997, Nucleic Acids Res 25:4899-4906). Exchange of individual recognition elements between arginine and aspartate tRNA frameworks strongly blocks arginylation of the mutated tRNAs, whereas full exchange of the recognition sets leads to efficient arginine acceptance of the transplanted tRNAs. Unpredictably, the similar catalytic efficiencies of native and transplanted tRNAs originate from different k(cat) and Km combinations. A closer analysis reveals that efficient arginylation results from strong anticooperative effects between individual recognition elements. Nonrecognition nucleotides as well as the tRNA architecture are additional factors that tune efficiency. Altogether, arginyl-tRNA synthetase is able to utilize different context-dependent mechanistic routes to be activated. This confers biological advantages to the arginine aminoacylation system and sheds light on its evolutionary relationship with the aspartate system. PMID:9622124

  8. Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

    2002-01-01

    MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. The extent to which such signature sequences exist in the 16S rRNA dataset is not clear, however. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
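
    The idea of an oligonucleotide that is characteristic of one bacterial grouping can be sketched in a few lines. The abstract does not give the definition of Q(s), so the sketch below does not reproduce it; it only assumes a simple stand-in, namely the fraction of in-group and out-of-group sequences that contain each k-mer. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_scores(sequences, group, k):
    """Score each k-mer found in the group by how specific it is to the group.

    sequences: dict mapping organism name -> sequence string
    group:     set of organism names forming the cluster of interest
    Returns {kmer: (fraction_of_group_with_kmer, fraction_of_others_with_kmer)}.
    This is an illustrative stand-in, not the Q(s) measure from the abstract.
    """
    in_counts = defaultdict(int)
    out_counts = defaultdict(int)
    for name, seq in sequences.items():
        target = in_counts if name in group else out_counts
        for km in kmers(seq, k):
            target[km] += 1
    n_in = sum(1 for name in sequences if name in group)
    n_out = len(sequences) - n_in
    scores = {}
    for km, count in in_counts.items():
        frac_out = out_counts[km] / n_out if n_out else 0.0
        scores[km] = (count / n_in, frac_out)
    return scores


def perfect_signatures(sequences, group, k):
    """k-mers present in every group member and in no sequence outside it."""
    scores = signature_scores(sequences, group, k)
    return sorted(km for km, (f_in, f_out) in scores.items()
                  if f_in == 1.0 and f_out == 0.0)


# Toy data (hypothetical sequences, far shorter than real 16S rRNA).
toy = {"a": "ACGUACGU", "b": "ACGUUUUU", "c": "GGGGCCCC"}
print(perfect_signatures(toy, {"a", "b"}, 4))
```

    On the toy data only one 4-mer is shared by both group members and absent from the third sequence. Checking candidates against a larger sequence set, as the abstract describes for false positives, would reuse the same out-of-group fraction.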

  9. Diversity of host species and strains of Pneumocystis carinii is based on rRNA sequences.

    PubMed Central

    Shah, J S; Pieciak, W; Liu, J; Buharin, A; Lane, D J

    1996-01-01

    We have amplified by PCR Pneumocystis carinii cytoplasmic small-subunit rRNA (variously referred to as 16S-like or 18S-like rRNA) genes from DNA extracted from bronchoalveolar lavage and induced sputum specimens from patients positive for P. carinii and from infected ferret lung tissue. The amplification products were cloned into pUC18, and individual clones were sequenced. Comparison of the determined sequences with each other and with published rat and partial human P. carinii small-subunit rRNA gene sequences reveals that, although all P. carinii small-subunit rRNAs are closely related (approximately 96% identity), small-subunit rRNA genes isolated from different host species (human, rat, and ferret) exhibit distinctive patterns of sequence variation. Two types of sequences were isolated from the infected ferret lung tissue, one as a predominant species and the other as a minor species. There was 96% identity between the two types. In situ hybridization of the infected ferret lung tissue with oligonucleotide probes specific for each type revealed that there were two distinct strains of P. carinii present in the ferret lung tissue. Unlike the ferret P. carinii isolates, the small-subunit rRNA gene sequences from different human P. carinii isolates have greater than 99% identity and are distinct from all rat and ferret sequences so far inspected or reported in the literature. Southern blot hybridization analysis of PCR amplification products from several additional bronchoalveolar lavage or induced sputum specimens from P. carinii-infected patients, using a 32P-labeled oligonucleotide probe specific for human P. carinii, also suggests that all of the human P. carinii isolates are identical. These findings indicate that human P. carinii isolates may represent a distinct species of P. carinii distinguishable from rat and ferret P. carinii on the basis of characterization of small-subunit rRNA gene sequences. PMID:8770515
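
    The identity figures quoted above (about 96% and greater than 99%) are pairwise comparisons of aligned sequences. The abstract does not say how identity was computed, so the following is only a minimal sketch under one common convention: columns with a gap in either sequence are skipped, and identity is the share of matching residues among the remaining columns. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are ignored.
    Other conventions (for example, counting gaps as mismatches) give
    different values for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical 10-column alignment with one mismatch.
print(percent_identity("ACGUACGUAC", "ACGUACGUAA"))
```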

  10. The phylogenetic utility and functional constraint of microRNA flanking sequences

    PubMed Central

    Kenny, Nathan J.; Sin, Yung Wa; Hayward, Alexander; Paps, Jordi; Chu, Ka Hou; Hui, Jerome H. L.

    2015-01-01

    MicroRNAs (miRNAs) have recently risen to prominence as novel factors responsible for post-transcriptional regulation of gene expression. miRNA genes have been posited as highly conserved in the clades in which they exist. Consequently, miRNAs have been used as rare genome change characters to estimate phylogeny by tracking their gain and loss. However, their short length (21–23 bp) has limited their perceived utility in sequence-based phylogenetic inference. Here, using reference taxa with established phylogenetic relationships, we demonstrate that miRNA sequences are of high utility in quantitative, rather than in qualitative, phylogenetic analysis. The clear orthology among miRNA genes from different species makes it straightforward to identify and align these sequences from even fragmentary datasets. We also identify significant sequence conservation in the regions directly flanking miRNA genes, and show that this too is of utility in phylogenetic analysis, as well as highlighting conserved regions that will be of interest to other fields. Employing miRNA sequences from 12 sequenced drosophilid genomes, together with a Tribolium castaneum outgroup, we demonstrate that this approach is robust using Bayesian and maximum-likelihood methods. The utility of these characters is further demonstrated in the rhabditid nematodes and primates. As next-generation sequencing makes it more cost-effective to sequence genomes and small RNA libraries, this methodology provides an alternative data source for phylogenetic analysis. The approach allows rapid resolution of relationships between both closely related and rapidly evolving species, and provides an additional tool for investigation of relationships within the tree of life. PMID:25694624

  11. Deep Sequencing Analysis of Nucleolar Small RNAs: RNA Isolation and Library Preparation.

    PubMed

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    The nucleolus is a subcellular compartment with an essential function in ribosome biogenesis. The nucleolus is rich in noncoding RNAs, mostly the ribosomal RNAs and small nucleolar RNAs. Surprisingly, several miRNAs have also been detected in the nucleolus, raising the question as to whether other small RNA species are present and functional in the nucleolus. We have developed a strategy for stepwise enrichment of nucleolar small RNAs from the total nucleolar RNA extracts and subsequent construction of nucleolar small RNA libraries which are suitable for deep sequencing. Our method successfully isolates the small RNA population from total RNAs and monitors the RNA quality in each step to ensure that small RNAs recovered represent the actual small RNA population in the nucleolus and not degradation products from larger RNAs. We have further applied this approach to characterize the distribution of small RNAs in different cellular compartments. PMID:27576723

  12. Next-generation sequencing of the porcine skeletal muscle transcriptome for computational prediction of microRNA gene targets

    Technology Transfer Automated Retrieval System (TEKTRAN)

    MicroRNA are a class of small RNAs that regulate gene expression by inhibiting translation of protein encoding transcripts. Inhibition is exerted through targeting of a microRNA-protein complex by base-pairing of the microRNA sequence to cognate recognition sequences in the 3’ untranslated region (...

  13. Analyzing the microRNA Transcriptome in Plants Using Deep Sequencing Data

    PubMed Central

    Yang, Xiaozeng; Li, Lei

    2012-01-01

    MicroRNAs (miRNAs) are 20- to 24-nucleotide endogenous small RNA molecules emerging as an important class of sequence-specific, trans-acting regulators for modulating gene expression at the post-transcription level. There has been a surge of interest in the past decade in identifying miRNAs and profiling their expression pattern using various experimental approaches. In particular, ultra-deep sampling of specifically prepared low-molecular-weight RNA libraries based on next-generation sequencing technologies has been used successfully in diverse species. The challenge now is to effectively deconvolute the complex sequencing data to provide comprehensive and reliable information on the miRNAs, miRNA precursors, and expression profile of miRNA genes. Here we review the recently developed computational tools and their applications in profiling the miRNA transcriptomes, with an emphasis on the model plant Arabidopsis thaliana. Also highlighted are progress and insight into miRNA biology derived from analyzing available deep sequencing data. PMID:24832228

  14. The novel organization and complete sequence of the ribosomal RNA gene of Nosema bombycis.

    PubMed

    Huang, Wei-Fone; Tsai, Shu-Jen; Lo, Chu-Fang; Soichi, Yamane; Wang, Chung-Hsiung

    2004-05-01

    We present here for the first time the complete DNA sequence data (4301bp) of the ribosomal RNA (rRNA) gene of the microsporidian type species, Nosema bombycis. Sequences for the large subunit gene (LSUrRNA: 2497bp, GenBank Accession No. ), the internal transcribed spacer (ITS: 179bp, GenBank Accession No. ), the small subunit gene (SSUrRNA: 1232bp), intergenic spacer (IGS: 279bp), and 5S region (114bp) are also given, and the secondary structure of the large subunit is discussed. The organization of the N. bombycis rRNA gene is LSUrRNA-ITS-SSUrRNA-IGS-5S. This novel arrangement, in which the LSU is 5' of the SSU, is the reverse of the organizational sequence (i.e., SSU-ITS-LSU) found in all previously reported microsporidian rRNAs, including Nosema apis. This unique character in the type species may have taxonomic implications for the members of the genus Nosema. PMID:15050536
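
    As a quick consistency check, the five region lengths reported above sum to the stated total of 4301 bp. The sketch below also derives 1-based coordinates for each region in the reported order (LSUrRNA-ITS-SSUrRNA-IGS-5S); this assumes the regions are contiguous and non-overlapping, which the matching total suggests but the abstract does not state outright.

```python
# Region lengths in bp, in the 5'-to-3' order given in the abstract.
REGIONS = [
    ("LSUrRNA", 2497),
    ("ITS", 179),
    ("SSUrRNA", 1232),
    ("IGS", 279),
    ("5S", 114),
]


def layout(regions):
    """Return (name, start, end) with 1-based inclusive coordinates.

    Assumes the regions abut one another with no gaps or overlaps.
    """
    coords = []
    start = 1
    for name, length in regions:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1
    return coords


total_length = sum(length for _, length in REGIONS)
print(total_length)
for name, start, end in layout(REGIONS):
    print(f"{name}: {start}-{end}")
```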

  15. StarScan: a web server for scanning small RNA targets from degradome sequencing data

    PubMed Central

    Liu, Shun; Li, Jun-Hao; Wu, Jie; Zhou, Ke-Ren; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2015-01-01

    Endogenous small non-coding RNAs (sRNAs), including microRNAs, PIWI-interacting RNAs and small interfering RNAs, play important gene regulatory roles in animals and plants by pairing to the protein-coding and non-coding transcripts. However, computationally assigning these various sRNAs to their regulatory target genes remains technically challenging. Recently, a high-throughput degradome sequencing method was applied to identify biologically relevant sRNA cleavage sites. In this study, an integrated web-based tool, StarScan (sRNA target Scan), was developed for scanning sRNA targets using degradome sequencing data from 20 species. Given a sRNA sequence from plants or animals, our web server performs an ultrafast and exhaustive search for potential sRNA–target interactions in annotated and unannotated genomic regions. The interactions between small RNAs and target transcripts were further evaluated using a novel tool, alignScore. A novel tool, degradomeBinomTest, was developed to quantify the abundance of degradome fragments located at the 9–11th nucleotide from the sRNA 5′ end. This is the first web server for discovering potential sRNA-mediated RNA cleavage events in plants and animals, which affords mechanistic insights into the regulatory roles of sRNAs. The StarScan web server is available at http://mirlab.sysu.edu.cn/starscan/. PMID:25990732
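
    The abstract names a binomial-test tool for degradome fragments at the 9-11th nucleotide but does not describe its null model. The sketch below is therefore not degradomeBinomTest itself. It only illustrates how such enrichment could be quantified, assuming that under the null hypothesis fragment 5' ends fall uniformly across the positions of the paired site. All names and numbers are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def cleavage_window_pvalue(window_reads, total_reads, site_length, window=3):
    """P-value for enrichment of fragment ends in the cleavage window.

    Assumed null model: each fragment end lands on any of the
    `site_length` paired positions with equal probability, so the chance
    of landing in the 3-position window (positions 9-11) is window / site_length.
    """
    p_null = window / site_length
    return binomial_upper_tail(window_reads, total_reads, p_null)


# Hypothetical counts: 18 of 20 fragment ends fall at positions 9-11
# of a 21-nt target site.
print(cleavage_window_pvalue(18, 20, 21))
```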

  16. MicroRNA transcriptome in the newborn mouse ovaries determined by massive parallel sequencing.

    PubMed

    Ahn, Hyo Won; Morin, Ryan D; Zhao, Han; Harris, Ronald A; Coarfa, Cristian; Chen, Zi-Jiang; Milosavljevic, Aleksandar; Marra, Marco A; Rajkovic, Aleksandar

    2010-07-01

    Small non-coding RNAs, such as microRNAs (miRNAs), are involved in diverse biological processes including organ development and tissue differentiation. Global disruption of miRNA biogenesis in Dicer knockout mice disrupts early embryogenesis and primordial germ cell formation. However, the role of miRNAs in early folliculogenesis is poorly understood. In order to identify a full transcriptome set of small RNAs expressed in the newborn (NB) ovary, we extracted the small RNA fraction from mouse NB ovary tissues and subjected it to massive parallel sequencing using the Genome Analyzer from Illumina. Massive sequencing produced 4 655 992 reads of 33 bp each representing a total of 154 Mbp of sequence data. The Pash alignment algorithm mapped 50.13% of the reads to the mouse genome. Sequence reads were clustered based on overlapping mapping coordinates and intersected with known miRNAs, small nucleolar RNAs (snoRNAs), piwi-interacting RNA (piRNA) clusters and repetitive genomic regions; 25.2% of the reads mapped to known miRNAs, 25.5% to genomic repeats, 3.5% to piRNAs and 0.18% to snoRNAs. The sequenced small RNAs included 398 known miRNA species and 118 isomiR sequences that are not in the miRBase database. The let-7 family was the most abundantly expressed miRNA family, and the mmu-mir-672, mmu-mir-322, mmu-mir-503 and mmu-mir-465 families were the most abundant X-linked miRNAs detected. The X-linked mmu-mir-503, mmu-mir-672 and mmu-mir-465 families showed preferential expression in testes and ovaries. We also identified four novel miRNAs that are preferentially expressed in gonads. Gonadal selective miRNAs may play important roles in ovarian development, folliculogenesis and female fertility. PMID:20215419
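
    The sequencing yield quoted above follows directly from the read count and read length, and the mapped-read count follows from the mapping rate. The short calculation below reproduces both from the figures in the abstract; the variable names are illustrative only.

```python
READS = 4_655_992          # reads reported in the abstract
READ_LENGTH_BP = 33        # bp per read
MAPPED_FRACTION = 0.5013   # 50.13% mapped by the Pash algorithm

total_bp = READS * READ_LENGTH_BP
total_mbp = total_bp / 1e6
mapped_reads = round(READS * MAPPED_FRACTION)

print(f"total sequence: {total_bp} bp (about {total_mbp:.0f} Mbp)")
print(f"reads mapped to the mouse genome: about {mapped_reads}")
```

    The product is 153,647,736 bp, which rounds to the 154 Mbp stated in the abstract.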

  17. Sequence requirements for localization and packaging of Ty3 retroelement RNA

    PubMed Central

    Clemens, Kristina; Bilanchone, Virginia; Beliakova-Bethell, Nadejda; Larsen, Liza S.Z.; Nguyen, Kim; Sandmeyer, Suzanne

    2012-01-01

    Retroviruses and retrotransposons package genomic RNA into virus-like particles (VLPs) in a poorly understood process. Expression of the budding yeast retrotransposon Ty3 results in the formation of cytoplasmic Ty3 VLP assembly foci comprised of Ty3 RNA and proteins, and cellular factors associated with RNA processing body (PB) components, which modulate translation and effect nonsense-mediated decay (NMD). A series of Ty3 RNA variants were tested to understand the effects of read-through translation via programmed frameshifting on RNA localization and packaging into VLPs, and to identify the roles of coding and non-coding sequences in those processes. These experiments showed that a low level of read-through translation of the downstream open reading frame (as opposed to no translation or translation without frameshifting) is important for localization of full-length Ty3 RNA to foci. Ty3 RNA variants associated with PB components via independent determinants in the native Ty3 untranslated regions (UTRs) and in GAG3-POL3 sequences flanked by UTRs adapted from non-Ty3 transcripts. However, despite localization, RNAs containing GAG3-POL3 but lacking Ty3 UTRs were not packaged efficiently. Surprisingly, sequences within Ty3 UTRs, which bind the initiator tRNAMet proposed to provide the dimerization interface, were not required for packaging of full-length Ty3 RNA into VLPs. In summary, our results demonstrate that Gag3 is sufficient and required for localization and packaging of RNAs containing Ty3 UTRs and support a role for POL3 sequences, translation of which is attenuated by programmed frameshifting, in both localization and packaging of the Ty3 full-length gRNA. PMID:23073180

  18. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    PubMed

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960
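
    The query limits described above (up to five ranges, 50 nucleotide positions per range) can be checked on the client side before a request is submitted. The sketch below is not part of the R3D-2-MSA API and does not reproduce its request format. It is a hypothetical validator that assumes ranges are given as inclusive (start, end) pairs of nucleotide numbers.

```python
MAX_RANGES = 5                 # limit stated in the abstract
MAX_POSITIONS_PER_RANGE = 50   # limit stated in the abstract


def validate_ranges(ranges):
    """Return a list of problems found; an empty list means the query fits.

    ranges: list of (start, end) tuples, inclusive nucleotide numbers.
    """
    problems = []
    if len(ranges) > MAX_RANGES:
        problems.append(f"{len(ranges)} ranges given, at most {MAX_RANGES} allowed")
    for index, (start, end) in enumerate(ranges, start=1):
        if start < 1 or end < start:
            problems.append(f"range {index} ({start}-{end}) is not a valid interval")
        elif end - start + 1 > MAX_POSITIONS_PER_RANGE:
            problems.append(
                f"range {index} spans {end - start + 1} positions, "
                f"at most {MAX_POSITIONS_PER_RANGE} allowed"
            )
    return problems


# Hypothetical query: one acceptable range and one that is too long.
print(validate_ranges([(100, 149), (200, 260)]))
```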

  19. Sequence and phylogenetic analysis of SSU rRNA gene of five microsporidia.

    PubMed

    Dong, ShiNan; Shen, ZhongYuan; Xu, Li; Zhu, Feng

    2010-01-01

    The complete small subunit rRNA (SSU rRNA) gene sequences of five microsporidia including Nosema heliothidis, and four novel microsporidia isolated from Pieris rapae, Phyllobrotica armta, Hemerophila atrilineata, and Bombyx mori, respectively, were obtained by PCR amplification, cloning, and sequencing. Two phylogenetic trees based on SSU rRNA sequences were constructed using the Neighbor-Joining method in Phylip and UPGMA in MEGA4.0. The taxonomic status of the four novel microsporidia was determined by analysis of phylogenetic relationship, length, G+C content, identity, and divergence of the SSU rRNA sequences. The results showed that the microsporidia isolated from Pieris rapae, Phyllobrotica armta, and Hemerophila atrilineata have a close phylogenetic relationship with Nosema, while the microsporidium isolated from Bombyx mori is closely related to Endoreticulatus. We therefore provisionally classify the three novel species of microsporidia in the genus Nosema, as Nosema sp. PR, Nosema sp. PA, and Nosema sp. HA. The fourth is provisionally classified in the genus Endoreticulatus, as Endoreticulatus sp. Zhenjiang. The results also indicated that it is feasible and valuable to elucidate phylogenetic relationships and taxonomic status of microsporidian species by analyzing information from SSU rRNA sequences of microsporidia. PMID:19768503

  1. Splice site consensus sequences are preferentially accessible to nucleases in isolated adenovirus RNA.

    PubMed Central

    Munroe, S H; Duthie, R S

    1986-01-01

    The conformation of RNA sequences spanning five 3' splice sites and two 5' splice sites in adenovirus mRNA was probed by partial digestion with single-strand specific nucleases. Although cleavage of nucleotides near both 3' and 5' splice sites was observed, most striking was the preferential digestion of sequences near the 3' splice site. At each 3' splice site a region of very strong cleavage is observed at low concentrations of enzyme near the splice site consensus sequence or the upstream branch point consensus sequence. Additional sites of moderately strong cutting near the branch point consensus sequence were observed in those sequences where the splice site was the preferred target. Since recognition of the 3' splice site and branch site appears to be an early event in mRNA splicing, these observations may indicate that the local conformation of the splice site sequences plays a direct or indirect role in enhancing the accessibility of sequences important for splicing. PMID:3024107

  2. High-quality RNA extraction from copepods for Next Generation Sequencing: A comparative study.

    PubMed

    Asai, Sneha; Ianora, Adrianna; Lauritano, Chiara; Lindeque, Penelope K; Carotenuto, Ylenia

    2015-12-01

    Despite the ecological importance of copepods, few Next Generation Sequencing (NGS) studies have been performed on small crustaceans, and a standard method for RNA extraction is lacking. In this study, we compared three commonly used methods: TRIzol®, Aurum Total RNA Mini Kit and Qiagen RNeasy Micro Kit, in combination with preservation reagents TRIzol® or RNAlater®, to obtain RNA of high quality and quantity from copepods for NGS. Total RNA was extracted from the copepods Calanus helgolandicus, Centropages typicus and Temora stylifera and its quantity and quality were evaluated using NanoDrop, agarose gel electrophoresis and Agilent Bioanalyzer. Our results demonstrate that preservation of copepods in RNAlater® followed by extraction with the Qiagen RNeasy Micro Kit was the optimal isolation method for obtaining RNA of high quality and quantity for NGS studies of C. helgolandicus. Intriguingly, C. helgolandicus 28S rRNA is formed by two subunits that separate after heat-denaturation and migrate along with 18S rRNA. This unique property of protostome RNA has never been reported in copepods. Overall, our comparative study on RNA extraction protocols will help increase gene expression studies on copepods using high-throughput applications, such as RNA-Seq and microarrays. PMID:25546577

  3. The reverse transcription signature of N-1-methyladenosine in RNA-Seq is sequence dependent

    PubMed Central

    Hauenschild, Ralf; Tserovski, Lyudmil; Schmid, Katharina; Thüring, Kathrin; Winz, Marie-Luise; Sharma, Sunny; Entian, Karl-Dieter; Wacheul, Ludivine; Lafontaine, Denis L. J.; Anderson, James; Alfonzo, Juan; Hildebrandt, Andreas; Jäschke, Andres; Motorin, Yuri; Helm, Mark

    2015-01-01

    The combination of Reverse Transcription (RT) and high-throughput sequencing has emerged as a powerful approach to detect modified nucleotides in RNA via analysis of either abortive RT-products or of the incorporation of mismatched dNTPs into cDNA. Here we simultaneously analyze both parameters in detail with respect to the occurrence of N-1-methyladenosine (m1A) in the template RNA. This naturally occurring modification is associated with structural effects, but it is also known as a mediator of antibiotic resistance in ribosomal RNA. In structural probing experiments with dimethylsulfate, m1A is routinely detected by RT-arrest. A specifically developed RNA-Seq protocol was tailored to the simultaneous analysis of RT-arrest and misincorporation patterns. By application to a variety of native and synthetic RNA preparations, we found a characteristic signature of m1A, which, in addition to an arrest rate, features misincorporation as a significant component. Detailed analysis suggests that the signature depends on RNA structure and on the nature of the nucleotide 3′ of m1A in the template RNA, meaning it is sequence dependent. The RT-signature of m1A was used for inspection and confirmation of suspected modification sites and resulted in the identification of hitherto unknown m1A residues in trypanosomal tRNA. PMID:26365242

  4. Study design requirements for RNA sequencing-based breast cancer diagnostics

    PubMed Central

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-01-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic. PMID:26830453
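
    The subsampling design described in this abstract can be illustrated with a minimal sketch. It assumes a simple nearest-centroid classifier and plain Python lists as a stand-in for the study's actual prediction models and expression data; all function names, the `repeats` parameter and the seed are hypothetical.

```python
import random

def nearest_centroid_fit(X, y):
    """Return {label: centroid} computed from feature rows X and labels y."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Predict the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def subsampling_curve(X_train, y_train, X_test, y_test, sizes, repeats=20, seed=0):
    """Mean test accuracy for each training-set size, averaged over random subsamples."""
    rng = random.Random(seed)          # fixed seed (assumption) for reproducibility
    indices = list(range(len(X_train)))
    curve = {}
    for n in sizes:
        accuracies = []
        for _ in range(repeats):
            pick = rng.sample(indices, n)   # subsample without replacement
            model = nearest_centroid_fit([X_train[i] for i in pick],
                                         [y_train[i] for i in pick])
            hits = sum(nearest_centroid_predict(model, x) == t
                       for x, t in zip(X_test, y_test))
            accuracies.append(hits / len(y_test))
        curve[n] = sum(accuracies) / repeats
    return curve
```

    Plotting the returned accuracies against the training sizes gives the kind of curve from which a plateau, such as the modest gain reported beyond N = 350, could be read off.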

  5. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    PubMed

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-01-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic. PMID:26830453

  6. Two methods for full-length RNA sequencing for low quantities of cells and single cells

    PubMed Central

    Pan, Xinghua; Durrett, Russell E.; Zhu, Haiying; Tanaka, Yoshiaki; Li, Yumei; Zi, Xiaoyuan; Marjani, Sadie L.; Euskirchen, Ghia; Ma, Chao; LaMotte, Robert H.; Park, In-Hyun; Snyder, Michael P.; Mason, Christopher E.; Weissman, Sherman M.

    2013-01-01

    The ability to determine the gene expression pattern in low quantities of cells or single cells is important for resolving a variety of problems in many biological disciplines. A robust description of the expression signature of a single cell requires determination of the full-length sequence of the expressed mRNAs in the cell, yet existing methods have either 3′ biased or variable transcript representation. Here, we report our protocols for the amplification and high-throughput sequencing of very small amounts of RNA for sequencing using procedures of either semirandom primed PCR or phi29 DNA polymerase-based DNA amplification, for the cDNA generated with oligo-dT and/or random oligonucleotide primers. Unlike existing methods, these protocols produce relatively uniformly distributed sequences covering the full length of almost all transcripts independent of their sizes, from 1,000 to 10 cells, and even with single cells. Both protocols produced satisfactory detection/coverage of the abundant mRNAs from a single K562 erythroleukemic cell or a single dorsal root ganglion neuron. The phi29-based method produces long products with less noise, uses an isothermal reaction, and is simple to practice. The semirandom primed PCR procedure is more sensitive and reproducible at low transcript levels or with low quantities of cells. These methods provide tools for mRNA sequencing or RNA sequencing when only low quantities of cells, a single cell, or even degraded RNA are available for profiling. PMID:23267071

  7. The landscape of fusion transcripts in spitzoid melanoma and biologically indeterminate spitzoid tumors by RNA sequencing

    PubMed Central

    Wu, Gang; Barnhill, Raymond L.; Lee, Seungjae; Li, Yongjin; Shao, Ying; Easton, John; Dalton, James; Zhang, Jinghui; Pappo, Alberto; Bahrami, Armita

    2016-01-01

    Kinase activation by chromosomal translocations is a common mechanism that drives tumorigenesis in spitzoid neoplasms. To explore the landscape of fusion transcripts in these tumors, we performed whole-transcriptome sequencing using formalin-fixed paraffin-embedded tissues in malignant or biologically indeterminate spitzoid tumors from 7 patients (age 2–14 years). RNA sequence libraries enriched for coding regions were prepared and the sequencing was analyzed by a novel assembly-based algorithm designed for detecting complex fusions. In addition, tumor samples were screened for hotspot TERT promoter mutations, and telomerase expression was assessed by TERT mRNA in situ hybridization (ISH). Two patients had widespread metastasis and subsequently died of disease, and 5 patients had a benign clinical course on limited follow-up (mean: 30 months). RNA sequencing and TERT mRNA ISH were successful in 6 tumors and unsuccessful in 1 disseminating tumor due to low RNA quality. RNA sequencing identified a kinase fusion in 5 of the 6 sequenced tumors: TPM3–NTRK1 (2 tumors), complex rearrangements involving TPM3, ALK, and IL6R (1 tumor), BAIAP2L1–BRAF (1 tumor), and EML4–BRAF (1 disseminating tumor). All predicted chimeric transcripts were expressed at high levels and contained the intact kinase domain. In addition, 2 tumors each contained a second fusion gene, ARID1B-SNX9 or PTPRZ1-NFAM1. The detected chimeric genes were validated by home-brew break-apart or fusion fluorescence in situ hybridization. The 2 disseminating tumors each harbored the TERT promoter −124C>T (Chr 5:1,295,228 hg19 coordinate) mutation whereas the remaining 5 tumors retained the wild-type gene. The presence of the −124C>T mutation correlated with telomerase expression by TERT mRNA ISH. In summary, we demonstrated complex fusion transcripts and novel partner genes for BRAF by RNA sequencing of FFPE samples. The diversity of gene fusions demonstrated by RNA sequencing defines the molecular

  8. Prediction of Immunomodulatory potential of an RNA sequence for designing non-toxic siRNAs and RNA-based vaccine adjuvants

    PubMed Central

    Chaudhary, Kumardeep; Nagpal, Gandharva; Dhanda, Sandeep Kumar; Raghava, Gajendra P. S.

    2016-01-01

    Our innate immune system recognizes a foreign RNA sequence of a pathogen and activates the immune system to eliminate the pathogen from our body. This immunomodulatory potential of RNA can be used to design RNA-based immunotherapy and vaccine adjuvants. In the case of siRNA-based therapy, the immunomodulatory effect of an RNA sequence is unwanted as it may cause immunotoxicity. Thus, we developed a method for designing a single-stranded RNA (ssRNA) sequence with desired immunomodulatory potentials, for designing RNA-based therapeutics, immunotherapy and vaccine adjuvants. The dataset used for training and testing our models consists of 602 experimentally verified immunomodulatory oligoribonucleotides (IMORNs) that are ssRNA sequences of length 17 to 27 nucleotides and 520 circulating miRNAs as non-immunomodulatory sequences. We developed prediction models using various features that include composition-based features, binary profile, selected features, and hybrid features. All models were evaluated using five-fold cross-validation and external validation techniques, achieving a maximum mean Matthews Correlation Coefficient (MCC) of 0.86 with 93% accuracy. We identified motifs using MERCI software and observed the abundance of adenine (A) in motifs. Based on the above study, we developed a web server, imRNA, comprising various modules important for designing RNA-based therapeutics (http://crdd.osdd.net/raghava/imrna/). PMID:26861761
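
    For readers unfamiliar with the evaluation metrics quoted in this abstract, the sketch below computes the Matthews Correlation Coefficient and accuracy from confusion-matrix counts. The function names are illustrative and the counts passed in are up to the caller; nothing here reproduces the authors' data or code.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)
```

    Unlike accuracy, MCC ranges from -1 to 1 and stays informative when the two classes differ in size, which is relevant for a dataset of 602 positive and 520 negative sequences.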

  9. Use of yeast nuclear DNA sequences to define the mitochondrial RNA polymerase promoter in vitro.

    PubMed Central

    Marczynski, G T; Schultz, P W; Jaehning, J A

    1989-01-01

    We have extended an earlier observation that the TATA box for the nuclear GAL10 gene serves as a promoter for the mitochondrial RNA polymerase in in vitro transcription reactions (C. S. Winkley, M. J. Keller, and J. A. Jaehning, J. Biol. Chem. 260:14214-14223, 1985). In this work, we demonstrate that other nuclear genes also have upstream sequences that function in vitro as mitochondrial RNA polymerase promoters. These genes include the GAL7 and MEL1 genes, which are regulated in concert with the GAL10 gene, the sigma repetitive element, and the 2 microns plasmid origin of replication. We used in vitro transcription reactions to test a large number of nuclear DNA sequences that contain critical mitochondrial promoter sequences as defined by Biswas et al. (T. K. Biswas, J. C. Edwards, M. Rabinowitz, and G. S. Getz, J. Biol. Chem. 262:13690-13696, 1987). The results of these experiments allowed us to extend the definition of essential promoter elements. This extended sequence, -ACTATAAACGatcATAG-, was frequently found in the upstream regulatory regions of nuclear genes. On the basis of these observations, we hypothesized that either (i) a catalytic RNA polymerase related to the mitochondrial enzyme functions in the nucleus of the yeast cell or (ii) a DNA sequence recognition factor is shared by the two genetic compartments. By using cells deficient in the catalytic core of the mitochondrial RNA polymerase (rpo41-) and sensitive assays for transcripts initiating from the nuclear promoter sequences, we have conclusively ruled out a role for the catalytic RNA polymerase in synthesizing transcripts from all of the nuclear sequences analyzed. The possibility that a DNA sequence recognition factor functions in both the nucleus and the mitochondria remains to be tested. PMID:2677667

  10. Bioinformatics of Cancer ncRNA in High Throughput Sequencing: Present State and Challenges

    PubMed Central

    Jorge, Natasha Andressa Nogueira; Ferreira, Carlos Gil; Passetti, Fabio

    2012-01-01

    Numerous genome sequencing projects have produced an unprecedented amount of data, providing significant information for the discovery of novel non-coding RNAs (ncRNAs). Several ncRNAs have been described as controlling gene expression and playing important roles during cell differentiation and homeostasis. In the last decade, high throughput methods in conjunction with bioinformatics approaches have been used to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological states, such as cancer. Patient outcomes have already been associated with differential expression of ncRNAs in normal and tumoral tissues, providing new insights into the development of innovative therapeutic strategies in oncology. In this review, we present and discuss bioinformatics advances in the development of computational approaches to analyze and discover ncRNA data in oncology using high throughput sequencing technologies. PMID:23251139

  11. Preferential use of RNA leader sequences during influenza A transcription initiation in vivo.

    PubMed

    Geerts-Dimitriadou, Christina; Goldbach, Rob; Kormelink, Richard

    2011-01-01

    In vitro transcription initiation studies revealed a preference of influenza A virus for capped RNA leader sequences with base complementarity to the viral RNA template. Here, these results were verified during an influenza infection in MDCK cells. Alfalfa mosaic virus RNA3 leader sequences mutated in their base complementarity to the viral template, or the nucleotides 5' of potential base-pairing residues, were tested for their use either singly or in competition. These analyses revealed that influenza transcriptase is able to use leaders from an exogenous mRNA source with a preference for leaders harboring base complementarity to the 3'-ultimate residues of the viral template, as previously observed during in vitro studies. Internal priming at the 3'-penultimate residue, as well as "prime-and-realign" was observed. The finding that multiple base-pairing promotes cap donor selection in vivo, and the earlier observed competitiveness of such molecules in vitro, offers new possibilities for antiviral drug design. PMID:21030059

  12. Structator: fast index-based search for RNA sequence-structure patterns

    PubMed Central

    2011-01-01

    Background: The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results: We present a novel method and readily applicable software for time-efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions: The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several
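
    The inside-out matching order described in this abstract (loop first, then base pairs extended outward) can be sketched as follows. This is only an illustration of the matching order: it uses a plain linear `str.find` scan rather than an affix array, so it does not achieve the sublinear running time reported, and the function name, the fixed-loop pattern form and the inclusion of G-U wobble pairs are assumptions.

```python
# Watson-Crick pairs plus G-U wobble pairs (wobble inclusion is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def find_hairpins(seq, loop, stem_len):
    """Return half-open (start, end) spans of hairpins in `seq`.

    A hit is an exact occurrence of `loop` flanked by `stem_len`
    consecutive complementary base pairs.
    """
    hits = []
    pos = seq.find(loop)                      # step 1: match the loop region
    while pos != -1:
        left, right = pos - 1, pos + len(loop)
        paired = 0
        # step 2: extend outward in both directions, one base pair at a time
        while (paired < stem_len and left >= 0 and right < len(seq)
               and (seq[left], seq[right]) in PAIRS):
            paired += 1
            left -= 1
            right += 1
        if paired == stem_len:
            hits.append((left + 1, right))
        pos = seq.find(loop, pos + 1)         # continue with the next loop occurrence
    return hits
```

    Because a candidate is abandoned at the first non-pairing position, most loop occurrences are rejected after a few comparisons, which is the search-space reduction the abstract attributes to base pairing constraints.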

  13. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research.

    PubMed

    Cheng, Wei-Chung; Chung, I-Fang; Tsai, Cheng-Fong; Huang, Tse-Shun; Chen, Chen-Yang; Wang, Shao-Chuan; Chang, Ting-Yu; Sun, Hsing-Jen; Chao, Jeffrey Yung-Chuan; Cheng, Cheng-Chung; Wu, Cheng-Wen; Wang, Hsei-Wei

    2015-01-01

    We previously presented YM500, which is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. Here in this updated YM500v2 database (http://ngs.ym.edu.tw/ym500/), we focus on the cancer miRNome to make the database more disease-orientated. New miRNA-related algorithms developed after YM500 were included in YM500v2, and, more significantly, more than 8000 cancer-related smRNA-seq datasets (including those of primary tumors, paired normal tissues, PBMC, recurrent tumors, and metastatic tumors) were incorporated into YM500v2. Novel miRNAs (miRNAs not included in the miRBase R21) were not only predicted by three independent algorithms but also cleaned by a new in silico filtration strategy and validated by wetlab data such as Cross-Linked ImmunoPrecipitation sequencing (CLIP-seq) to reduce the false-positive rate. A new function, 'Meta-analysis', is additionally provided to allow users to identify differentially expressed miRNAs and arm-switching events in real time according to user-defined sample groups and dozens of clinical criteria curated by experienced clinicians. The cancer miRNAs identified hold potential for both basic research and biotech applications. PMID:25398902

  14. Sequence and organization of 5S ribosomal RNA-encoding genes of Arabidopsis thaliana.

    PubMed

    Campell, B R; Song, Y; Posch, T E; Cullis, C A; Town, C D

    1992-03-15

    We have isolated a genomic clone containing Arabidopsis thaliana 5S ribosomal RNA (rRNA)-encoding genes (rDNA) by screening an A. thaliana library with a 5S rDNA probe from flax. The clone isolated contains seven repeat units of 497 bp, plus 11 kb of flanking genomic sequence at one border. Sequencing of individual subcloned repeat units shows that the sequence of the 5S rRNA coding region is very similar to that reported for other flowering plants. Four A. thaliana ecotypes were found to contain approx. 1000 copies of 5S rDNA per haploid genome. Southern-blot analysis of genomic DNA indicates that 5S rDNA occurs in long tandem arrays, and shows the presence of numerous restriction-site polymorphisms among the six ecotypes studied. PMID:1348233

  15. Telomerase RNA stem terminus element affects template boundary element function, telomere sequence, and shelterin binding

    PubMed Central

    Webb, Christopher J.; Zakian, Virginia A.

    2015-01-01

    The stem terminus element (STE), which was discovered 13 years ago in human telomerase RNA, is required for telomerase activity, yet its mode of action is unknown. We report that the Schizosaccharomyces pombe telomerase RNA, TER1 (telomerase RNA 1), also contains a STE, which is essential for telomere maintenance. Cells expressing a partial loss-of-function TER1 STE allele maintained short stable telomeres by a recombination-independent mechanism. Remarkably, the mutant telomere sequence was different from that of wild-type cells. Generation of the altered sequence is explained by reverse transcription into the template boundary element, demonstrating that the STE helps maintain template boundary element function. The altered telomeres bound less Pot1 (protection of telomeres 1) and Taz1 (telomere-associated in Schizosaccharomyces pombe 1) in vivo. Thus, the S. pombe STE, although distant from the template, ensures proper telomere sequence, which in turn promotes proper assembly of the shelterin complex. PMID:26305931

  16. A user-friendly computational workflow for the analysis of microRNA deep sequencing data.

    PubMed

    Majer, Anna; Caligiuri, Kyle A; Booth, Stephanie A

    2013-01-01

    Second-generation high-throughput sequencing is a robust and inexpensive methodology that is becoming an increasingly common technique for the study of microRNA (miRNA) expression levels in the central nervous system. This method allows for the identification of both known and novel miRNAs, reporting on the qualitative and quantitative levels these RNA species represent in any given sample. Numerous bioinformatic programs are currently available to analyze deep sequencing data but many require at least a partial understanding of the command line interface. In this chapter, we describe a user-friendly computational workflow guiding the user through the process from the initial FASTQ deep sequencing file to the identification of known and potentially novel miRNAs in a given experiment, as well as the assessment of the differential expression of these miRNAs between experimental samples. Furthermore, programs that can predict potential targets for these miRNAs are also highlighted. PMID:23007497

  17. PARalyzer: definition of RNA binding sites from PAR-CLIP short-read sequence data

    PubMed Central

    2011-01-01

    Crosslinking and immunoprecipitation (CLIP) protocols have made it possible to identify transcriptome-wide RNA-protein interaction sites. In particular, PAR-CLIP utilizes a photoactivatable nucleoside for more efficient crosslinking. We present an approach, centered on the novel PARalyzer tool, for mapping high-confidence sites from PAR-CLIP deep-sequencing data. We show that PARalyzer delineates sites with a high signal-to-noise ratio. Motif finding identifies the sequence preferences of RNA-binding proteins, as well as seed-matches for highly expressed microRNAs when profiling Argonaute proteins. Our study describes tailored analytical methods and provides guidelines for future efforts to utilize high-throughput sequencing in RNA biology. PARalyzer is available at http://www.genome.duke.edu/labs/ohler/research/PARalyzer/. PMID:21851591

  18. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    SciTech Connect

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
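
    One way the approach outlined in this abstract could be sketched is to pool mismatches in the control sequences into a single per-base error rate and then call a site variant only when its non-reference count is improbable under that rate. The binomial model, the significance threshold `alpha`, and all function names are assumptions for illustration and are not taken from the record.

```python
import math

def estimate_error_rate(control_mismatches, control_bases):
    """Per-base error rate: mismatches observed in controls / bases sequenced."""
    return control_mismatches / control_bases

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)

def call_variants(sites, error_rate, alpha=1e-3):
    """Return positions whose alternate-allele count exceeds what errors explain.

    `sites` is a list of (position, alt_count, depth) tuples; `alpha` is a
    hypothetical per-site significance threshold (no multiple-testing correction).
    """
    return [pos for pos, alt, depth in sites
            if binom_sf(alt, depth, error_rate) < alpha]
```

    In practice the threshold would need to account for the number of sites tested and for error rates that vary by base and sequence context; this sketch treats errors as uniform.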

  19. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence

    PubMed Central

    Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because some biochemical reactions are unstable and some species share the same biochemical profile, e.g. Yersinia frederiksenii and Yersinia intermedia, which need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains, and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain, where the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of species classification. In conclusion, by integrating information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species. PMID:26808495

  20. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    PubMed

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because some biochemical reactions are unstable and some species share the same biochemical profile, e.g. Yersinia frederiksenii and Yersinia intermedia, which need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains, and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain, where the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of species classification. In conclusion, by integrating information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species. PMID:26808495

  1. Analysis options for high-throughput sequencing in miRNA expression profiling

    PubMed Central

    2014-01-01

    Background: Recently, high-throughput sequencing (HTS) using next-generation sequencing techniques has become useful in digital gene expression profiling. Our study introduces analysis options for HTS data based on mapping to miRBase or on counting and grouping of identical sequence reads. These approaches allow a hypothesis-free detection of miRNA differential expression. Methods: We compare our results to microarray and qPCR data from one set of RNA samples. We use Illumina platforms for microarray analysis and miRNA sequencing of 20 samples from benign follicular thyroid adenoma and malignant follicular thyroid carcinoma. Furthermore, we use three strategies for HTS data analysis to evaluate miRNA biomarkers for malignant versus benign follicular thyroid tumors. Results: High correlation of qPCR and HTS data was observed for the proposed analysis methods. However, qPCR is limited in the differential detection of miRNA isoforms. Moreover, we illustrate a much broader dynamic range of HTS compared to microarrays for small RNA studies. Finally, our data confirm hsa-miR-197-3p, hsa-miR-221-3p, hsa-miR-222-3p and both hsa-miR-144-3p and hsa-miR-144-5p as potential follicular thyroid cancer biomarkers. Conclusions: Compared to microarrays, HTS provides a global profile of miRNA expression with higher specificity and in more detail. Summarizing HTS reads as isoform groups (analysis pipeline B) or according to functional criteria (seed analysis pipeline C), which correlates better with qPCR results, is a promising new option for HTS analysis. Finally, the data open future miRNA research perspectives for HTS and indicate that qPCR might be limited in validating HTS data in detail. PMID:24625073
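
    The counting of identical reads and the seed-based grouping mentioned in this abstract could look roughly like the sketch below. It is not the authors' pipeline B or C: the function names are hypothetical, and the seed is assumed to be nucleotides 2 to 8 of each read (a common convention that the abstract itself does not specify).

```python
from collections import Counter, defaultdict

def collapse_reads(reads):
    """Count identical sequence reads; returns {sequence: read count}."""
    return Counter(reads)

def group_by_seed(read_counts, seed_start=1, seed_end=8):
    """Sum read counts over sequences sharing the same seed.

    The default slice [1:8] corresponds to nucleotides 2-8 (1-based),
    which is an assumed seed definition. Reads shorter than the seed
    window are skipped.
    """
    groups = defaultdict(int)
    for seq, count in read_counts.items():
        if len(seq) >= seed_end:
            groups[seq[seed_start:seed_end]] += count
    return dict(groups)
```

    Grouping by seed merges isoforms that differ only at the 3' end, since those reads share the same first eight nucleotides.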

  2. Nucleotide sequence of the mRNA encoding the pre-alpha-subunit of mouse thyrotropin.

    PubMed Central

    Chin, W W; Kronenberg, H M; Dee, P C; Maloof, F; Habener, J F

    1981-01-01

    We have constructed and cloned in bacteria recombinant DNA molecules containing DNA sequences coding for the precursor of the alpha subunit of thyrotropin (pre-TSH-alpha). Double-stranded DNA complementary to total poly(A)+RNA derived from a mouse pituitary thyrotropic tumor was prepared enzymatically, inserted into the Pst I site of the plasmid pBR322 by using poly(dC).poly(dG) homopolymeric extensions, and cloned in Escherichia coli chi 1776. Cloned cDNAs encoding pre-TSH-alpha were identified by their hybridization to pre-TSH-alpha mRNA as determined by cell-free translations of hybrid-selected and hybrid-arrested RNA. The nucleotide sequences of two cDNAs (510 and 480 base pairs) were determined with chemical methods and corresponded to much of the region coding for the alpha subunit and the 3' untranslated region of pre-TSH-alpha mRNA. The sequence of the 5' end of the mRNA was determined from cDNA synthesized by using total mRNA as template and a restriction enzyme DNA fragment as primer. Together these sequences represented greater than 90% of the coding and noncoding regions of full-length pre-TSH-alpha mRNA, which was determined to be 800 bases long. The amino acid sequence of the pre-TSH-alpha deduced from the nucleotide sequence showed a NH2-terminal leader sequence of 24 amino acids followed by the 96-amino-acid sequence of the apoprotein of TSH-alpha. There is greater than 90% homology in the amino acid sequences among the murine, ruminant, and porcine alpha subunits and 75-80% homology among the murine, equine, and human alpha subunits. Several regions of the sequence remain absolutely conserved among all species, suggesting that these particular regions are essential for the biological function of the subunit. The successful cloning of the alpha subunit of TSH will permit further studies of the organization of the genes coding for the glycoprotein hormone subunits and the regulation of their expression. PMID:6272299

  3. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.

    PubMed

    Wang, Qiong; Garrity, George M; Tiedje, James M; Cole, James R

    2007-08-01

    The Ribosomal Database Project (RDP) Classifier, a naïve Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. Another related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/. PMID:17586664
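
    To make the idea of a naive Bayesian sequence classifier concrete, the sketch below scores a query by the k-mer "words" it contains. It is a simplified illustration, not the RDP Classifier's code: the default word size of 8, the smoothing constants, the uniform prior over taxa and all function names are assumptions, and the confidence estimation described in the abstract is not implemented.

```python
import math
from collections import defaultdict

def kmers(seq, k):
    """Set of distinct overlapping words of length k in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def train(labelled_seqs, k=8):
    """Count, per taxon, how many training sequences contain each word.

    `labelled_seqs` is an iterable of (taxon, sequence) pairs.
    """
    word_counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for taxon, seq in labelled_seqs:
        totals[taxon] += 1
        for word in kmers(seq, k):
            word_counts[taxon][word] += 1
    return word_counts, totals, k

def classify(model, seq):
    """Return the taxon maximizing the summed log-likelihood of the query's words."""
    word_counts, totals, k = model
    best, best_score = None, -math.inf
    for taxon, n in totals.items():
        # Smoothed estimate of P(word | taxon); the 0.5 and 1 are assumed constants.
        score = sum(math.log((word_counts[taxon].get(word, 0) + 0.5) / (n + 1))
                    for word in kmers(seq, k))
        if score > best_score:
            best, best_score = taxon, score
    return best
```

    A confidence estimate of the kind the abstract mentions could be layered on top by reclassifying random subsets of the query's words and reporting how often the same taxon wins.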

  4. Ebola virus RNA editing depends on the primary editing site sequence and an upstream secondary structure.

    PubMed

    Mehedi, Masfique; Hoenen, Thomas; Robertson, Shelly; Ricklefs, Stacy; Dolan, Michael A; Taylor, Travis; Falzarano, Darryl; Ebihara, Hideki; Porcella, Stephen F; Feldmann, Heinz

    2013-01-01

    Ebolavirus (EBOV), the causative agent of a severe hemorrhagic fever and a biosafety level 4 pathogen, increases its genome coding capacity by producing multiple transcripts encoding for structural and nonstructural glycoproteins from a single gene. This is achieved through RNA editing, during which non-template adenosine residues are incorporated into the EBOV mRNAs at an editing site encoding for 7 adenosine residues. However, the mechanism of EBOV RNA editing is currently not understood. In this study, we report for the first time that minigenomes containing the glycoprotein gene editing site can undergo RNA editing, thereby eliminating the requirement for a biosafety level 4 laboratory to study EBOV RNA editing. Using a newly developed dual-reporter minigenome, we have characterized the mechanism of EBOV RNA editing, and have identified cis-acting sequences that are required for editing, located between 9 nt upstream and 9 nt downstream of the editing site. Moreover, we show that a secondary structure in the upstream cis-acting sequence plays an important role in RNA editing. EBOV RNA editing is glycoprotein gene-specific, as a stretch encoding for 7 adenosine residues located in the viral polymerase gene did not serve as an editing site, most likely due to an absence of the necessary cis-acting sequences. Finally, the EBOV protein VP30 was identified as a trans-acting factor for RNA editing, constituting a novel function for this protein. Overall, our results provide novel insights into the RNA editing mechanism of EBOV, further understanding of which might result in novel intervention strategies against this viral pathogen. PMID:24146620

  5. Complexity of murine cardiomyocyte miRNA biogenesis, sequence variant expression and function.

    PubMed

    Humphreys, David T; Hynes, Carly J; Patel, Hardip R; Wei, Grace H; Cannon, Leah; Fatkin, Diane; Suter, Catherine M; Clancy, Jennifer L; Preiss, Thomas

    2012-01-01

    microRNAs (miRNAs) are critical to heart development and disease. Emerging research indicates that regulated precursor processing can give rise to an unexpected diversity of miRNA variants. We subjected small RNA from murine HL-1 cardiomyocyte cells to next generation sequencing to investigate the relevance of such diversity to cardiac biology. ∼40 million tags were mapped to known miRNA hairpin sequences as deposited in miRBase version 16, calling 403 generic miRNAs as appreciably expressed. Hairpin arm bias broadly agreed with miRBase annotation, although 44 miR* were unexpectedly abundant (>20% of tags); conversely, 33 -5p/-3p annotated hairpins were asymmetrically expressed. Overall, variability was infrequent at the 5' start but common at the 3' end of miRNAs (5.2% and 52.3% of tags, respectively). Nevertheless, 105 miRNAs showed marked 5' isomiR expression (>20% of tags). Among these was miR-133a, a miRNA with important cardiac functions, and we demonstrated differential mRNA targeting by two of its prevalent 5' isomiRs. Analyses of miRNA termini and base-pairing patterns around Drosha and Dicer cleavage regions confirmed the known bias towards uridine at the 5' most position of miRNAs, as well as supporting the thermodynamic asymmetry rule for miRNA strand selection and a role for local structural distortions in fine tuning miRNA processing. We further recorded appreciable expression of 5 novel miR*, 38 extreme variants and 8 antisense miRNAs. Analysis of genome-mapped tags revealed 147 novel candidate miRNAs. In summary, we revealed pronounced sequence diversity among cardiomyocyte miRNAs, knowledge of which will underpin future research into the mechanisms involved in miRNA biogenesis and, importantly, cardiac function, disease and therapy. PMID:22319597

  6. Deep sequencing of pigeonpea sterility mosaic virus discloses five RNA segments related to emaraviruses.

    PubMed

    Elbeaino, Toufic; Digiaro, Michele; Uppala, Mangala; Sudini, Harikishan

    2014-08-01

    The sequences of five viral RNA segments of pigeonpea sterility mosaic virus (PPSMV), the agent of sterility mosaic disease (SMD) of pigeonpea (Cajanus cajan, Fabaceae), were determined using deep sequencing. Each of the five RNAs encodes a single protein on the negative-sense strand with an open reading frame (ORF) of 6885, 1947, 927, 1086, and 1422 nts, respectively. In order, from RNA1 to RNA5, these ORFs encode the RNA-dependent RNA polymerase (p1, 267.9 kDa), a putative glycoprotein precursor (p2, 74.3 kDa), a putative nucleocapsid protein (p3, 34.6 kDa), and a putative movement protein (p4, 40.8 kDa), while p5 (55 kDa) has an unknown function. All RNA segments of PPSMV showed the highest identity with orthologs of fig mosaic virus (FMV) and rose rosette virus (RRV). In phylogenetic trees constructed with the amino acid sequences of p1, p2 and p3, PPSMV clustered consistently with other emaraviruses, close to clades comprising members of other genera of the family Bunyaviridae. Based on the molecular characteristics unveiled in this study and the morphological and epidemiological features similar to other emaraviruses, PPSMV seems to be the seventh species to join the list of emaraviruses known to date and, accordingly, its classification in the genus Emaravirus now seems legitimate. PMID:24685674
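
    As a rough plausibility check on the ORF lengths quoted above, each length can be converted into a residue count and an approximate protein mass. The sketch assumes that the stated lengths include the stop codon and uses an average residue mass of 0.110 kDa; both are assumptions for illustration, so the estimates are expected to differ somewhat from the masses given in the abstract.

```python
AVG_RESIDUE_KDA = 0.110  # assumed average amino acid residue mass, in kDa

# ORF lengths in nucleotides, as quoted in the abstract.
orf_lengths_nt = {"RNA1": 6885, "RNA2": 1947, "RNA3": 927, "RNA4": 1086, "RNA5": 1422}


def orf_to_protein(orf_nt):
    """Return (residue count, rough mass in kDa) for an ORF that includes its stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    residues = orf_nt // 3 - 1  # the stop codon encodes no residue
    return residues, residues * AVG_RESIDUE_KDA


for segment, length in orf_lengths_nt.items():
    residues, mass = orf_to_protein(length)
    print(f"{segment}: {length} nt -> {residues} aa, ~{mass:.0f} kDa")
```

    All five lengths are multiples of three, as expected for complete ORFs.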

  7. incaRNAfbinv: a web server for the fragment-based design of RNA sequences.

    PubMed

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-07-01

    In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to-date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel natural occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv. PMID:27185893

  8. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  9. Cloning and nucleotide sequence of the leucyl-tRNA synthetase gene of Bacillus subtilis.

    PubMed Central

    Vander Horn, P B; Zahler, S A

    1992-01-01

    The leucyl-tRNA synthetase gene (leuS) of Bacillus subtilis was cloned and sequenced. A mutation in the gene, leuS1, increases the transcription and expression of the ilv-leu operon, permitting monitoring of leuS alleles. The leuS1 mutation was mapped to 270 degrees on the chromosome. Sequence analysis showed that the mutation is a single-base substitution, possibly in a monocistronic operon. The leader mRNA predicted by the sequence would contain a number of possible secondary structures and a T box, a sequence observed upstream of leader mRNA terminators of Bacillus tRNA synthetases and the B. subtilis ilv-leu operon. The DNA of the B. subtilis leuS open reading frame is 48% identical to the leuS gene of Escherichia coli and is predicted to encode a polypeptide with 46% identity to the leucyl-tRNA synthetase of E. coli. PMID:1317842

  10. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  11. Sequence and structure analysis of a mirror tRNA located upstream of the cytochrome oxidase I mRNA in mouse mitochondria.

    PubMed

    Okui, Saya; Ushida, Chisato; Kiyosawa, Hidenori; Kawai, Gota

    2016-03-01

    RNA fragments corresponding to the mirror tRNA that is located upstream of the cytochrome oxidase I (COXI) gene in the mouse mitochondrial genome were found in sequences obtained from the mouse brain by next-generation sequencing. RNA fragments corresponding to the 5' terminus of COXI mRNA were also found, suggesting that the precursor of the COXI mRNA is processed three residues upstream of the first AUG codon. The mirror tRNA fragment has poly(A) at its 3' terminus and a variable 5' terminus, suggesting that this RNA is produced during the 5' processing of COXI mRNA. Secondary structure prediction and NMR analysis indicated that the mirror tRNA is folded into a tRNA-like secondary structure, suggesting that the tRNA-like conformation of the 5' adjacent sequence of COXI mRNA is involved in COXI mRNA maturation in mouse mitochondria. PMID:26519737

  12. Phylogeny of protostome worms derived from 18S rRNA sequences.

    PubMed

    Winnepenninckx, B; Backeljau, T; De Wachter, R

    1995-07-01

    The phylogenetic relationships of protostome worms were studied by comparing new complete 18S rRNA sequences of Vestimentifera, Pogonophora, Sipuncula, Echiura, Nemertea, and Annelida with existing 18S rRNA sequences of Mollusca, Arthropoda, Chordata, and Platyhelminthes. Phylogenetic trees were inferred via neighbor-joining and maximum parsimony analyses. These suggest that (1) Sipuncula and Echiura are not sister groups; (2) Nemertea are protostomes; (3) Vestimentifera and Pogonophora are protostomes that have a common ancestor with Echiura; and (4) Vestimentifera and Pogonophora are a monophyletic clade. PMID:7659019

  13. CCAT2, a novel noncoding RNA mapping to 8q24, underlies metastatic progression and chromosomal instability in colon cancer

    PubMed Central

    Ling, Hui; Spizzo, Riccardo; Atlasi, Yaser; Nicoloso, Milena; Shimizu, Masayoshi; Redis, Roxana S.; Nishida, Naohiro; Gafà, Roberta; Song, Jian; Guo, Zhiyi; Ivan, Cristina; Barbarotto, Elisa; De Vries, Ingrid; Zhang, Xinna; Ferracin, Manuela; Churchman, Mike; van Galen, Janneke F.; Beverloo, Berna H.; Shariati, Maryam; Haderk, Franziska; Estecio, Marcos R.; Garcia-Manero, Guillermo; Patijn, Gijs A.; Gotley, David C.; Bhardwaj, Vikas; Shureiqi, Imad; Sen, Subrata; Multani, Asha S.; Welsh, James; Yamamoto, Ken; Taniguchi, Itsuki; Song, Min-Ae; Gallinger, Steven; Casey, Graham; Thibodeau, Stephen N.; Le Marchand, Loïc; Tiirikainen, Maarit; Mani, Sendurai A.; Zhang, Wei; Davuluri, Ramana V.; Mimori, Koshi; Mori, Masaki; Sieuwerts, Anieta M.; Martens, John W.M.; Tomlinson, Ian; Negrini, Massimo; Berindan-Neagoe, Ioana; Foekens, John A.; Hamilton, Stanley R.; Lanza, Giovanni; Kopetz, Scott; Fodde, Riccardo; Calin, George A.

    2013-01-01

    The functional roles of SNPs within the 8q24 gene desert in the cancer phenotype are not yet well understood. Here, we report that CCAT2, a novel long noncoding RNA transcript (lncRNA) encompassing the rs6983267 SNP, is highly overexpressed in microsatellite-stable colorectal cancer and promotes tumor growth, metastasis, and chromosomal instability. We demonstrate that MYC, miR-17-5p, and miR-20a are up-regulated by CCAT2 through TCF7L2-mediated transcriptional regulation. We further identify the physical interaction between CCAT2 and TCF7L2 resulting in an enhancement of WNT signaling activity. We show that CCAT2 is itself a WNT downstream target, which suggests the existence of a feedback loop. Finally, we demonstrate that the SNP status affects CCAT2 expression and the risk allele G produces more CCAT2 transcript. Our results support a new mechanism of MYC and WNT regulation by the novel lncRNA CCAT2 in colorectal cancer pathogenesis, and provide an alternative explanation of the SNP-conferred cancer risk. PMID:23796952

  14. Overlapping open reading frames revealed by complete nucleotide sequencing of turnip yellow mosaic virus genomic RNA.

    PubMed Central

    Morch, M D; Boyer, J C; Haenni, A L

    1988-01-01

    The complete nucleotide sequence of turnip yellow mosaic virus (TYMV) genomic RNA has been determined on a set of overlapping cDNA clones using a sequential sequencing strategy. The RNA is 6318 nucleotides long, excluding the cap structure. The genome organization deduced from the sequence confirms previous results of in vitro translation. A novel open reading frame (ORF) putatively encoding a Pro-rich and very basic 69K (K = kilodalton) protein is detected at the 5' end of the genome. It is initiated at the first AUG codon on the RNA and overlaps the major ORF that encodes the non structural 206K (previously referred to as 195K) protein of TYMV; its function is unknown. Several amino acid consensus sequences already described among plant and animal viruses are also found in the TYMV-encoded polypeptides. A comparison with other viruses whose RNA sequence is known leads to the conclusion that TYMV belongs to the "Sindbis-like" supergroup of viruses and could be related to Semliki forest virus. PMID:3399388

  15. 5'-Terminal sequences of eucaryotic mRNA can be cloned with high efficiency.

    PubMed Central

    Land, H; Grez, M; Hauser, H; Lindenmaier, W; Schütz, G

    1981-01-01

    A method for cloning mRNAs has been used which results in a high yield of recombinants containing complete 5'-terminal mRNA sequences. It is not dependent on self-priming to generate double-stranded DNA and therefore the S1 nuclease digestion step is not required. Instead, the cDNA is dCMP-tailed at its 3'-end with terminal deoxynucleotidyl transferase (TdT). The synthesis of the second strand is primed by oligo(dG) hybridized to the 3'-tail. Double-stranded cDNA is subsequently tailed with dCTP and annealed to dGMP-tailed vector DNA. This approach overcomes the loss of the 5'-terminal mRNA sequences and the problem of artifacts which may be introduced into cloned cDNA sequences. Chicken lysozyme cDNA was cloned into pBR322 by this procedure with a transformation efficiency of 5 × 10^3 recombinant clones per ng of ds-cDNA. Sequence analysis revealed that at least nine out of nineteen randomly isolated plasmids contained the entire 5'-untranslated mRNA sequence. The data strongly support the conclusion that the 5'-untranslated region of the lysozyme mRNA is heterogeneous in length. PMID:6166921

  16. Research progress on mechanisms of male sterility in plants based on high-throughput RNA sequencing.

    PubMed

    Yongming, Liu; Ling, Zhang; Tao, Qiu; Zhuofan, Zhao; Moju, Cao

    2016-08-01

    Male sterility is defined as failing to produce functional pollen during stamen development in plants, and it plays a crucial role in plant reproductive research and hybrid seed production in utilization of crop heterosis. High throughput RNA sequencing (RNA-seq) has been used widely in the study of different fields of life science, as it readily detects all the mRNA and non-coding RNA in cells. Recently, RNA-seq has been reported to be applied in different species and kinds of pollen abortion types in plants, which has contributed to the understanding of the molecular mechanism and metabolic networks of male sterility at the transcription level. In this review, we summarize research progress on the mechanisms of male sterility in plants, focusing on RNA-seq analysis encompassing strategies of RNA library construction, differentially expressed genes and functional characteristics of noncoding RNAs involved in stamen abortion. Furthermore, we also discuss application of transcriptome sequencing technology to elucidate pollen abortion mechanisms and map fertility-related genes. We hope to provide references to the study of male sterility in plants. PMID:27531606

  17. Identification of in vivo target RNA sequences bound by thymidylate synthase.

    PubMed Central

    Chu, E; Cogliati, T; Copur, S M; Borre, A; Voeller, D M; Allegra, C J; Segal, S

    1996-01-01

    We developed an immunoprecipitation-RNA-random PCR (rPCR) method to isolate cellular RNA sequences that bind to the folate-dependent enzyme thymidylate synthase (TS). Using this approach, nine different cellular RNAs that formed a ribonucleoprotein (RNP) complex with thymidylate synthase (TS) in human colon cancer cells were identified. RNA binding experiments revealed that seven of these RNAs bound TS with relatively high affinity (IC50 values ranging from 1.5 to 6 nM). One of the RNAs was shown to encode the interferon (IFN)-induced 15 kDa protein. Western immunoblot analyses demonstrated that the level of IFN-induced 15 kDa protein was significantly decreased in human colon cancer H630-R10 cells compared with parent H630 cells. While the level of IFN-induced 15 kDa mRNA expression was the same in parent and TS-overexpressing cell lines, the level of IFN-induced 15 kDa RNA bound to TS in the form of a RNP complex was markedly higher in H630-R10 cells relative to parent H630 cells. These studies begin to define a number of cellular target RNA sequences with which TS interacts and suggest that these TS protein-cellular RNA interactions may have a biological role. PMID:8774904

  18. Identification of in vivo target RNA sequences bound by thymidylate synthase.

    PubMed

    Chu, E; Cogliati, T; Copur, S M; Borre, A; Voeller, D M; Allegra, C J; Segal, S

    1996-08-15

    We developed an immunoprecipitation-RNA-random PCR (rPCR) method to isolate cellular RNA sequences that bind to the folate-dependent enzyme thymidylate synthase (TS). Using this approach, nine different cellular RNAs that formed a ribonucleoprotein (RNP) complex with thymidylate synthase (TS) in human colon cancer cells were identified. RNA binding experiments revealed that seven of these RNAs bound TS with relatively high affinity (IC50 values ranging from 1.5 to 6 nM). One of the RNAs was shown to encode the interferon (IFN)-induced 15 kDa protein. Western immunoblot analyses demonstrated that the level of IFN-induced 15 kDa protein was significantly decreased in human colon cancer H630-R10 cells compared with parent H630 cells. While the level of IFN-induced 15 kDa mRNA expression was the same in parent and TS-overexpressing cell lines, the level of IFN-induced 15 kDa RNA bound to TS in the form of a RNP complex was markedly higher in H630-R10 cells relative to parent H630 cells. These studies begin to define a number of cellular target RNA sequences with which TS interacts and suggest that these TS protein-cellular RNA interactions may have a biological role. PMID:8774904

  19. Targeted Mutagenesis in Plant Cells through Transformation of Sequence-Specific Nuclease mRNA

    PubMed Central

    Stoddard, Thomas J.; Clasen, Benjamin M.; Baltes, Nicholas J.; Demorest, Zachary L.; Voytas, Daniel F.; Zhang, Feng; Luo, Song

    2016-01-01

    Plant genome engineering using sequence-specific nucleases (SSNs) promises to advance basic and applied plant research by enabling precise modification of endogenous genes. Whereas DNA is an effective means for delivering SSNs, DNA can integrate randomly into the plant genome, leading to unintentional gene inactivation. Further, prolonged expression of SSNs from DNA constructs can lead to the accumulation of off-target mutations. Here, we tested a new approach for SSN delivery to plant cells, namely transformation of messenger RNA (mRNA) encoding TAL effector nucleases (TALENs). mRNA delivery of a TALEN pair targeting the Nicotiana benthamiana ALS gene resulted in mutation frequencies of approximately 6% in comparison to DNA delivery, which resulted in mutation frequencies of 70.5%. mRNA delivery resulted in three-fold fewer insertions, and 76% were <10bp; in contrast, 88% of insertions generated through DNA delivery were >10bp. In an effort to increase mutation frequencies using mRNA, we fused several different 5’ and 3’ untranslated regions (UTRs) from Arabidopsis thaliana genes to the TALEN coding sequence. UTRs from an A. thaliana adenine nucleotide α hydrolases-like gene (At1G09740) enhanced mutation frequencies approximately two-fold, relative to a no-UTR control. These results indicate that mRNA can be used as a delivery vehicle for SSNs, and that manipulation of mRNA UTRs can influence efficiencies of genome editing. PMID:27176769

  20. Computational and analytical framework for small RNA profiling by high-throughput sequencing.

    PubMed

    Fahlgren, Noah; Sullivan, Christopher M; Kasschau, Kristin D; Chapman, Elisabeth J; Cumbie, Jason S; Montgomery, Taiowa A; Gilbert, Sunny D; Dasenko, Mark; Backman, Tyler W H; Givan, Scott A; Carrington, James C

    2009-05-01

    The advent of high-throughput sequencing (HTS) methods has enabled direct approaches to quantitatively profile small RNA populations. However, these methods have been limited by several factors, including representational artifacts and lack of established statistical methods of analysis. Furthermore, massive HTS data sets present new problems related to data processing and mapping to a reference genome. Here, we show that cluster-based sequencing-by-synthesis technology is highly reproducible as a quantitative profiling tool for several classes of small RNA from Arabidopsis thaliana. We introduce the use of synthetic RNA oligoribonucleotide standards to facilitate objective normalization between HTS data sets, and adapt microarray-type methods for statistical analysis of multiple samples. These methods were tested successfully using mutants with small RNA biogenesis (miRNA-defective dcl1 mutant and siRNA-defective dcl2 dcl3 dcl4 triple mutant) or effector protein (ago1 mutant) deficiencies. Computational methods were also developed to rapidly and accurately parse, quantify, and map small RNA data. PMID:19307293
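
    The idea of normalizing between data sets against synthetic standards can be illustrated with a short sketch. It assumes the simplest possible scheme, in which each library is rescaled so that its total spike-in read count matches the mean spike-in total across libraries; the abstract does not specify the actual procedure, and the function name, the library names, and the sequence identifiers are hypothetical.

```python
def normalize_to_spike_ins(counts, spike_ids):
    """Rescale each library so its spike-in total equals the mean spike-in total.

    counts: {library name: {sequence id: raw read count}}
    spike_ids: set of ids belonging to the synthetic standards
    Returns normalized counts for the non-spike-in sequences only.
    """
    spike_totals = {
        lib: sum(lib_counts.get(s, 0) for s in spike_ids)
        for lib, lib_counts in counts.items()
    }
    if any(total == 0 for total in spike_totals.values()):
        raise ValueError("every library needs at least one spike-in read")
    target = sum(spike_totals.values()) / len(spike_totals)

    normalized = {}
    for lib, lib_counts in counts.items():
        factor = target / spike_totals[lib]
        normalized[lib] = {
            seq_id: n * factor
            for seq_id, n in lib_counts.items()
            if seq_id not in spike_ids
        }
    return normalized
```

    For example, if two libraries both report 500 reads for the same small RNA but contain 100 and 200 spike-in reads respectively, the normalized values become 750 and 375, reflecting that the second library was sequenced more deeply relative to the standards.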

  1. Sequence-based identification of 3D structural modules in RNA with RMDetect.

    PubMed

    Cruz, José Almeida; Westhof, Eric

    2011-06-01

    Structural RNA modules, sets of ordered non-Watson-Crick base pairs embedded between Watson-Crick pairs, have central roles as architectural organizers and sites of ligand binding in RNA molecules, and are recurrently observed in RNA families throughout the phylogeny. Here we describe a computational tool, RNA three-dimensional (3D) modules detection, or RMDetect, for identifying known 3D structural modules in single and multiple RNA sequences in the absence of any other information. Currently, four modules can be searched for: G-bulge loop, kink-turn, C-loop and tandem-GA loop. In control test sequences we found all of the known modules with a false discovery rate of 0.23. Scanning through 1,444 publicly available alignments, we identified 21 yet unreported modules and 141 known modules. RMDetect can be used to refine RNA 2D structure, assemble RNA 3D models, and search and annotate structured RNAs in genomic data. PMID:21552257

  2. GRSDB: a database of quadruplex forming G-rich sequences in alternatively processed mammalian pre-mRNA sequences.

    PubMed

    Kostadinov, Rumen; Malhotra, Nishtha; Viotti, Manuel; Shine, Robert; D'Antonio, Lawrence; Bagga, Paramjeet

    2006-01-01

    Guanine-rich nucleic acids are known to form highly stable G-quadruplex structures, also known as G-quartets. Recently, there has been a tremendous amount of interest in studying G-quadruplexes owing to the realization of their biological importance. G-rich sequences (GRSs) capable of forming G-quadruplexes are found in the vicinity of polyadenylation regions and are involved in regulating 3' end processing of mammalian pre-mRNAs. G-rich motifs are also known to play an important role in alternative, tissue-specific splicing by interacting with hnRNP H protein subfamily. Whether quadruplex structure directly plays a role in regulating RNA processing events requires further investigation. To date there has not been a comprehensive effort to study G-quadruplexes near RNA processing sites. We have applied a computational approach to map putative Quadruplex forming GRSs within the transcribed regions of a large number of alternatively processed human and mouse gene sequences that were obtained as fully annotated entries from GenBank and RefSeq. We have used the computed data to build the GRSDB database that provides a unique avenue for studying G-quadruplexes in the context of RNA processing sites. GRSDB website offers visual comparison of G-quadruplex distribution patterns among all the alternative RNA products of a gene with the help of dynamic graphics. At present, GRSDB contains data from 1310 human and mouse genes, of which 1188 are alternatively processed. It has a total of 379,223 predicted G-quadruplexes, of which 54,252 are near RNA processing sites. GRSDB is a good resource for researchers interested in investigating the functional relevance of G-quadruplexes, especially in the context of alternative RNA processing. It can be accessed at http://bioinformatics.ramapo.edu/grsdb/. PMID:16381828
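
    A computational search for putative quadruplex-forming sequences can be sketched with a regular expression. The pattern below uses a commonly used motif definition (four runs of at least three guanines separated by loops of one to seven nucleotides); this is an assumption for illustration and not necessarily the rule used to build GRSDB. The names G4_PATTERN and find_putative_g4 are hypothetical.

```python
import re

# Four G-runs of length >= 3 separated by loops of 1-7 nucleotides.
# Accepts both DNA (T) and RNA (U) alphabets.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(seq):
    """Return (start, matched sequence) for non-overlapping motif matches."""
    seq = seq.upper()
    return [(m.start(), m.group(0)) for m in G4_PATTERN.finditer(seq)]
```

    Applied to "AAGGGTTAGGGTTAGGGTTAGGGAA", the function reports a single match starting at position 2 (0-based). Positions found this way could then be compared with annotated RNA processing sites.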

  3. GRSDB: a database of quadruplex forming G-rich sequences in alternatively processed mammalian pre-mRNA sequences

    PubMed Central

    Kostadinov, Rumen; Malhotra, Nishtha; Viotti, Manuel; Shine, Robert; D'Antonio, Lawrence; Bagga, Paramjeet

    2006-01-01

    Guanine-rich nucleic acids are known to form highly stable G-quadruplex structures, also known as G-quartets. Recently, there has been a tremendous amount of interest in studying G-quadruplexes owing to the realization of their biological importance. G-rich sequences (GRSs) capable of forming G-quadruplexes are found in the vicinity of polyadenylation regions and are involved in regulating 3′ end processing of mammalian pre-mRNAs. G-rich motifs are also known to play an important role in alternative, tissue-specific splicing by interacting with hnRNP H protein subfamily. Whether quadruplex structure directly plays a role in regulating RNA processing events requires further investigation. To date there has not been a comprehensive effort to study G-quadruplexes near RNA processing sites. We have applied a computational approach to map putative Quadruplex forming GRSs within the transcribed regions of a large number of alternatively processed human and mouse gene sequences that were obtained as fully annotated entries from GenBank and RefSeq. We have used the computed data to build the GRSDB database that provides a unique avenue for studying G-quadruplexes in the context of RNA processing sites. GRSDB website offers visual comparison of G-quadruplex distribution patterns among all the alternative RNA products of a gene with the help of dynamic graphics. At present, GRSDB contains data from 1310 human and mouse genes, of which 1188 are alternatively processed. It has a total of 379 223 predicted G-quadruplexes, of which 54 252 are near RNA processing sites. GRSDB is a good resource for researchers interested in investigating the functional relevance of G-quadruplexes, especially in the context of alternative RNA processing. It can be accessed at http://bioinformatics.ramapo.edu/grsdb/. PMID:16381828

  4. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    PubMed

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471. PMID:27275414

  5. Alterations in SiRNA and MiRNA Expression Profiles Detected by Deep Sequencing of Transgenic Rice with SiRNA-Mediated Viral Resistance

    PubMed Central

    Wang, Xifeng; Liang, Chun

    2015-01-01

    RNA-mediated gene silencing has been demonstrated to serve as a defensive mechanism against viral pathogens in plants. It is known that specifically expressed endogenous siRNAs and miRNAs are involved in the self-defense process during viral infection. However, little research has been devoted to changes in endogenous siRNA and miRNA expression under viral infection when resistance has already been genetically engineered in plants. Aiming to gain a deeper understanding of the RNA-mediated gene silencing defense process in plants, the expression profiles of siRNAs and miRNAs before and after viral infection in both wild type and transgenic anti-Rice stripe virus (RSV) rice plants were examined by small RNA high-throughput sequencing. Our research confirms that the newly generated siRNAs, which are derived from the engineered inverted repeat construct, are the major contributors to the viral resistance in rice. Further analysis suggests that the accuracy of siRNA biogenesis might be affected when the siRNA machinery is excessively used in the transgenic plants. In addition, the expression levels of many known miRNAs are dramatically changed by RSV infection in both wild type and transgenic rice plants, indicating a potential function of those miRNAs in the plant-virus interaction process. PMID:25559820

  6. Complete Genome Sequence of a Reference Stock of Simian Immunodeficiency Virus RNA (SIVmac251/32H/L28) Determined by Deep Sequencing

    PubMed Central

    Jenkins, Adrian; Ham, Claire; Almond, Neil

    2016-01-01

    A reference preparation for simian immunodeficiency virus (SIV) RNA nucleic acid assays was characterized by complete genome deep sequencing. The entire coding sequence and flanking long terminal repeats, including minority species, were determined. This information will inform SIV research investigations and aid evaluation and development of amplification assays for SIV RNA quantification. PMID:27231355

  7. Complete Genome Sequence of a Reference Stock of Simian Immunodeficiency Virus RNA (SIVmac251/32H/L28) Determined by Deep Sequencing.

    PubMed

    Jenkins, Adrian; Ham, Claire; Almond, Neil; Berry, Neil

    2016-01-01

    A reference preparation for simian immunodeficiency virus (SIV) RNA nucleic acid assays was characterized by complete genome deep sequencing. The entire coding sequence and flanking long terminal repeats, including minority species, were determined. This information will inform SIV research investigations and aid evaluation and development of amplification assays for SIV RNA quantification. PMID:27231355

  8. Evaluating the impact of sequencing error correction for RNA-seq data with ERCC RNA spike-in controls

    PubMed Central

    Tong, Li; Yang, Cheng; Wu, Po-Yen; Wang, May D.

    2016-01-01

    Sequencing errors are a major issue for several next-generation sequencing-based applications such as de novo assembly and single nucleotide polymorphism detection. Several error-correction methods have been developed to improve raw data quality. However, error-correction performance is hard to evaluate because of the lack of a ground truth. In this study, we propose a novel approach that uses ERCC RNA spike-in controls as the ground truth to facilitate error-correction performance evaluation. After aligning raw and corrected RNA-seq data, we characterized the quality of reads by three metrics: mismatch patterns (e.g., the substitution rate of A to C) of reads aligned with one mismatch, mismatch patterns of reads aligned with two mismatches, and the percentage increase of reads aligned to the reference. We observed that the mismatch patterns for reads aligned with one mismatch are significantly correlated between ERCC spike-ins and real RNA samples. Based on these observations, we conclude that ERCC spike-ins can serve as ground truths for error correction beyond their previous applications for validation of dynamic range and fold-change response. Also, the mismatch patterns for ERCC reads aligned with one mismatch can serve as a novel and reliable metric to evaluate the performance of error-correction tools. PMID:27532064
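    The "mismatch pattern" metric in the abstract above can be illustrated with a minimal sketch. It assumes ungapped read/reference pairs of equal length have already been extracted from the alignments; the function name and the toy pairs are hypothetical and are not taken from the study.

```python
from collections import Counter


def one_mismatch_substitution_rates(pairs):
    """Return the relative frequency of each substitution type (e.g. "A>C")
    among reads that align with exactly one mismatch.

    pairs: iterable of (reference, read) strings of equal length (ungapped).
    """
    counts = Counter()
    for ref, read in pairs:
        if len(ref) != len(read):
            continue  # this sketch ignores gapped or clipped alignments
        mismatches = [(r, q) for r, q in zip(ref, read) if r != q]
        if len(mismatches) == 1:
            counts[mismatches[0]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {f"{r}>{q}": n / total for (r, q), n in counts.items()}


# Toy data (made up): three reads with one mismatch, one perfect, one with two.
toy_pairs = [
    ("ACGT", "CCGT"),  # A>C
    ("ACGT", "ACGT"),  # perfect match, ignored
    ("ACGT", "ACTT"),  # G>T
    ("AAAA", "ACAA"),  # A>C
    ("ACGT", "TTGT"),  # two mismatches, ignored
]
rates = one_mismatch_substitution_rates(toy_pairs)
```

    Computing the same dictionary for spike-in reads and for sample reads, before and after correction, would give the profiles that the abstract describes comparing.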

  9. Sequence, structural and evolutionary relationships between class 2 aminoacyl-tRNA synthetases.

    PubMed Central

    Cusack, S; Härtlein, M; Leberman, R

    1991-01-01

    Class 2 aminoacyl-tRNA synthetases, which include the enzymes for alanine, aspartic acid, asparagine, glycine, histidine, lysine, phenylalanine, proline, serine and threonine, are characterised by three distinct sequence motifs 1, 2 and 3 (reference 1). The structural and evolutionary relatedness of these ten enzymes is examined using alignments of primary sequences from prokaryotic and eukaryotic sources and the known three dimensional structure of seryl-tRNA synthetase from E. coli. It is shown that motif 1 forms part of the dimer interface of seryl-tRNA synthetase and motifs 2 and 3 part of the putative active site. It is further shown that the seven alpha 2 dimeric synthetases can be subdivided into class 2a (proline, threonine, histidine and serine) and class 2b (aspartic acid, asparagine and lysine), each subclass sharing several important characteristic sequence motifs in addition to those characteristic of class 2 enzymes in general. The alpha 2 beta 2 tetrameric enzymes (for glycine and phenylalanine) show certain special features in common as well as some of the class 2b motifs. In the alanyl-tRNA synthetase only motif 3 and possibly motif 2 can be identified. The sequence alignments suggest that the catalytic domain of other class 2 synthetases should resemble the antiparallel domain found in seryl-tRNA synthetase. Predictions are made about the sequence location of certain important helices and beta-strands in this domain as well as suggestions concerning which residues are important in ATP and amino acid binding. Strong homologies are found in the N-terminal extensions of class 2b synthetases and in the C-terminal extensions of class 2a synthetases suggesting that these putative tRNA binding domains have been added at a later stage in evolution to the catalytic domain. PMID:1852601

  10. Identification and sequence determination of a novel double-stranded RNA mycovirus from the entomopathogenic fungus Beauveria bassiana.

    PubMed

    Kotta-Loizou, Ioly; Sipkova, Jana; Coutts, Robert H A

    2015-03-01

    An isolate of the entomopathogenic fungus Beauveria bassiana was found to contain five double-stranded (ds) RNA elements ranging from 1.5 to more than 3 kbp. The complete sequence of the largest dsRNA element is described here. Analysis of the RdRp nucleotide sequence reveals its similarity to unclassified dsRNA elements, such as Alternaria longipes dsRNA virus 1, and its distant relationship to the RNA-dependent RNA polymerases of members of the family Partitiviridae. PMID:25577168

  11. mirTools: microRNA profiling and discovery based on high-throughput sequencing

    PubMed Central

    Zhu, Erle; Zhao, Fangqing; Xu, Gang; Hou, Huabin; Zhou, LingLin; Li, Xiaokun; Sun, Zhongsheng; Wu, Jinyu

    2010-01-01

    miRNAs are small, non-coding RNAs that negatively regulate gene expression at the post-transcriptional level and play crucial roles in various physiological and pathological processes, such as development and tumorigenesis. Although deep sequencing technologies have been applied to investigate various small RNA transcriptomes, the corresponding computational methods are far from mature compared with microarray-based approaches. In this study, a comprehensive web server, mirTools, was developed to allow researchers to comprehensively characterize small RNA transcriptomes. With the aid of mirTools, users can: (i) filter low-quality reads and 3′/5′ adapters from raw sequenced data; (ii) align large-scale short reads to the reference genome and explore their length distribution; (iii) classify small RNA candidates into known categories, such as known miRNAs, non-coding RNA, genomic repeats and coding sequences; (iv) provide detailed annotation information for known miRNAs, such as miRNA/miRNA*, absolute/relative reads count and the most abundant tag; (v) predict novel miRNAs that have not been characterized before; and (vi) identify differentially expressed miRNAs between samples based on two different counting strategies: total read tag counts and the most abundant tag counts. We believe that the integration of multiple computational approaches in mirTools will greatly facilitate current microRNA research in multiple ways. mirTools can be accessed at http://centre.bioinformatics.zj.cn/mirtools/ and http://59.79.168.90/mirtools. PMID:20478827

  12. Next-generation sequencing identifies the natural killer cell microRNA transcriptome

    PubMed Central

    Fehniger, Todd A.; Wylie, Todd; Germino, Elizabeth; Leong, Jeffrey W.; Magrini, Vincent J.; Koul, Sunita; Keppel, Catherine R.; Schneider, Stephanie E.; Koboldt, Daniel C.; Sullivan, Ryan P.; Heinz, Michael E.; Crosby, Seth D.; Nagarajan, Rakesh; Ramsingh, Giridharan; Link, Daniel C.; Ley, Timothy J.; Mardis, Elaine R.

    2010-01-01

    Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate the translation of their mRNA targets, and are therefore candidates for mediating this control process. While the expression and importance of miRNAs in T and B lymphocytes have been established, little is known about miRNAs in NK cells. Here, we used two next-generation sequencing (NGS) platforms to define the miRNA transcriptomes of resting and cytokine-activated primary murine NK cells, with confirmation by quantitative real-time PCR (qRT-PCR) and microarrays. We delineate a bioinformatics analysis pipeline that identified 302 known and 21 novel mature miRNAs from sequences obtained from NK cell small RNA libraries. These miRNAs are expressed over a broad range and exhibit isomiR complexity, and a subset is differentially expressed following cytokine activation. Using these miRNA NGS data, miR-223 was identified as a mature miRNA present in resting NK cells with decreased expression following cytokine activation. Furthermore, we demonstrate that miR-223 specifically targets the 3′ untranslated region of murine GzmB in vitro, indicating that this miRNA may contribute to control of GzmB translation in resting NK cells. Thus, the sequenced NK cell miRNA transcriptome provides a valuable framework for further elucidation of miRNA expression and function in NK cell biology. PMID:20935160

  13. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. PMID:27241757

  14. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    NASA Technical Reports Server (NTRS)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  15. Identification of microRNAs by small RNA deep sequencing for synthetic microRNA mimics to control Spodoptera exigua.

    PubMed

    Zhang, Yu Liang; Huang, Qi Xing; Yin, Guo Hua; Lee, Samantha; Jia, Rui Zong; Liu, Zhi Xin; Yu, Nai Tong; Pennerman, Kayla K; Chen, Xin; Guo, An Ping

    2015-02-25

    Beet armyworm, Spodoptera exigua, is a major pest of cotton around the world. With the increase of resistance to Bacillus thuringiensis (Bt) toxin in transgenic cotton plants, there is a need to develop an alternative control approach that can be used in combination with Bt transgenic crops as part of resistance management strategies. MicroRNAs (miRNAs), a non-coding small RNA family (18-25 nt), play crucial roles in various biological processes and over-expression of miRNAs has been shown to interfere with the normal development of insects. In this study, we identified 127 conserved miRNAs in S. exigua by using small RNA deep sequencing technology. From this, we tested the effects of 11 miRNAs on larval development. We found three miRNAs, Sex-miR-10-1a, Sex-miR-4924, and Sex-miR-9, to be differentially expressed during larval stages of S. exigua. Oral feeding experiments using synthetic miRNA mimics of Sex-miR-10-1a, Sex-miR-4924, and Sex-miR-9 resulted in suppressed growth of S. exigua and mortality. Over-expression of Sex-miR-4924 caused a significant reduction in the expression level of chitinase 1 and caused abortive molting in the insects. Therefore, we demonstrated a novel approach of using miRNA mimics to control S. exigua development. PMID:25528266

  16. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    SciTech Connect

    Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong

    2012-03-15

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  17. Structure and sequence of the gene for the largest subunit of trypanosomal RNA polymerase III.

    PubMed Central

    Köck, J; Evers, R; Cornelissen, A W

    1988-01-01

    As the first step in the analysis of the transcription process in the African trypanosome, Trypanosoma brucei, we have started to characterise the trypanosomal RNA polymerases. We have previously described the gene encoding the largest subunit of RNA polymerase II and found that two almost identical RNA polymerase II genes are encoded within the genome of T. brucei. Here we present the identification, cloning and sequence analysis of the gene encoding the largest subunit of RNA polymerase III. This gene contains a single open reading frame encoding a polypeptide with a Mr of 170 kD. In total, eight highly conserved regions with significant homology to those previously reported in other eukaryotic RNA polymerase largest subunits were identified. Some of these domains contain functional sites, which are conserved among all eukaryotic largest subunit genes analysed thus far. Since these domains make up a large part of each polypeptide, independent of the RNA polymerase class, these data strongly support the hypothesis that these domains provide a major part of the transcription machinery of the RNA polymerase complex. The additional domains which are uniquely present in the largest subunit of RNA polymerase I and II, respectively, two large hydrophilic insertions and a C-terminal extension, might be a determining factor in specific transcription of the gene classes. PMID:3174432

  18. Small RNA Sequencing Based Identification of MiRNAs in Daphnia magna

    PubMed Central

    2015-01-01

    Small RNA molecules are short, non-coding RNAs identified for their crucial role in post-transcriptional regulation. A well-studied class is the miRNAs (microRNAs), which have been identified in several model organisms including the freshwater flea and planktonic crustacean Daphnia. Although Daphnia is a model for epigenetic-based studies with an available genome database, the identification of miRNAs and their potential role in regulating Daphnia gene expression has only recently garnered interest. Computational-based work using Daphnia pulex has indicated the existence of 45 miRNAs, 14 of which have been experimentally verified. To extend this study, we took a sequencing approach towards identifying miRNAs present in a small RNA library isolated from Daphnia magna. Using Perl codes designed for comparative genomic analysis, 815,699 reads were obtained from 4 million raw reads and run against a database file of known miRNA sequences. Using this approach, we have identified 205 putative mature miRNA sequences belonging to 188 distinct miRNA families. Data from this study provide critical information necessary to begin an investigation into a role for these transcripts in the epigenetic regulation of Daphnia magna. PMID:26367422
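    Running reads "against a database file of known miRNA sequences" can be sketched roughly as follows. The study used Perl scripts that are not shown in the abstract; this Python version is only an illustration under the assumption of exact, full-length matching, and the miRNA names and sequences are made-up toy values.

```python
from collections import Counter


def count_known_mirnas(reads, known):
    """Count reads that exactly match a known mature miRNA sequence.

    reads: iterable of read sequences (DNA or RNA alphabet).
    known: dict mapping miRNA name -> mature sequence.
    """
    # Normalise both sides to the DNA alphabet so U and T compare equal.
    lookup = {seq.upper().replace("U", "T"): name for name, seq in known.items()}
    counts = Counter()
    for read in reads:
        name = lookup.get(read.upper().replace("U", "T"))
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical toy database and reads, not real Daphnia data.
known = {"toy-miR-A": "UGAGGUAGUAGGUUGU", "toy-miR-B": "UAAGGCACGCGGUGAA"}
reads = [
    "TGAGGTAGTAGGTTGT",
    "TGAGGTAGTAGGTTGT",
    "TAAGGCACGCGGTGAA",
    "ACGTACGTACGTACGT",  # matches nothing in the toy database
]
counts = count_known_mirnas(reads, known)
```

    A real pipeline would also need adapter trimming and some tolerance for length variants, neither of which is attempted here.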

  19. Messenger RNA sequence and the translation process --a particle transport perspective

    NASA Astrophysics Data System (ADS)

    Dong, Jiajia; Schmittmann, Beate; Zia, Royce K. P.

    2008-03-01

    The translation process in bacteria has been under intensive study. A key question concerns the quantitative effect of different elongation rates, associated with different codons, on the overall translation efficiency. Starting with a simple particle transport model, the totally asymmetric simple exclusion process (TASEP), we incorporate the essential components of the translation process: Ribosomes, cognate tRNA concentrations, and messenger RNA (mRNA) templates correspond to particles, hopping rates, and the underlying lattice, respectively. Using simulations and mean-field approximations to obtain the stationary currents (the protein production rates) associated with different mRNA sequences, we are especially interested in the effect of slow codons, i.e., codons which are associated with rare tRNAs and are therefore translated very slowly. As the first step, we look at a "designed sequence" with one and two slow codons and quantify the marked impact of their spatial distribution on the currents. Extending the results to several mRNA sequences taken from real genes, we argue that an effective translation rate including the information from the vicinity of each codon needs to be taken into consideration when seeking an efficient strategy to optimize the protein production.
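    The effect of a slow codon on the stationary current can be illustrated with a small Monte Carlo sketch of an open-boundary TASEP. It makes simplifying assumptions that are not taken from the abstract: each particle covers a single site (real ribosomes cover several codons), updates are random-sequential, and the rates, lattice length and function name are illustrative only.

```python
import random


def tasep_current(rates, alpha=1.0, sweeps=5000, burn_in=1000, seed=0):
    """Estimate the stationary current (exits per sweep) of an open TASEP.

    rates[i] is the hopping probability out of site i (codon i);
    alpha is the entry (initiation) probability at the first site.
    """
    rng = random.Random(seed)
    n = len(rates)
    occupied = [False] * n
    exits = 0
    for sweep in range(sweeps):
        # One sweep = n + 1 random-sequential update attempts.
        for _ in range(n + 1):
            i = rng.randrange(-1, n)  # -1 denotes an entry attempt
            if i == -1:
                if not occupied[0] and rng.random() < alpha:
                    occupied[0] = True
            elif occupied[i] and rng.random() < rates[i]:
                if i == n - 1:
                    occupied[i] = False  # particle leaves the lattice
                    if sweep >= burn_in:
                        exits += 1
                elif not occupied[i + 1]:
                    occupied[i], occupied[i + 1] = False, True
    return exits / (sweeps - burn_in)


uniform = [1.0] * 40
slow = list(uniform)
slow[20] = 0.2  # a single slow codon in the middle of the lattice

j_uniform = tasep_current(uniform)
j_slow = tasep_current(slow)
```

    Comparing `j_slow` with `j_uniform`, or moving the slow site along the lattice, gives a rough feel for the kind of dependence on slow-codon placement that the abstract describes.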

  20. CoRAL: predicting non-coding RNAs from small RNA-sequencing data.

    PubMed

    Leung, Yuk Yee; Ryvkin, Paul; Ungar, Lyle H; Gregory, Brian D; Wang, Li-San

    2013-08-01

    The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ∼80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms. PMID:23700308

  1. High-resolution transcriptome analysis with long-read RNA sequencing.

    PubMed

    Cho, Hyunghoon; Davis, Joe; Li, Xin; Smith, Kevin S; Battle, Alexis; Montgomery, Stephen B

    2014-01-01

    RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2×75 bp and 2×262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals. PMID:25251678

  2. Systematic Assessment of RNA-Seq Quantification Tools Using Simulated Sequence Data

    PubMed Central

    Chandramohan, Raghu; Wu, Po-Yen; Phan, John H.; Wang, May D.

    2016-01-01

    RNA-sequencing (RNA-seq) technology has emerged as the preferred method for quantification of gene and isoform expression. Numerous RNA-seq quantification tools have been proposed and developed, bringing us closer to developing expression-based diagnostic tests based on this technology. However, because of the rapidly evolving technologies and algorithms, it is essential to establish a systematic method for evaluating the quality of RNA-seq quantification. We investigate how different RNA-seq experimental designs (i.e., variations in sequencing depth and read length) affect various quantification algorithms (i.e., HTSeq, Cufflinks, and MISO). Using simulated data, we evaluate the quantification tools based on four metrics, namely: (1) total number of usable fragments for quantification, (2) detection of genes and isoforms, (3) correlation, and (4) accuracy of expression quantification with respect to the ground truth. Results show that Cufflinks is able to use the largest number of fragments for quantification, leading to better detection of genes and isoforms. However, HTSeq produces more accurate expression estimates. Moreover, each quantification algorithm is affected differently by varying sequencing depth and read length, suggesting that the selection of quantification algorithms should be application-dependent.
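    Evaluation against a simulated ground truth, as described above, can be sketched in a few lines. The exact definitions used in the study are not given in the abstract, so the three measures below (detection rate, Pearson correlation, mean absolute error) are assumed stand-ins, and the gene names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def evaluate(truth, estimate):
    """Compare estimated expression values with simulated ground truth.

    truth, estimate: dicts mapping gene name -> expression value.
    Genes missing from `estimate` are treated as not detected (0.0).
    """
    genes = sorted(truth)
    t = [truth[g] for g in genes]
    e = [estimate.get(g, 0.0) for g in genes]
    expressed = [g for g in genes if truth[g] > 0]
    detected = [g for g in expressed if estimate.get(g, 0.0) > 0]
    return {
        "detection_rate": len(detected) / len(expressed) if expressed else 0.0,
        "pearson": pearson(t, e),
        "mean_abs_error": sum(abs(a - b) for a, b in zip(t, e)) / len(genes),
    }


# Toy values (hypothetical): g3 is expressed in the truth but missed.
truth = {"g1": 10.0, "g2": 0.0, "g3": 5.0, "g4": 20.0}
estimate = {"g1": 12.0, "g2": 0.0, "g3": 0.0, "g4": 18.0}
metrics = evaluate(truth, estimate)
```

    Repeating the evaluation across simulated sequencing depths and read lengths would mirror the comparison of designs outlined in the abstract.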

  3. Examining the Gm18 and m1G Modification Positions in tRNA Sequences

    PubMed Central

    Subramanian, Mayavan; Srinivasan, Thangavelu

    2014-01-01

    The tRNA structure contains conserved modifications that are responsible for its stability and are involved in the initiation and accuracy of the translation process. tRNA modification enzymes are prevalent in bacteria, archaea, and eukaryotes. tRNA Gm18 methyltransferase (TrmH) and tRNA m1G37 methyltransferase (TrmD) are prevalent and essential enzymes in bacterial populations. TrmH methylates the 2'-OH group of the ribose of the guanosine (G) at position 18 in tRNAs. TrmD methylates the G residue next to the anticodon in selected tRNA subsets. Initially, m1G37 modification was reported to take place on three conserved tRNA subsets (tRNAArg, tRNALeu, tRNAPro); later, a few archaeal and eukaryotic organisms were found to carry the m1G37 modification on other tRNAs as well. The present study reveals Gm18, m1G37 modification, and positions of m1G that take place next to the anticodon in tRNA sequences. We selected extremophile organisms and attempted to retrieve the m1G and Gm18 modification bases in tRNA sequences. Results showed that the Gm18 modification G residue occurs in all tRNA subsets except three tRNAs (tRNAMet, tRNAPro, tRNAVal). Whereas the m1G37 modification base G is formed only on tRNAArg, tRNALeu, tRNAPro, and tRNAHis, the rest of the tRNAs contain adenine (A) next to the anticodon. Thus, we hypothesize that Gm18 modification and m1G modification occur irrespective of a G residue in tRNAs. PMID:25031570

  4. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data

    PubMed Central

    Sun, Wen-Ju; Li, Jun-Hao; Liu, Shun; Wu, Jie; Zhou, Hui; Qu, Liang-Hu; Yang, Jian-Hua

    2016-01-01

    Although more than 100 different types of RNA modifications have been characterized across all living organisms, surprisingly little is known about the modified positions and their functions. Recently, various high-throughput modification sequencing methods have been developed to identify diverse post-transcriptional modifications of RNA molecules. In this study, we developed a novel resource, RMBase (RNA Modification Base, http://mirlab.sysu.edu.cn/rmbase/), to decode the genome-wide landscape of RNA modifications identified from high-throughput modification data generated by 18 independent studies. The current release of RMBase includes ∼9500 pseudouridine (Ψ) modifications generated from Pseudo-seq and CeU-seq sequencing data, ∼1000 5-methylcytosines (m5C) predicted from Aza-IP data, ∼124 200 N6-Methyladenosine (m6A) modifications discovered from m6A-seq and ∼1210 2′-O-methylations (2′-O-Me) identified from RiboMeth-seq data and public resources. Moreover, RMBase provides a comprehensive listing of other experimentally supported types of RNA modifications by integrating various resources. It provides web interfaces to show thousands of relationships between RNA modification sites and microRNA target sites. It can also be used to illustrate the disease-related SNPs residing in the modification sites/regions. RMBase provides a genome browser and a web-based modTool to query, annotate and visualize various RNA modifications. This database will help expand our understanding of potential functions of RNA modifications. PMID:26464443

  5. LNCipedia: a database for annotated human lncRNA transcript sequences and structures.

    PubMed

    Volders, Pieter-Jan; Helsens, Kenny; Wang, Xiaowei; Menten, Björn; Martens, Lennart; Gevaert, Kris; Vandesompele, Jo; Mestdagh, Pieter

    2013-01-01

    Here, we present LNCipedia (http://www.lncipedia.org), a novel database for human long non-coding RNA (lncRNA) transcripts and genes. LncRNAs constitute a large and diverse class of non-coding RNA genes. Although several lncRNAs have been functionally annotated, the majority remains to be characterized. Different high-throughput methods to identify new lncRNAs (including RNA sequencing and annotation of chromatin-state maps) have been applied in various studies resulting in multiple unrelated lncRNA data sets. LNCipedia offers 21 488 annotated human lncRNA transcripts obtained from different sources. In addition to basic transcript information and gene structure, several statistics are determined for each entry in the database, such as secondary structure information, protein coding potential and microRNA binding sites. Our analyses suggest that, much like microRNAs, many lncRNAs have a significant secondary structure, in-line with their presumed association with proteins or protein complexes. Available literature on specific lncRNAs is linked, and users or authors can submit articles through a web interface. Protein coding potential is assessed by two different prediction algorithms: Coding Potential Calculator and HMMER. In addition, a novel strategy has been integrated for detecting potentially coding lncRNAs by automatically re-analysing the large body of publicly available mass spectrometry data in the PRIDE database. LNCipedia is publicly available and allows users to query and download lncRNA sequences and structures based on different search criteria. The database may serve as a resource to initiate small- and large-scale lncRNA studies. As an example, the LNCipedia content was used to develop a custom microarray for expression profiling of all available lncRNAs. PMID:23042674

  6. Sequences more than 500 base pairs upstream of the human U3 small nuclear RNA gene stimulate the synthesis of U3 RNA in frog oocytes

    SciTech Connect

    Suh, D.; Reddy, R.; Wright, D.

    1991-06-04

    Small nuclear RNA (snRNA) genes contain strong promoters capable of initiating transcription once every 4 s. Studies on the human U1 snRNA gene, carried out in other laboratories, showed that sequences within 400 bp of the 5' flanking region are sufficient for maximal levels of transcription both in vivo and in frog oocytes (reviewed in Dahlberg and Lund (1988)). The authors studied the expression of a human U3 snRNA gene by injecting 5' deletion mutants into frog oocytes. The results show that sequences more than 500 bp upstream of the U3 snRNA gene have a 2-3-fold stimulatory effect on the U3 snRNA synthesis. These results indicate that the human U3 snRNA gene is different from human U1 snRNA gene in containing regulatory elements more than 500 bp upstream. The U3 snRNA gene upstream sequences contain an AluI homologous sequence in the -1,200 region; these AluI sequences were transcribed in vitro and in frog oocytes but were not detectable in HeLa cells.

  7. Bacterial metabarcoding by 16S rRNA gene ion torrent amplicon sequencing.

    PubMed

    Fantini, Elio; Gianese, Giulio; Giuliano, Giovanni; Fiore, Alessia

    2015-01-01

    Ion Torrent is a next generation sequencing technology based on the detection of hydrogen ions produced during DNA chain elongation; this technology allows analyzing and characterizing genomes, genes, and species. Here, we describe an Ion Torrent procedure applied to the metagenomic analysis of 16S rRNA gene amplicons to study the bacterial diversity in food and environmental samples. PMID:25343859

  8. Predicting candidate genomic sequences that correspond to synthetic functional RNA motifs

    PubMed Central

    Laserson, Uri; Gan, Hin Hark; Schlick, Tamar

    2005-01-01

    Riboswitches and RNA interference are important emerging mechanisms found in many organisms to control gene expression. To enhance our understanding of such RNA roles, finding small regulatory motifs in genomes presents a challenge on a wide scale. Many simple functional RNA motifs have been found by in vitro selection experiments, which produce synthetic target-binding aptamers as well as catalytic RNAs, including the hammerhead ribozyme. Motivated by the prediction of Piganeau and Schroeder [(2003) Chem. Biol., 10, 103–104] that synthetic RNAs may have natural counterparts, we develop and apply an efficient computational protocol for identifying aptamer-like motifs in genomes. We define motifs from the sequence and structural information of synthetic aptamers, search for sequences in genomes that will produce motif matches, and then evaluate the structural stability and statistical significance of the potential hits. Our application to aptamers for streptomycin, chloramphenicol, neomycin B and ATP identifies 37 candidate sequences (in coding and non-coding regions) that fold to the target aptamer structures in bacterial and archaeal genomes. Further energetic screening reveals that several candidates exhibit energetic properties and sequence conservation patterns that are characteristic of functional motifs. Besides providing candidates for experimental testing, our computational protocol offers an avenue for expanding natural RNA's functional repertoire. PMID:16254081
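
    The search step described above (define a motif from sequence and structure, then scan a genome for matches) can be illustrated with a minimal sketch. It is not the authors' protocol: the GNRA loop, the 4-nt stem, the function names and the Watson-Crick-only pairing rule are all assumptions chosen for illustration, and no stability or significance screening is performed.

```python
import re

# Watson-Crick complements only (assumption: G-U wobble pairs are ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Subset of IUPAC codes needed to express a loop pattern as a regex.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]", "R": "[AG]", "Y": "[CU]"}


def reverse_complement(seq):
    """Return the RNA reverse complement of seq."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_hairpins(seq, stem=4, loop="GNRA"):
    """Return 0-based start positions of stem-loop candidates.

    A candidate is a loop matching the IUPAC pattern `loop`, flanked by a
    5' arm and a 3' arm of length `stem` that are perfectly complementary.
    """
    seq = seq.upper().replace("T", "U")
    loop_regex = "".join(IUPAC[code] for code in loop)
    hits = []
    # A lookahead is used so that overlapping loop matches are all visited.
    for match in re.finditer(f"(?={loop_regex})", seq):
        loop_start = match.start()
        arm5_start = loop_start - stem
        arm3_end = loop_start + len(loop) + stem
        if arm5_start < 0 or arm3_end > len(seq):
            continue  # not enough flanking sequence for both arms
        arm5 = seq[arm5_start:loop_start]
        arm3 = seq[loop_start + len(loop):arm3_end]
        if arm5 == reverse_complement(arm3):
            hits.append(arm5_start)
    return hits
```

    Hits from such a scan would still need the energetic and statistical evaluation that the abstract describes; the sketch covers only the pattern-matching stage.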

  9. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity

    Technology Transfer Automated Retrieval System (TEKTRAN)

    De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol...

  10. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. PMID:25014137

  11. Adenovirus type 2 VAI RNA transcription by polymerase III is blocked by sequence-specific methylation.

    PubMed Central

    Jüttermann, R; Hosokawa, K; Kochanek, S; Doerfler, W

    1991-01-01

    Sequence-specific methylation of the promoter and adjacent regions in mammalian genes transcribed by RNA polymerase II leads to the inhibition of these genes. So far, RNA polymerase III-transcribed genes have not been investigated in depth. We therefore studied methylation effects on the RNA polymerase III-transcribed VAI gene of adenovirus type 2 DNA. The VAI gene contains 20 5'-CG-3' dinucleotides, of which 4 (20%) can be methylated by HpaII (5'-CCGG-3') and HhaI (5'-GCGC-3'). Three of these 5'-CG-3' sequences are located close to the internal regulatory region of the VAI segment. An unmethylated, a 5'-CCGG-3'- and 5'-GCGC-3'-methylated, and a 5'-CG-3'-methylated pUC18 construct containing the VAI and VAII regions were transfected into mammalian cells. In many experiments, an inactivating effect of 5'-CCGG-3' and 5'-GCGC-3' DNA methylation on the VAI region was not observed. In contrast, methylation of all 20 5'-CG-3' sequences in the VAI region by a CpG-specific DNA methyltransferase from Spiroplasma species did interfere with VAI transcription. Transcription of the VAI- and VAII- and of the VAI-containing constructs was also shown to be inhibited in an in vitro cell-free transcription system after the constructs had been methylated at the 5'-CCGG-3' and 5'-GCGC-3' sequences or at all 5'-CG-3' sequences. When an oligodeoxyribonucleotide which carried the internal control block A of the VAI region was methylated at three 5'-CG-3' sequences, the formation of a complex with HeLa nuclear proteins was abrogated. The results presented support the notion that the VAI gene transcribed by the DNA-dependent RNA polymerase III is also inactivated by methylation of the decisive 5'-CG-3' sequences. PMID:2002541
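
    The counting behind the statement that only a fraction of the 5'-CG-3' dinucleotides lie within HpaII or HhaI sites can be sketched as follows. The function names and the toy sequence in the usage note are hypothetical; the VAI sequence itself is not reproduced here.

```python
import re


def cpg_positions(seq):
    """Return 0-based positions of the C in every 5'-CG-3' dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def enzyme_covered_cpgs(seq, sites=("CCGG", "GCGC")):
    """Return positions of CpGs that sit inside an HpaII (CCGG) or HhaI (GCGC) site."""
    seq = seq.upper()
    covered = set()
    for site in sites:
        offset = site.index("CG")  # where the CpG sits inside the recognition site
        # The lookahead lets overlapping recognition sites all be reported.
        for match in re.finditer(f"(?={site})", seq):
            covered.add(match.start() + offset)
    return sorted(covered)


def covered_fraction(seq):
    """Fraction of all CpGs reachable by site-specific methylation at CCGG/GCGC."""
    all_cpgs = cpg_positions(seq)
    if not all_cpgs:
        return 0.0
    return len(enzyme_covered_cpgs(seq)) / len(all_cpgs)
```

    For the toy sequence "ACCGGTTGCGCAACGT", two of the three CpGs fall inside a CCGG or GCGC site, so methylation at those sites alone would leave one CpG unmodified.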

  12. RNA-directed DNA methylation efficiency depends on trigger and target sequence identity.

    PubMed

    Dalakouras, Athanasios; Dadami, Elena; Wassenegger, Michèle; Krczal, Gabi; Wassenegger, Michael

    2016-07-01

    RNA-directed DNA methylation (RdDM) in plants has been extensively studied, but the RNA molecules guiding the RdDM machinery to their targets are still to be characterized. It is unclear whether these molecules require full complementarity with their target. In this study, we have generated Nicotiana tabacum (Nt) plants carrying an infectious tomato apical stunt viroid (TASVd) transgene (Nt-TASVd) and a non-infectious potato spindle tuber viroid (PSTVd) transgene (Nt-SB2). The two viroid sequences exhibit 81% sequence identity. Nt-TASVd and Nt-SB2 plants were genetically crossed. In the progeny plants (Nt-SB2/TASVd), deep sequencing of small RNAs (sRNAs) showed that TASVd infection was associated with the accumulation of abundant small interfering RNAs (siRNAs) that mapped along the entire TASVd but only partially matched the SB2 transgene. TASVd siRNAs efficiently targeted SB2 RNA for degradation, but no transitivity was detectable. Bisulfite sequencing in the Nt-SB2/TASVd plants revealed that the TASVd transgene was targeted for dense cis-RdDM along its entire sequence. In the same plants, the SB2 transgene was targeted for trans-RdDM. The SB2 methylation pattern, however, was weak and heterogeneous, pointing to a positive correlation between trigger-target sequence identity and RdDM efficiency. Importantly, trans-RdDM on SB2 was also detected at sites where no homologous siRNAs were detected. Our data indicate that RdDM efficiency depends on the trigger-target sequence identity, and is not restricted to siRNA occupancy. These findings support recent data suggesting that RNAs with sizes longer than 24 nt (>24-nt RNAs) trigger RdDM. PMID:27121647

  13. SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes

    PubMed Central

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-01-01

    Motivation: In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. Results: In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Availability: Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license. Contact: epruesse@mpi-bremen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22556368
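
    The k-mer searching mentioned above can be pictured as ranking reference sequences by how many k-mers they share with the query before any alignment is attempted. The sketch below is a generic illustration of that idea, not SINA's implementation; the function names, the default k and the tie-breaking rule are assumptions.

```python
def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4):
    """Rank reference names by the number of k-mers shared with the query.

    `references` maps a name to a sequence. Ties are broken alphabetically
    so that the ordering is deterministic.
    """
    query_kmers = kmers(query.upper(), k)
    scored = [
        (len(query_kmers & kmers(ref.upper(), k)), name)
        for name, ref in references.items()
    ]
    return [name for score, name in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

    In a real aligner the top-ranked references would then feed the alignment stage (partial order alignment in SINA's case); that stage is not sketched here.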

  14. Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes

    PubMed Central

    Tani, Akio; Saisho, Daisuke; Sakamoto, Wataru; Kanematsu, Satoko; Suzuki, Nobuhiro

    2011-01-01

    Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The most widespread NRVSs were related to the coat protein (CP) genes of the family Partitiviridae which have bisegmented dsRNA genomes, and included plant- and fungus-infecting members. The CP of a novel fungal virus (Rosellinia necatrix partitivirus 2, RnPV2) had the greatest sequence similarity to Arabidopsis thaliana ILR2, which is thought to regulate the activities of the phytohormone auxin, indole-3-acetic acid (IAA). Furthermore, partitivirus CP-like sequences much more closely related to plant partitiviruses than to RnPV2 were identified in a wide range of plant species. In addition, the nucleocapsid protein genes of cytorhabdoviruses and varicosaviruses were found in species of over 9 plant families, including Brassicaceae and Solanaceae. A replicase-like sequence of a betaflexivirus was identified in the cucumber genome. The pattern of occurrence of NRVSs and the phylogenetic analyses of NRVSs and related viruses indicate that multiple independent integrations into many plant lineages may have occurred. For example, one of the NRVSs was retained in Ar. thaliana but not in Ar. lyrata or other related Camelina species, whereas another NRVS displayed the reverse pattern. Our study has shown that single- and double-stranded RNA viral sequences are widespread in plant genomes, and shows the potential of genome integrated NRVSs to contribute to resolve unclear phylogenetic relationships of plant species. PMID:21779172

  15. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes.

    PubMed

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-02-01

    Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233

  16. Computational Approaches for the Analysis of ncRNA through Deep Sequencing Techniques

    PubMed Central

    Veneziano, Dario; Nigita, Giovanni; Ferro, Alfredo

    2015-01-01

    The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA, microRNA, and long non-coding RNA, have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of next-generation sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological conditions, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the ncRNA component of human transcriptome sequence data obtained from NGS technologies. PMID:26090362

  17. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes

    PubMed Central

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233

  18. Computational Approaches for the Analysis of ncRNA through Deep Sequencing Techniques.

    PubMed

    Veneziano, Dario; Nigita, Giovanni; Ferro, Alfredo

    2015-01-01

    The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA, microRNA, and long non-coding RNA, have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of next-generation sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological conditions, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the ncRNA component of human transcriptome sequence data obtained from NGS technologies. PMID:26090362

  19. A max-margin model for efficient simultaneous alignment and folding of RNA sequences

    PubMed Central

    Do, Chuong B.; Foo, Chuan-Sheng; Batzoglou, Serafim

    2008-01-01

    Motivation: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAF's fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. Results: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies. Availability: Source code for RAF is available at: http://contra.stanford.edu/contrafold/ Contact: chuongdo@cs.stanford.edu PMID:18586747

  20. hnRNP G: sequence and characterization of a glycosylated RNA-binding protein.

    PubMed Central

    Soulard, M; Della Valle, V; Siomi, M C; Piñol-Roma, S; Codogno, P; Bauvy, C; Bellini, M; Lacroix, J C; Monod, G; Dreyfuss, G

    1993-01-01

    The autoantigen p43 is a nuclear protein initially identified with autoantibodies from dogs with a lupus-like syndrome. Here we show that p43 is an RNA-binding protein, and identify it as hnRNP G, a previously described component of heterogeneous nuclear ribonucleoprotein complexes. We demonstrate that p43/hnRNP G is glycosylated, and identify the modification as O-linked N-acetylglucosamine. A full-length cDNA clone for hnRNP G has been isolated and sequenced, and the predicted amino acid sequence for hnRNP G shows that it contains one RNP-consensus RNA binding domain (RBD) at the amino terminus and a carboxyl domain rich in serines, arginines and glycines. The RBD of human hnRNP G shows striking similarities with the RBDs of several plant RNA-binding proteins. PMID:7692398

  1. Genome wide instability scanning in chewing-tobacco associated oral cancer using inter simple sequence repeat PCR.

    PubMed

    Rai, Rekha; Kulkarni, Viraj; Saranath, Dhananjaya

    2004-11-01

    Genomic instability plays a major role in cancer, facilitating tumour progression and tumour heterogeneity. Inter simple sequence repeat PCR (ISSR-PCR) is a sensitive tool for whole genome scanning. In fifteen oral cancer patients, using tumor tissue and adjacent normal tissue DNA, we investigated genomic instability regions using ISSR-PCR assay. The genomic fragments were cloned, sequenced and identified. Two-anchored dinucleotide repeat primers, (CA)(8)A/GG and (CA)(8)A/GC/T, were used in the study. About 40-50 fragments were observed on polyacrylamide gel electrophoresis, with 25 distinct fragments of less than 2 kb. The electrophoretic pattern highlighted several distinct fragments in tumor adjacent normal tissues. The distinct fragments of 258, 325, 430, 440, 600 and 900 bp sizes using (CA)(8)A/GG primer, and 300, 475, 675 and 800 bp using (CA)(8)A/GC/T primers, in the normal tissues showed partial (>50%) or complete loss in multiple tumor tissues. These fragments were eluted from the gel, cloned in pMos Blue vector and subjected to nucleotide sequencing. In silico analysis defined the specific genomic sequences, given as follows: RP11-399D2 () on chromosome (chr)4; RP1-39J2 (), NKp44RG () and RP11-518I13 () on chr6; NC-T-2 () on chr7; RP11-586K2 () and RP11-495O10 () on chr8; RP11-101K10 () on chr9; R-794A8 () on chr14; and RP11-679B19 () on chr16. The sequences of our clones have been submitted to NCBI GenBank, accession numbers to , and . The Genomic Instability Index was calculated and ranged from 6% to 28.5% (median 12%) in the oral cancer samples, excluding one case where genomic instability was not observed. Thus, our results indicate the presence of widespread genomic alterations in chewing-tobacco associated oral cancers. PMID:15509495

  2. Assessment of translational importance of mammalian mRNA sequence features based on Ribo-Seq and mRNA-Seq data.

    PubMed

    Volkova, Oxana A; Kondrakhin, Yury V; Yevshin, Ivan S; Valeev, Tagir F; Sharipov, Ruslan N

    2016-04-01

    Ribosome profiling (Ribo-Seq) has made it possible to examine mRNA translation in the cell in greater detail and to obtain additional information on the importance of mRNA sequence features for this process. The application of translation inhibitors such as harringtonine and cycloheximide, together with the mRNA-Seq technique, has helped to assess an important characteristic, translation efficiency. We assessed the translational importance of mRNA sequence features through statistical analysis of Ribo-Seq and mRNA-Seq data. Translationally important features known from the literature, as well as features proposed by the authors, were used in the analysis. Comparisons such as protein-coding versus non-coding RNAs and high- versus low-translated mRNAs were performed. We identified a set of features that discriminate between the compared categories of RNA. Significant relationships between mRNA features and translation efficiency were also established. PMID:27122318
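
    Translation efficiency is commonly estimated as the ratio of ribosome footprint abundance to mRNA abundance for each transcript. The abstract does not state the exact estimator used, so the following is only a sketch of that common definition; the function name, the pseudocount and the assumption that both inputs are already normalized (for example, to the same depth and length scale) are illustrative choices.

```python
import math


def translation_efficiency(ribo_abundance, mrna_abundance, pseudocount=1.0):
    """Return log2(Ribo-Seq / mRNA-Seq) per transcript.

    Both arguments map transcript IDs to normalized abundances. Transcripts
    missing from the Ribo-Seq data are treated as having zero footprints.
    The pseudocount avoids division by zero and log of zero.
    """
    efficiency = {}
    for transcript_id, mrna in mrna_abundance.items():
        ribo = ribo_abundance.get(transcript_id, 0.0)
        efficiency[transcript_id] = math.log2((ribo + pseudocount) / (mrna + pseudocount))
    return efficiency
```

    With such values in hand, transcripts can be split into high- and low-translated groups and compared feature by feature, which is the kind of comparison the abstract describes.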

  3. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB

    PubMed Central

    Pruesse, Elmar; Quast, Christian; Knittel, Katrin; Fuchs, Bernhard M.; Ludwig, Wolfgang; Peplies, Jörg; Glöckner, Frank Oliver

    2007-01-01

    Sequencing ribosomal RNA (rRNA) genes is currently the method of choice for phylogenetic reconstruction, nucleic acid based detection and quantification of microbial diversity. The ARB software suite with its corresponding rRNA datasets has been accepted by researchers worldwide as a standard tool for large scale rRNA analysis. However, the rapid increase of publicly available rRNA sequence data has recently hampered the maintenance of comprehensive and curated rRNA knowledge databases. A new system, SILVA (from Latin silva, forest), was implemented to provide a central comprehensive web resource for up to date, quality controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains. All sequences are checked for anomalies, carry a rich set of sequence associated contextual information, have multiple taxonomic classifications, and the latest validly described nomenclature. Furthermore, two precompiled sequence datasets compatible with ARB are offered for download on the SILVA website: (i) the reference (Ref) datasets, comprising only high quality, nearly full length sequences suitable for in-depth phylogenetic analysis and probe design and (ii) the comprehensive Parc datasets with all publicly available rRNA sequences longer than 300 nucleotides suitable for biodiversity analyses. The latest publicly available database release 91 (August 2007) hosts 547 521 sequences split into 461 823 small subunit and 85 689 large subunit rRNAs. PMID:17947321

  4. The sequence of 28S ribosomal RNA varies within and between human cell lines.

    PubMed Central

    Leffers, H; Andersen, A H

    1993-01-01

    The primary structure of 28S ribosomal RNA constitutes a conserved core which is similar among most 23S-like rRNAs and expansion segments which occur at specific positions in the sequence. The expansion segments account for most of the size difference between prokaryotic (archaeal and eubacterial) and eukaryotic rRNAs and they exhibit a sequence variation which is unique among rRNAs. We have investigated the sequence variation of one of the expansion segments, V8, by sequencing a total of 111 V8 segments from 9 different human cell lines and tissues and have found 35 different variants. The variation occurs mainly at two 'hot spots' which are separated by 170 nucleotides in the primary sequence but are neighbours in the secondary structure. The sequence of V8 segments varies both within and between human cell lines and tissues. The implications for the evolution of the eukaryotic 28S rRNA are discussed together with possible functions of the expansion segments. We also present a secondary structure model for the V8 segment based on comparative sequence analysis and chemical and enzymatic footprinting. PMID:8464736

  5. Methods for small RNA preparation for digital gene expression profiling by next-generation sequencing.

    PubMed

    Linsen, Sam E V; Cuppen, Edwin

    2012-01-01

    Digital gene expression (DGE) profiling techniques are playing an eminent role in the detection, localization, and differential expression quantification of many small RNA species, including microRNAs (1-3). Procedures in small RNA library preparation techniques typically include adapter ligation by RNA ligase, followed by reverse transcription and amplification by PCR. This chapter describes three protocols that were successfully applied to generate small RNA sequencing SOLiD(TM) libraries. The Ambion SREK(TM)-adopted protocol can be readily used for multiplexing samples; the modban-based protocol is cost-efficient, but biased toward certain microRNAs; the poly(A)-based protocol is less biased, but less precise because of the A-tail that is introduced. In summary, each of these protocols has its advantages and disadvantages with respect to the ease of including barcodes, costs, and outcome. PMID:22144201

  6. An RNA-based approach to sequence the mitogenome of Hypoptopoma incognitum (Siluriformes: Loricariidae).

    PubMed

    Moreira, Daniel Andrade; Magalhães, Maithê G P; de Andrade, Paula C C; Furtado, Carolina; Val, Adalberto L; Parente, Thiago Estevam

    2016-09-01

    Hypoptopoma incognitum is a fish belonging to the fifth most species-rich family of vertebrates and is abundant in rivers of the Brazilian Amazon. Only two species of Loricariidae fish have their complete mitogenome sequences deposited in GenBank. An innovative RNA-based approach was used to assemble the complete mitogenome of H. incognitum with an average coverage depth of 5292×. The typical vertebrate mitochondrial features were found: 22 tRNA genes, two rRNA genes, 13 protein-coding genes, and a non-coding control region. Moreover, the use of this approach allowed the measurement of mtRNA expression levels, the punctuation pattern of editing, and the detection of heteroplasmies. PMID:26370305

  7. Phylogenetic relationships among Vairimorpha and Nosema species (Microspora) based on ribosomal RNA sequence data.

    PubMed

    Baker, M D; Vossbrinck, C R; Maddox, J V; Undeen, A H

    1994-09-01

    A portion (approximately 350 nucleotides) of the large subunit ribosomal RNA (rRNA) 5' to the 580 region (Escherichia coli numbering) was sequenced using the reverse transcriptase dideoxy method and compared for several species of Nosema and Vairimorpha. Comparison among Nosema species suggests that this genus is composed of several unrelated groups. The group which includes the type species, Nosema bombycis, consists of closely related species found primarily in Lepidoptera. Other Nosema species sequenced (Nosema kingi, Nosema algerae, and Nosema locustae) do not appear to be closely related to each other or to the lepidopteran Nosema group. Comparison among the Vairimorpha species indicates that two distinct but very closely related groups exist. The Lymantria group consists of species isolated from the gypsy moth, Lymantria dispar, while the Vairimorpha necatrix group consists of species isolated from other Lepidoptera. Intergeneric comparison of the sequence data suggests that the lepidopteran Nosema species are closely related to the Vairimorpha species. PMID:7963643

  8. Practicability of detecting somatic point mutation from RNA high throughput sequencing data.

    PubMed

    Sheng, Quanhu; Zhao, Shilin; Li, Chung-I; Shyr, Yu; Guo, Yan

    2016-05-01

    Traditionally, somatic mutations are detected by examining DNA sequence. The maturity of sequencing technology has allowed researchers to screen for somatic mutations in the whole genome. Increasingly, researchers have become interested in identifying somatic mutations through RNAseq data. With this motivation, we evaluated the practicability of detecting somatic mutations from RNAseq data. Current somatic mutation calling tools were designed for DNA sequencing data. To increase performance on RNAseq data, we developed a somatic mutation caller GLMVC based on bias reduced generalized linear model for both DNA and RNA sequencing data. Through comparison with MuTect and Varscan we showed that GLMVC performed better for somatic mutation detection using exome sequencing or RNAseq data. GLMVC is freely available for download at the following website: https://github.com/shengqh/GLMVC/wiki. PMID:27046520

  9. Deep sequencing reveals global patterns of mRNA recruitment during translation initiation

    PubMed Central

    Gao, Rong; Yu, Kai; Nie, Jukui; Lian, Tengfei; Jin, Jianshi; Liljas, Anders; Su, Xiao-Dong

    2016-01-01

    In this work, we developed a method to systematically study the sequence preference of mRNAs during translation initiation. Traditionally, the dynamic process of translation initiation has been studied at the single molecule level with limited sequencing possibility. Using deep sequencing techniques, we identified the sequence preference at different stages of the initiation complexes. Our results provide a comprehensive and dynamic view of the initiation elements in the translation initiation region (TIR), including the S1 binding sequence, the Shine-Dalgarno (SD)/anti-SD interaction and the second codon, at the equilibrium of different initiation complexes. Moreover, our experiments reveal the conformational changes and regional dynamics throughout the dynamic process of mRNA recruitment. PMID:27460773

  10. Phylogenetic tree derived from bacterial, cytosol and organelle 5S rRNA sequences.

    PubMed Central

    Küntzel, H; Heidrich, M; Piechulla, B

    1981-01-01

    A phylogenetic tree was constructed by computer analysis of 47 completely determined 5S rRNA sequences. The wheat mitochondrial sequence is significantly more related to prokaryotic than to eukaryotic sequences, and its affinity to that of the thermophilic Gram-negative bacterium Thermus aquaticus is comparable to the affinity between Anacystis nidulans and chloroplastic sequences. This strongly supports the idea of an endosymbiotic origin of plant mitochondria. A comparison of the plant cytosol and chloroplast sub-trees suggests a similar rate of nucleotide substitution in nuclear genes and chloroplastic genes. Other features of the tree are a common precursor of protozoa and metazoa, which appears to be more related to the fungal than to the plant protosequence, and an early divergence of the archebacterial sequence (Halobacterium cutirubrum) from the prokaryotic branch. PMID:6785727

  11. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi

    PubMed Central

    Chandler, James Angus; Liu, Rachel M.; Bennett, Shannon N.

    2015-01-01

    Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. 
This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host

  12. Profiling status epilepticus-induced changes in hippocampal RNA expression using high-throughput RNA sequencing.

    PubMed

    Hansen, Katelin F; Sakamoto, Kensuke; Pelz, Carl; Impey, Soren; Obrietan, Karl

    2014-01-01

    Status epilepticus (SE) is a life-threatening condition that can give rise to a number of neurological disorders, including learning deficits, depression, and epilepsy. Many of the effects of SE appear to be mediated by alterations in gene expression. To gain deeper insight into how SE affects the transcriptome, we employed the pilocarpine SE model in mice and Illumina-based high-throughput sequencing to characterize alterations in gene expression from the induction of SE, to the development of spontaneous seizure activity. While some genes were upregulated over the entire course of the pathological progression, each of the three sequenced time points (12-hour, 10-days and 6-weeks post-SE) had a largely unique transcriptional profile. Hence, genes that regulate synaptic physiology and transcription were most prominently altered at 12-hours post-SE; at 10-days post-SE, marked changes in metabolic and homeostatic gene expression were detected; at 6-weeks, substantial changes in the expression of cell excitability and morphogenesis genes were detected. At the level of cell signaling, KEGG analysis revealed dynamic changes within the MAPK pathways, as well as in CREB-associated gene expression. Notably, the inducible expression of several noncoding transcripts was also detected. These findings offer potential new insights into the cellular events that shape SE-evoked pathology. PMID:25373493

  13. Rhizobium sp. strain BN4 (a selenium oxyanion-reducing bacterium) 16S rRNA gene complete sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study used 1482 base pair 16S rRNA gene sequence methods in conjunction with other biochemical and morphological studies to confirm the identification of a bacterium (referred to as the BN4 strain) as a Rhizobium sp. The 16S rRNA gene sequence places it with the Rhizobium clade that includes R. d...

  14. AIB1 gene amplification and the instability of polyQ encoding sequence in breast cancer cell lines

    PubMed Central

    Wong, Lee-Jun C; Dai, Pu; Lu, Jyh-Feng; Lou, Mary Ann; Clarke, Robert; Nazarov, Viktor

    2006-01-01

    Background The poly Q polymorphism in the AIB1 (amplified in breast cancer) gene is usually assessed by fragment length analysis, which does not reveal the actual sequence variation. The purpose of this study is to investigate the sequence variation of the poly Q encoding region in breast cancer cell lines at the single molecule level, and to determine if the sequence variation is related to AIB1 gene amplification. Methods The polymorphic poly Q encoding region of the AIB1 gene was investigated at the single molecule level by PCR cloning/sequencing. The amplification of the AIB1 gene in various breast cancer cell lines was studied by real-time quantitative PCR. Results Significant amplifications (5- to 23-fold) of the AIB1 gene were found in 2 out of 9 (22%) ER positive cell lines (in BT-474 and MCF-7 but not in BT-20, ZR-75-1, T47D, BT483, MDA-MB-361, MDA-MB-468 and MDA-MB-330). The AIB1 gene was not amplified in any of the ER negative cell lines. Different passages of MCF-7 cell lines and their derivatives maintained the feature of AIB1 amplification. When the cells were selected for hormone independence (LCC1) and resistance to 4-hydroxy tamoxifen (4-OH TAM) (LCC2 and R27), ICI 182,780 (LCC9) or 4-OH TAM, KEO and LY 117018 (LY-2), AIB1 copy number decreased but still remained highly amplified. Sequencing analysis of the poly Q encoding region of the AIB1 gene did not reveal specific patterns that could be correlated with AIB1 gene amplification. However, about 72% of the breast cancer cell lines had at least one underrepresented (<20%) extra poly Q encoding sequence pattern that was derived from the original allele, presumably due to somatic instability. Although all MCF-7 cells and their variants had the same predominant poly Q encoding sequence pattern of (CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA of the original cell line, a number of altered poly Q encoding sequences were found in the derivatives of MCF-7 cell lines. 
Conclusion These data suggest that poly Q encoding region of AIB1 gene is
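
    The compact repeat notation used in the abstract above, (CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA, can be expanded mechanically to check that it encodes an uninterrupted glutamine tract. The following is only an illustrative sketch and not the authors' analysis; the function names `expand_repeat_notation` and `split_codons` are hypothetical, and the sketch assumes the pattern starts in the reading frame.

```python
import re

# CAG and CAA both encode glutamine (Q) in the standard genetic code.
GLN_CODONS = {"CAG", "CAA"}

# A token is either "(UNIT)count" or a bare run of bases.
_TOKEN = re.compile(r"\(([ACGT]+)\)(\d+)|([ACGT]+)")


def expand_repeat_notation(pattern: str) -> str:
    """Expand compact notation such as "(CAG)3CAA" into "CAGCAGCAGCAA"."""
    parts = []
    pos = 0
    while pos < len(pattern):
        match = _TOKEN.match(pattern, pos)
        if match is None:
            raise ValueError(f"cannot parse pattern at offset {pos}")
        if match.group(1) is not None:
            # Repeated unit: "(UNIT)count"
            parts.append(match.group(1) * int(match.group(2)))
        else:
            # Bare bases written without a repeat count
            parts.append(match.group(3))
        pos = match.end()
    return "".join(parts)


def split_codons(seq: str) -> list:
    """Split a sequence into codons, assuming it starts in frame."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


predominant = expand_repeat_notation("(CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA")
codon_list = split_codons(predominant)
print(len(predominant), len(codon_list))  # 78 nucleotides, 26 codons
print(all(codon in GLN_CODONS for codon in codon_list))  # True
```

    Under these assumptions the predominant MCF-7 pattern expands to 78 nucleotides, that is, 26 consecutive glutamine codons (19 CAG and 7 CAA).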

  15. Rare variant phasing and haplotypic expression from RNA sequencing with phASER.

    PubMed

    Castel, Stephane E; Mohammadi, Pejman; Chung, Wendy K; Shen, Yufeng; Lappalainen, Tuuli

    2016-01-01

    Haplotype phasing of genetic variants is important for clinical interpretation of the genome, population genetic analysis and functional genomic analysis of allelic activity. Here we present phASER, an accurate approach for phasing variants that are overlapped by sequencing reads, including those from RNA sequencing (RNA-seq), which often span multiple exons due to splicing. Using diverse RNA-seq data we demonstrate that this provides more accurate phasing of rare variants compared with population-based phasing and allows phasing of variants in the same gene up to hundreds of kilobases away that cannot be obtained from DNA sequencing (DNA-seq) reads. We show that in the context of medical genetic studies this improves the resolution of compound heterozygotes. Additionally, phASER provides measures of haplotypic expression that increase power and accuracy in studies of allelic expression. In summary, phasing using RNA-seq and phASER is accurate and improves studies where rare variant haplotypes or allelic expression is needed. PMID:27605262

  16. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing.

    PubMed

    Li, Jisheng; Ye, Lupeng; Wang, Shaohua; Che, Jiaqian; You, Zhengying; Zhong, Boxiong

    2014-12-01

    No studies have focused specifically on the microRNAs (miRNAs) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how we generated our deep sequencing dataset, which was recently published in BMC Genomics. Our findings substantially enrich the silkworm miRNA repository and may help to reveal the functions of miRNAs in silk production. PMID:26484119

  17. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing

    PubMed Central

    Li, Jisheng; Ye, Lupeng; Wang, Shaohua; Che, Jiaqian; You, Zhengying; Zhong, Boxiong

    2014-01-01

    No studies have focused specifically on the microRNAs (miRNAs) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how we generated our deep sequencing dataset, which was recently published in BMC Genomics. Our findings substantially enrich the silkworm miRNA repository and may help to reveal the functions of miRNAs in silk production. PMID:26484119

  18. MiRGator v3.0: a microRNA portal for deep sequencing, expression profiling and mRNA targeting.

    PubMed

    Cho, Sooyoung; Jang, Insu; Jun, Yukyung; Yoon, Suhyeon; Ko, Minjeong; Kwon, Yeajee; Choi, Ikjung; Chang, Hyeshik; Ryu, Daeun; Lee, Byungwook; Kim, V Narry; Kim, Wankyu; Lee, Sanghyuk

    2013-01-01

    Biogenesis and molecular function are two key subjects in the field of microRNA (miRNA) research. Deep sequencing has become the principal technique in cataloging of miRNA repertoire and generating expression profiles in an unbiased manner. Here, we describe the miRGator v3.0 update (http://mirgator.kobic.re.kr) that compiled the deep sequencing miRNA data available in public and implemented several novel tools to facilitate exploration of massive data. The miR-seq browser supports users to examine short read alignment with the secondary structure and read count information available in concurrent windows. Features such as sequence editing, sorting, ordering, import and export of user data would be of great utility for studying iso-miRs, miRNA editing and modifications. miRNA-target relation is essential for understanding miRNA function. Coexpression analysis of miRNA and target mRNAs, based on miRNA-seq and RNA-seq data from the same sample, is visualized in the heat-map and network views where users can investigate the inverse correlation of gene expression and target relations, compiled from various databases of predicted and validated targets. By keeping datasets and analytic tools up-to-date, miRGator should continue to serve as an integrated resource for biogenesis and functional investigation of miRNAs. PMID:23193297

  19. Spatially Enhanced Differential RNA Methylation Analysis from Affinity-Based Sequencing Data with Hidden Markov Model

    PubMed Central

    Zhang, Yu-Chen; Zhang, Shao-Wu; Liu, Lian; Liu, Hui; Zhang, Lin; Cui, Xiaodong; Huang, Yufei; Meng, Jia

    2015-01-01

    With the development of new sequencing technology, the entire N6-methyl-adenosine (m6A) RNA methylome can now be profiled in an unbiased manner with the methylated RNA immunoprecipitation sequencing technique (MeRIP-Seq), making it possible to detect differential methylation states of RNA between two conditions, for example, between normal and cancerous tissue. However, as an affinity-based method, MeRIP-Seq has yet to provide base-pair resolution; that is, a single methylation site determined from MeRIP-Seq data can in practice contain multiple RNA methylation residues, some of which can be regulated by different enzymes and thus differentially methylated between two conditions. Since existing peak-based methods cannot effectively differentiate multiple methylation residues located within a single methylation site, we propose a hidden Markov model (HMM) based approach to address this issue. Specifically, the detected RNA methylation site is further divided into multiple adjacent small bins and then scanned at higher resolution using a hidden Markov model to model the dependency between spatially adjacent bins for improved accuracy. We tested the proposed algorithm on both simulated data and real data. The results suggest that the proposed algorithm clearly outperforms the existing peak-based approach on simulated systems and detects differential methylation regions with higher statistical significance on a real dataset. PMID:26301253
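
    The idea of scanning adjacent bins with a hidden Markov model can be illustrated with a generic two-state Viterbi decoder. This is a minimal sketch and not the published algorithm: the state names, the binary per-bin observations (1 = the bin looks differentially methylated, 0 = it does not) and every probability below are assumptions chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for discrete observations."""
    if not observations:
        return []
    # score[s]: best log-probability of any path that ends in state s
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in state s at this bin
            prev = max(states, key=lambda p: score[p] + math.log(trans_p[p][s]))
            new_score[s] = (
                score[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][obs])
            )
            pointers[s] = prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical parameters: neighbouring bins tend to share a state (0.9),
# and each state emits its "own" symbol with probability 0.8.
STATES = ("same", "diff")
START = {"same": 0.5, "diff": 0.5}
TRANS = {"same": {"same": 0.9, "diff": 0.1}, "diff": {"same": 0.1, "diff": 0.9}}
EMIT = {"same": {0: 0.8, 1: 0.2}, "diff": {0: 0.2, 1: 0.8}}

bins = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
print(viterbi(bins, STATES, START, TRANS, EMIT))
```

    With these assumed parameters a run of four consecutive "1" bins is decoded as a differential region, whereas a single isolated "1" bin is smoothed away because the transition probabilities favour agreement between spatially adjacent bins.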

  20. Characterization of PA-N terminal domain of Influenza A polymerase reveals sequence specific RNA cleavage.

    PubMed

    Datta, Kausiki; Wolkerstorfer, Andrea; Szolar, Oliver H J; Cusack, Stephen; Klumpp, Klaus

    2013-09-01

    Influenza virus uses a unique cap-snatching mechanism characterized by hijacking and cleavage of host capped pre-mRNAs, resulting in short capped RNAs, which are used as primers for viral mRNA synthesis. The PA subunit of influenza polymerase carries the endonuclease activity that catalyzes the host mRNA cleavage reaction. Here, we show that PA is a sequence selective endonuclease with distinct preference to cleave at the 3' end of a guanine (G) base in RNA. The G specificity is exhibited by the native influenza polymerase complex associated with viral ribonucleoprotein particles and is conferred by an intrinsic G specificity of the isolated PA endonuclease domain PA-Nter. In addition, RNA cleavage site choice by the full polymerase is also guided by cap binding to the PB2 subunit, from which RNA cleavage preferentially occurs at the 12th nt downstream of the cap. However, if a G residue is present in the region of 10-13 nucleotides from the cap, cleavage preferentially occurs at G. This is the first biochemical evidence of influenza polymerase PA showing intrinsic sequence selective endonuclease activity. PMID:23847103
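
    The cleavage-site preference described above (cleavage at the 12th nucleotide downstream of the cap, unless a G lies 10-13 nucleotides from the cap) can be written as a small rule-based sketch. The function name `predict_cleavage_position`, the 1-based counting from the first nucleotide after the cap, and the tie-breaking rule for several Gs in the window (nearest to position 12) are assumptions; the abstract does not specify them.

```python
def predict_cleavage_position(rna, default_pos=12, window=(10, 13)):
    """Return the 1-based nucleotide after which cleavage is predicted.

    Positions are counted from the first nucleotide downstream of the cap
    (an assumed convention). Cleavage is placed at the 3' side of the
    returned nucleotide.
    """
    rna = rna.upper()
    if len(rna) < default_pos:
        raise ValueError("RNA is shorter than the default cleavage position")
    start, end = window
    # 1-based positions of G residues that fall inside the window
    g_positions = [
        pos
        for pos in range(start, min(end, len(rna)) + 1)
        if rna[pos - 1] == "G"
    ]
    if not g_positions:
        # No G in the window: fall back to the cap-guided default
        return default_pos
    # Assumption: with several candidate Gs, prefer the one nearest to
    # the default position (ties go to the upstream G).
    return min(g_positions, key=lambda pos: (abs(pos - default_pos), pos))


print(predict_cleavage_position("A" * 16))                 # no G in window
print(predict_cleavage_position("A" * 9 + "G" + "A" * 6))  # G at position 10
```

    A sequence with no G in the window yields the default position 12, while a G at position 10 shifts the predicted cleavage to position 10.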

  1. Whole Transcriptome Sequencing Reveals Extensive Unspliced mRNA in Metastatic Castration-Resistant Prostate Cancer

    PubMed Central

    Sowalsky, Adam G.; Xia, Zheng; Wang, Liguo; Zhao, Hao; Chen, Shaoyong; Bubley, Glenn J.; Balk, Steven P.; Li, Wei

    2014-01-01

    Men with metastatic prostate cancer (PCa) who are treated with androgen deprivation therapies (ADT) usually relapse within 2–3 years with disease that is termed castration-resistant prostate cancer (CRPC). To identify the mechanism that drives these advanced tumors, paired-end RNA-sequencing (RNA-seq) was performed on a panel of CRPC bone marrow biopsy specimens. From this genome-wide approach, mutations were found in a series of genes with PCa relevance including: AR, NCOR1, KDM3A, KDM4A, CHD1, SETD5, SETD7, INPP4B, RASGRP3, RASA1, TP53BP1 and CDH1, and a novel SND1:BRAF gene fusion. Amongst the most highly-expressed transcripts were ten non-coding RNAs (ncRNAs), including MALAT1 and PABPC1, which are involved in RNA processing. Notably, a high percentage of sequence reads mapped to introns, which were determined to be the result of incomplete splicing at canonical splice junctions. Using quantitative PCR (qPCR) a series of genes (AR, KLK2, KLK3, STEAP2, CPSF6, and CDK19) were confirmed to have a greater proportion of unspliced RNA in CRPC specimens than in normal prostate epithelium, untreated primary PCa, and cultured PCa cells. This inefficient coupling of transcription and mRNA splicing suggests an overall increase in transcription or defect in splicing. PMID:25189356

  2. Structural Analysis of Single-Point Mutations Given an RNA Sequence: A Case Study with RNAMute

    NASA Astrophysics Data System (ADS)

    Churkin, Alexander; Barash, Danny

    2006-12-01

    We introduce here for the first time the RNAMute package, a pattern-recognition-based utility to perform mutational analysis and detect vulnerable spots within an RNA sequence that affect structure. Mutations in these spots may lead to a structural change that directly relates to a change in functionality. Previously, the concept was tried on RNA genetic control elements called "riboswitches" and other known RNA switches, without an organized utility that analyzes all single-point mutations and can be further expanded. The RNAMute package allows a comprehensive categorization, given an RNA sequence that has functional relevance, by exploring the patterns of all single-point mutants. For illustration, we apply the RNAMute package on an RNA transcript for which individual point mutations were shown experimentally to inactivate spectinomycin resistance in Escherichia coli. Functional analysis of mutations on this case study was performed experimentally by creating a library of point mutations using PCR and screening to locate those mutations. With the availability of RNAMute, preanalysis can be performed computationally before conducting an experiment.
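
    The starting point of such a mutational analysis, enumerating every single-point mutant of an RNA sequence, can be sketched as follows. This is not the RNAMute code itself: the function name `single_point_mutants` is hypothetical, and the structure prediction and pattern-recognition steps that would follow are deliberately left out.

```python
from typing import Iterator, Tuple

RNA_BASES = "ACGU"


def single_point_mutants(seq: str) -> Iterator[Tuple[str, str]]:
    """Yield (label, mutant) for every single-point substitution of seq.

    Labels use the wild-type base, the 1-based position and the mutant
    base, for example "G3A".
    """
    seq = seq.upper()
    for i, wild in enumerate(seq):
        if wild not in RNA_BASES:
            raise ValueError(f"unexpected base {wild!r} at position {i + 1}")
        for base in RNA_BASES:
            if base != wild:
                # Replace only position i, leaving the rest untouched
                yield f"{wild}{i + 1}{base}", seq[:i] + base + seq[i + 1:]


mutants = list(single_point_mutants("GCA"))
print(len(mutants))  # three substitutions per position
print(mutants[0])
```

    A sequence of length n gives 3n mutants, each of which could then be passed to a secondary-structure predictor and compared against the wild-type structure.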

  3. Annotation of primate miRNAs by high throughput sequencing of small RNA libraries

    PubMed Central

    2012-01-01

    Background In addition to genome sequencing, accurate functional annotation of genomes is required in order to carry out comparative and evolutionary analyses between species. Among primates, the human genome is the most extensively annotated. Human miRNA gene annotation is based on multiple lines of evidence including evidence for expression as well as prediction of the characteristic hairpin structure. In contrast, most miRNA genes in non-human primates are annotated based on homology without any expression evidence. We have sequenced small-RNA libraries from chimpanzee, gorilla, orangutan and rhesus macaque from multiple individuals and tissues. Using patterns of miRNA expression in conjunction with a model of miRNA biogenesis we used these high-throughput sequencing data to identify novel miRNAs in non-human primates. Results We predicted 47 new miRNAs in chimpanzee, 240 in gorilla, 55 in orangutan and 47 in rhesus macaque. The algorithm we used was able to predict 64% of the previously known miRNAs in chimpanzee, 94% in gorilla, 61% in orangutan and 71% in rhesus macaque. We therefore added evidence for expression in between one and five tissues to miRNAs that were previously annotated based only on homology to human miRNAs. We increased from 60 to 175 the number of miRNAs that are located in orthologous regions in humans and the four non-human primate species studied here. Conclusions In this study we provide expression evidence for homology-based annotated miRNAs and predict de novo miRNAs in four non-human primate species. We increased the number of annotated miRNA genes and provided evidence for their expression in four non-human primates. Similar approaches using different individuals and tissues would improve annotation in non-human primates and allow for further comparative studies in the future. PMID:22453055

  4. Enhancing potency of siRNA targeting fusion genes by optimization outside of target sequence.

    PubMed

    Gavrilov, Kseniya; Seo, Young-Eun; Tietjen, Gregory T; Cui, Jiajia; Cheng, Christopher J; Saltzman, W Mark

    2015-12-01

    Canonical siRNA design algorithms have become remarkably effective at predicting favorable binding regions within a target mRNA, but in some cases (e.g., a fusion junction site) region choice is restricted. In these instances, alternative approaches are necessary to obtain a highly potent silencing molecule. Here we focus on strategies for rational optimization of two siRNAs that target the junction sites of fusion oncogenes BCR-ABL and TMPRSS2-ERG. We demonstrate that modifying the termini of these siRNAs with a terminal G-U wobble pair or a carefully selected pair of terminal asymmetry-enhancing mismatches can result in an increase in potency at low doses. Importantly, we observed that improvements in silencing at the mRNA level do not necessarily translate to reductions in protein level and/or cell death. Decline in protein level is also heavily influenced by targeted protein half-life, and delivery vehicle toxicity can confound measures of cell death due to silencing. Therefore, for BCR-ABL, which has a long protein half-life that is difficult to overcome using siRNA, we also developed a nontoxic transfection vector: poly(lactic-coglycolic acid) nanoparticles that release siRNA over many days. We show that this system can achieve effective killing of leukemic cells. These findings provide insights into the implications of siRNA sequence for potency and suggest strategies for the design of more effective therapeutic siRNA molecules. Furthermore, this work points to the importance of integrating studies of siRNA design and delivery, while heeding and addressing potential limitations such as restricted targetable mRNA regions, long protein half-lives, and nonspecific toxicities. PMID:26627251

  5. Enhancing potency of siRNA targeting fusion genes by optimization outside of target sequence

    PubMed Central

    Gavrilov, Kseniya; Seo, Young-Eun; Tietjen, Gregory T.; Cui, Jiajia; Cheng, Christopher J.; Saltzman, W. Mark

    2015-01-01

    Canonical siRNA design algorithms have become remarkably effective at predicting favorable binding regions within a target mRNA, but in some cases (e.g., a fusion junction site) region choice is restricted. In these instances, alternative approaches are necessary to obtain a highly potent silencing molecule. Here we focus on strategies for rational optimization of two siRNAs that target the junction sites of fusion oncogenes BCR-ABL and TMPRSS2-ERG. We demonstrate that modifying the termini of these siRNAs with a terminal G-U wobble pair or a carefully selected pair of terminal asymmetry-enhancing mismatches can result in an increase in potency at low doses. Importantly, we observed that improvements in silencing at the mRNA level do not necessarily translate to reductions in protein level and/or cell death. Decline in protein level is also heavily influenced by targeted protein half-life, and delivery vehicle toxicity can confound measures of cell death due to silencing. Therefore, for BCR-ABL, which has a long protein half-life that is difficult to overcome using siRNA, we also developed a nontoxic transfection vector: poly(lactic-coglycolic acid) nanoparticles that release siRNA over many days. We show that this system can achieve effective killing of leukemic cells. These findings provide insights into the implications of siRNA sequence for potency and suggest strategies for the design of more effective therapeutic siRNA molecules. Furthermore, this work points to the importance of integrating studies of siRNA design and delivery, while heeding and addressing potential limitations such as restricted targetable mRNA regions, long protein half-lives, and nonspecific toxicities. PMID:26627251

  6. New Hosts of Simplicimonas similis and Trichomitus batrachorum Identified by 18S Ribosomal RNA Gene Sequences

    PubMed Central

    Dimasuay, Kris Genelyn B.; Lavilla, Orlie John Y.; Rivera, Windell L.

    2013-01-01

    Trichomonads are obligate anaerobes generally found in the digestive and genitourinary tract of domestic animals. In this study, four trichomonad isolates were obtained from carabao, dog, and pig hosts using rectal swabs. Genomic DNA was extracted using the Chelex method, and the 18S rRNA gene was successfully amplified using novel sets of primers and subjected to DNA sequencing. Aligned isolate sequences together with retrieved 18S rRNA gene sequences of known trichomonads were utilized to generate phylogenetic trees using maximum likelihood and neighbor-joining analyses. Two isolates from carabao were identified as Simplicimonas similis while each isolate from dog and pig was identified as Pentatrichomonas hominis and Trichomitus batrachorum, respectively. This is the first report of S. similis in carabao and the identification of T. batrachorum in pig using 18S rRNA gene sequence analysis. The generated phylogenetic tree yielded three distinct groups mostly with relatively moderate to high bootstrap support and in agreement with the most recent classification. Pathogenic potential of the trichomonads in these hosts still needs further investigation. PMID:23936631

  7. Computational identification of riboswitches based on RNA conserved functional sequences and conformations.

    PubMed

    Chang, Tzu-Hao; Huang, Hsien-Da; Wu, Li-Ching; Yeh, Chi-Ta; Liu, Baw-Jhiune; Horng, Jorng-Tzong

    2009-07-01

    Riboswitches are cis-acting genetic regulatory elements within a specific mRNA that can regulate both transcription and translation by interacting with their corresponding metabolites. Recently, an increasing number of riboswitches have been identified in different species and investigated for their roles in regulatory functions. Both the sequence contexts and structural conformations are important characteristics of riboswitches. None of the previously developed tools, such as covariance models (CMs), Riboswitch finder, and RibEx, provide a web server for efficiently searching homologous instances of known riboswitches or consider two crucial characteristics of each riboswitch, namely the structural conformations and sequence contexts of functional regions. Therefore, we developed a systematic method for identifying 12 kinds of riboswitches. The method is implemented and provided as a web server, RiboSW, to efficiently and conveniently identify riboswitches within messenger RNA sequences. The predictive accuracy of the proposed method is comparable with that of previous tools. The efficiency of the proposed method for identifying riboswitches was improved in order to achieve a reasonable computational time for the prediction, which makes it possible to have an accurate and convenient web server for biologists to obtain the results of their analysis of a given mRNA sequence. RiboSW is now available on the web at http://RiboSW.mbc.nctu.edu.tw/. PMID:19460868

  8. Deep RNA sequencing improved the structural annotation of the Tuber melanosporum transcriptome.

    PubMed

    Tisserant, E; Da Silva, C; Kohler, A; Morin, E; Wincker, P; Martin, F

    2011-02-01

    • The functional complexity of the Tuber melanosporum transcriptome has not yet been fully elucidated. Here, we applied high-throughput Illumina RNA-sequencing (RNA-Seq) to the transcriptome of T. melanosporum at different major developmental stages, that is free-living mycelium, fruiting body and ectomycorrhiza. • Sequencing of cDNA libraries generated a total of c. 24 million sequence reads representing > 882 Mb of sequence data. To construct a coverage signal profile across the genome, all reads were then aligned to the reference genome assembly of T. melanosporum Mel28. • We were able to identify a substantial number of novel transcripts, antisense transcripts, new exons, untranslated regions (UTRs), alternative upstream initiation codons and upstream open reading frames. • This RNA-Seq analysis allowed us to improve the genome annotation. It also provided us with a genome-wide view of the transcriptional and post-transcriptional mechanisms generating an increased number of transcript isoforms during major developmental transitions in T. melanosporum. PMID:21223284

  9. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data.

    PubMed

    Abe, Takashi; Inokuchi, Hachiro; Yamada, Yuko; Muto, Akira; Iwasaki, Yuki; Ikemura, Toshimichi

    2014-01-01

    The expert-curated tRNA gene database "tRNADB-CE" (http://trna.ie.niigata-u.ac.jp) was constructed by analyzing 1,966 complete and 5,272 draft prokaryotic genomes, 171 viral, 121 chloroplast, and 12 eukaryotic genomes, plus fragment sequences obtained by metagenome studies of environmental samples. In total, 595,115 tRNA genes, twice the number compiled previously, have been registered; for each, the sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many tRNAs of unknown species origin from metagenomic sequences have sequences identical to those found in known prokaryotic species, the identical sequence group (ISG) can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to the huge number of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. We also discuss how a batch-learning self-organizing map with oligonucleotide composition is useful for efficient knowledge discovery from big sequence data. PMID:24822057
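
    Grouping tRNA genes that share an identical sequence, the basis of the identical sequence group (ISG) idea above, amounts to keying records by their sequence. The sketch below is illustrative only and does not reflect how tRNADB-CE is implemented; the function name `identical_sequence_groups` and the example records are hypothetical.

```python
from collections import defaultdict


def identical_sequence_groups(records):
    """Group (name, sequence) pairs by exact sequence identity.

    Returns a dict mapping each sequence shared by at least two records
    to the list of record names carrying it, in input order.
    """
    groups = defaultdict(list)
    for name, seq in records:
        # Case-insensitive exact match; no alignment or mismatch tolerance
        groups[seq.upper()].append(name)
    return {seq: names for seq, names in groups.items() if len(names) > 1}


# Hypothetical toy records (not real tRNA sequences)
records = [
    ("species_A_tRNA1", "GGGCCCAUAGC"),
    ("metagenome_read_7", "gggcccauagc"),
    ("species_B_tRNA4", "GGGCCCAUAGG"),
]
print(identical_sequence_groups(records))
```

    In this toy input the metagenomic read falls into the same group as the tRNA from species A, which is the kind of link that would let an ISG act as a phylogenetic marker for reads of unknown origin.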

  10. Mutation Detection in an Antibody-Producing Chinese Hamster Ovary Cell Line by Targeted RNA Sequencing

    PubMed Central

    Zhang, Siyan; Hughes, Jason D.; Murgolo, Nicholas; Levitan, Diane; Chen, Janice; Liu, Zhong

    2016-01-01

    Chinese hamster ovary (CHO) cells have been used widely in the pharmaceutical industry for production of biological therapeutics including monoclonal antibodies (mAb). The integrity of the gene of interest and the accuracy of the relay of genetic information impact product quality and patient safety. Here we employed next-generation sequencing, particularly RNA-seq, and developed a method to systematically analyze the mutation rate of the mRNA of CHO cell lines producing a mAb. The effect of an extended culturing period to mimic the scale of cell expansion in a manufacturing process and varying selection pressure in the cell culture were also closely examined. PMID:27088091

  11. Phylogenetic analysis of complete rRNA gene sequence of Nosema philosamiae isolated from the lepidopteran Philosamia cynthia ricini.

    PubMed

    Zhu, Feng; Shen, Zhongyuan; Xu, Xiaofang; Tao, Hengping; Dong, Shinan; Tang, Xudong; Xu, Li

    2010-01-01

    The microsporidian Nosema philosamiae is a pathogen that infects the eri-silkworm Philosamia cynthia ricini. The complete sequence of the rRNA gene (4,314 bp) was obtained by polymerase chain reaction amplification with specific primers and sequencing. The sequence analysis showed that the organization of the rRNA gene of N. philosamiae was similar to that of Nosema bombycis. Phylogenetic analysis of rRNA gene sequences revealed that N. philosamiae had a close relationship with other Nosema species, confirming that N. philosamiae is correctly assigned to the genus Nosema. PMID:20384905

  12. Evidence for multiple sequences and factors involved in c-myc RNA stability during amphibian oogenesis.

    PubMed

    Lefresne, J; Lemaitre, J M; Selo, M; Goussard, J; Mouton, C; Andeol, Y

    2001-04-01

    To investigate the molecular mechanisms regulating c-myc RNA stability during late amphibian oogenesis, a heterologous system was used in which synthetic Xenopus laevis c-myc transcripts, progressively deleted from their 3' end, were injected into the cytoplasm of two different host axolotl (Ambystoma mexicanum) cells: stage VI oocytes and progesterone-matured oocytes (unfertilized eggs; UFE). This in vivo strategy allowed the behavior of the exogenous c-myc transcripts to be followed and different regions involved in the stability of each intermediate deleted molecule to be identified. Interestingly, these specific regions differ in the two cellular contexts. In oocytes, two stabilizing regions are located in the 3' untranslated region (UTR) and two in the coding sequence (exons II and III) of the RNA. In UFE, the stabilizing regions correspond to the first part of the 3' UTR and to the first part of exon II. However, in UFE, the majority of synthetic transcripts are degraded. This degradation is a consequence of nuclear factors delivered after germinal vesicle breakdown and specifically acting on targeted regions of the RNA. To test the direct implication of these nuclear factors in c-myc RNA degradation, an in vitro system was set up using axolotl germinal vesicle extracts that mimic the in vivo results and confirm the existence of specific destabilizing factors. In vitro analysis revealed that two populations of nuclear molecules are implicated: one of 4.4-5S (50-65 kDa) and the second of 5.4-6S (90-110 kDa). These degrading nuclear factors act preferentially on the coding region of the c-myc RNA and appear to be conserved between axolotl and Xenopus. Thus, this experimental approach has allowed the identification of specific stabilizing sequences in c-myc RNA and the temporal identification of the different factors (cytoplasmic and/or nuclear) involved in post-transcriptional regulation of this RNA during oogenesis. PMID:11284969

  13. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  14. Simultaneous alignment and folding of 28S rRNA sequences uncovers phylogenetic signal in structure variation.

    PubMed

    Letsch, Harald O; Greve, Carola; Kück, Patrick; Fleck, Günther; Stocsits, Roman R; Misof, Bernhard

    2009-12-01

    Secondary structure models of mitochondrial and nuclear (r)RNA sequences are frequently applied to aid the alignment of these molecules in phylogenetic analyses. Additionally, it is often speculated that structure variation of (r)RNA sequences might profitably be used as phylogenetic markers. The benefit of these approaches depends on the reliability of structure models. We used a recently developed approach to show that reliable inference of large (r)RNA secondary structures as a prerequisite of simultaneous sequence and structure alignment is feasible. The approach iteratively establishes local structure constraints of each sequence and infers fully folded individual structures by constrained MFE optimization. A comparison of structure edit distances of individual constraints and fully folded structures showed pronounced phylogenetic signal in fully folded structures. As model sequences we characterized secondary structures of 28S rRNA sequences of selected insects and examined their phylogenetic signal according to established phylogenetic hypotheses. PMID:19654047

  15. MicroRNA transcriptome profiling of mice brains infected with Japanese encephalitis virus by RNA sequencing.

    PubMed

    Li, Xin-Feng; Cao, Rui-Bing; Luo, Jun; Fan, Jian-Ming; Wang, Jing-Man; Zhang, Yuan-Peng; Gu, Jin-Yan; Feng, Xiu-Li; Zhou, Bin; Chen, Pu-Yan

    2016-04-01

    Japanese encephalitis (JE) is a mosquito borne viral disease, caused by Japanese encephalitis virus (JEV) infection producing severe neuroinflammation in the central nervous system (CNS) with the associated disruption of the blood brain barrier. MicroRNAs (miRNAs) are a family of 21-24 nt small non-coding RNAs that play important post-transcriptional regulatory roles in gene expression and have critical roles in virus pathogenesis. We examined the potential roles of miRNAs in JEV-infected suckling mice brains and found that JEV infection changed miRNA expression profiles when the suckling mice began showing nervous symptoms. A total of 1062 known and 71 novel miRNAs were detected in JEV-infected group, accompanied with 1088 known and 75 novel miRNAs in mock controls. Among these miRNAs, one novel and 25 known miRNAs were significantly differentially expressed, including 18 up-regulated and 8 down-regulated miRNAs which were further confirmed by real-time PCR. Gene ontology (GO) and signaling pathway analysis of the predicted target mRNAs of the modulated miRNAs showed that they are correlated with the regulation of apoptosis, neuron differentiation, antiviral immunity and infiltration of mouse brain, and the validated targets of 12 differentially expressed miRNAs were enriched for the regulation of cell programmed death, proliferation, transcription, muscle organ development, erythrocyte differentiation, gene expression, plasma membrane and protein domain specific binding. KEGG analysis further reveals that the validated target genes were involved in the Pathways in cancer, Neurotrophin signaling pathway, Toll like receptor signaling pathway, Endometrial cancer and Jak-STAT signaling pathway. We constructed the interaction networks of miRNAs and their target genes according to GO terms and KEGG pathways and the expression levels of several target genes were examined. Our data provides a valuable basis for further studies on the regulatory roles of miRNAs in JE

  16. Determining RNA quality for NextGen sequencing: some exceptions to the gold standard rule of 23S to 16S rRNA ratio

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Using next-generation-sequencing technology to assess entire transcriptomes requires high quality starting RNA. Currently, RNA quality is routinely judged using automated microfluidic gel electrophoresis platforms and associated algorithms. Here we report that such automated methods generate false-n...

  17. Direct assessment of transcription fidelity by high-resolution RNA sequencing

    PubMed Central

    Imashimizu, Masahiko; Oshima, Taku; Lubkowska, Lucyna; Kashlev, Mikhail

    2013-01-01

    Cancerous and aging cells have long been thought to be impacted by transcription errors that cause genetic and epigenetic changes. Until now, a lack of methodology for directly assessing such errors hindered evaluation of their impact to the cells. We report a high-resolution Illumina RNA-seq method that can assess noncoded base substitutions in mRNA at 10⁻⁴–10⁻⁵ per base frequencies in vitro and in vivo. Statistically reliable detection of changes in transcription fidelity through ∼10³ nt DNA sites assures that the RNA-seq can analyze the fidelity in a large number of the sites where errors occur. A combination of the RNA-seq and biochemical analyses of the positions for the errors revealed two sequence-specific mechanisms that increase transcription fidelity by Escherichia coli RNA polymerase: (i) enhanced suppression of nucleotide misincorporation that improves selectivity for the cognate substrate, and (ii) increased backtracking of the RNA polymerase that decreases a chance of error propagation to the full-length transcript after misincorporation and provides an opportunity to proofread the error. This method is adoptable to a genome-wide assessment of transcription fidelity. PMID:23925128
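
    To make the scale of the per-base error frequencies in this abstract concrete, the sketch below computes a substitution frequency from read counts and the read depth needed to expect a given number of error events at one site. It is simple arithmetic, not the authors' pipeline; the function names and the default of 10 expected events are assumptions.

```python
import math


def substitution_frequency(mismatches, depth):
    """Fraction of reads at one position that carry a non-templated base."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return mismatches / depth


def min_depth(frequency, min_events=10):
    """Reads needed at one site to expect `min_events` errors at `frequency`.

    round() removes floating-point noise before the ceiling is taken.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return math.ceil(round(min_events / frequency, 6))


freq = substitution_frequency(3, 100_000)   # 3 mismatching reads in 100,000
depth_needed = min_depth(1e-5)              # depth for 10 expected events
```

    Under these assumptions, resolving errors that occur once per 100,000 bases requires on the order of a million reads covering each site, which is why very deep sequencing is needed for this kind of measurement.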

  18. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis

    PubMed Central

    Spangler, Jacob B.; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates, while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377

  19. Joint modeling of RNase footprint sequencing profiles for genome-wide inference of RNA structure.

    PubMed

    Zou, Chenchen; Ouyang, Zhengqing

    2015-10-30

    Recent studies have revealed significant roles of RNA structure in almost every step of RNA processing, including transcription, splicing, transport and translation. RNase footprint sequencing (RNase-seq) has emerged to dissect RNA structures at the genome scale. However, it remains challenging to analyze RNase-seq data because of the issues of signal sparsity, variability and correlations among various RNases. We present a probabilistic framework, joint Poisson-gamma mixture (JPGM), for integrative modeling of multiple RNase-seq profiles. Combining JPGM with hidden Markov model allows genome-wide inference of RNA structures. We apply the joint modeling approach for inferring base pairing states on simulated data sets and RNase-seq profiles of the double-strand specific RNase V1 and single-strand specific RNase S1 in yeast. We demonstrate that joint analysis of V1 and S1 profiles outputs interpretable RNA structure states, while approaches that analyze each profile separately do not. The joint modeling approach predicts the structure states of all nucleotides in 3196 transcripts of yeast without compromising accuracy, while the simple thresholding approach misses 43% of the nucleotides. Furthermore, the posterior probabilities outputted by our model are able to resolve the structural ambiguity of ≈300 000 nucleotides with overlapping V1 and S1 cleavage sites. Our model also generates RNA accessibilities, which are associated with three-dimensional conformations. PMID:26400167
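
    The abstract contrasts its joint model with a "simple thresholding approach" that leaves many nucleotides uncalled. The sketch below shows one plausible form of such a baseline, assuming a log-ratio of V1 (double-strand) to S1 (single-strand) cleavage counts with a symmetric cutoff. The cutoff, the pseudocount, and the function name are assumptions; this is not the JPGM method or the exact baseline used in the paper.

```python
import math


def threshold_calls(v1_counts, s1_counts, cutoff=1.0, pseudocount=1.0):
    """Call each nucleotide from the log2 ratio of V1 to S1 cleavage counts.

    score >= cutoff   -> "paired"        (V1 signal dominates)
    score <= -cutoff  -> "unpaired"      (S1 signal dominates)
    otherwise         -> "undetermined"  (sparse or conflicting signal)
    """
    calls = []
    for v, s in zip(v1_counts, s1_counts):
        score = math.log2((v + pseudocount) / (s + pseudocount))
        if score >= cutoff:
            calls.append("paired")
        elif score <= -cutoff:
            calls.append("unpaired")
        else:
            calls.append("undetermined")
    return calls


# Toy counts: strong V1, strong S1, overlapping cleavage, and no signal.
calls = threshold_calls([9, 0, 3, 0], [0, 7, 3, 0])
missed = calls.count("undetermined") / len(calls)
```

    Positions with overlapping V1 and S1 cleavage or with no reads fall into the undetermined class, which is the kind of gap a joint probabilistic model is meant to fill.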

  20. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium]

    NASA Technical Reports Server (NTRS)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four helix secondary structures as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S-RNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  1. Herpes simplex virus virion stimulatory protein mRNA leader contains sequence elements which increase both virus-induced transcription and mRNA stability.

    PubMed

    Blair, E D; Blair, C C; Wagner, E K

    1987-08-01

    To investigate the role of 5' noncoding leader sequence of herpes simplex virus type 1 (HSV-1) mRNA in infected cells, the promoter for the 65,000-dalton virion stimulatory protein (VSP), a beta-gamma polypeptide, was introduced into plasmids bearing the chloramphenicol acetyltransferase (cat) gene together with various lengths of adjacent viral leader sequences. Plasmids containing longer lengths of leader sequence gave rise to significantly higher levels of CAT enzyme in transfected cells superinfected with HSV-1. RNase T2 protection assays of CAT mRNA showed that transcription was initiated from an authentic viral cap site in all VSP-CAT constructs and that CAT mRNA levels corresponded to CAT enzyme levels. Use of cis-linked simian virus 40 enhancer sequences demonstrated that the effect was virus specific. Constructs containing 12 and 48 base pairs of the VSP mRNA leader gave HSV infection-induced CAT activities intermediate between those of the leaderless construct and the VSP-(+77)-CAT construct. Actinomycin D chase experiments demonstrated that the longest leader sequences increased hybrid CAT mRNA stability at least twofold in infected cells. Cotransfection experiments with a cosmid bearing four virus-specified transcription factors (ICP4, ICP0, ICP27, and VSP-65K) showed that sequences from -3 to +77, with respect to the viral mRNA cap site, also contained signals responsive to transcriptional activation. PMID:3037112

  2. LocARNAscan: Incorporating thermodynamic stability in sequence and structure-based RNA homology search

    PubMed Central

    2013-01-01

    Background The search for distant homologs has become an important issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAs or microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to-date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? Results Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA’s algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query. We observe, furthermore, that the sensitivity is bounded in

  3. The complete nucleotide sequence of Pepper mottle virus-Florida RNA.

    PubMed

    Warren, C E; Murphy, J F

    2003-01-01

    The Pepper mottle virus-Florida (PepMoV-FL) RNA genome was cloned and sequenced, and shown to consist of 9,717 nucleotides (nt) excluding the poly (A) tail. A single open reading frame was identified beginning at nucleotide position 169 encoding a polyprotein of 3068 amino acids. Phylogenetic sequence analysis revealed that of 44 full-length viral RNA genomes analyzed within the family Potyviridae, PepMoV-FL was most closely related to PepMoV-California (PepMoV-CA), Potato virus Y-H (PVY-H), PVY-N, PVY(o) and Potato virus V-DV42 (PVV-DV42). Using the PepMoV-FL sequence as a basis for comparison, the overall nucleotide sequence identity was highest between PepMoV-FL and PepMoV-CA at 93%, while the relationship was more distant with PVV-DV42 at 64% and for the PVY strains at 61%. A unique direct repeat sequence of 76 nucleotides was identified in the PepMoV-FL 3'-untranslated region (UTR), and this repeat sequence was confirmed not to occur in the PepMoV-CA sequence. Since the Florida isolate was among the first of the PepMoV isolates described, extensive biological and serological data on this isolate are available, and it has now been cloned and sequenced, we recommend that PepMoV-FL be recognized as the PepMoV type strain. PMID:12536304

  4. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively low attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we have developed an algorithm and software program GeneTack for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programed frameshifts (related to recoding events). PMID:23161689

  5. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    SciTech Connect

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  6. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies

    NASA Technical Reports Server (NTRS)

    Rossler, D.; Ludwig, W.; Schleifer, K. H.; Lin, C.; McGill, T. J.; Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives, at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa, and a "B. cycloheptanicus branch".

  7. Phylogenetic analysis of the genera Thiobacillus and Thiomicrospira by 5S rRNA sequences.

    PubMed Central

    Lane, D J; Stahl, D A; Olsen, G J; Heller, D J; Pace, N R

    1985-01-01

    5S rRNA nucleotide sequences from Thiobacillus neapolitanus, Thiobacillus ferrooxidans, Thiobacillus thiooxidans, Thiobacillus intermedius, Thiobacillus perometabolis, Thiobacillus thioparus, Thiobacillus versutus, Thiobacillus novellus, Thiobacillus acidophilus, Thiomicrospira pelophila, Thiomicrospira sp. strain L-12, and Acidiphilium cryptum were determined. A phylogenetic tree, based upon comparison of these and other related 5S rRNA sequences, is presented. The results place the thiobacilli, Thiomicrospira spp., and Acidiphilium spp. in the "purple photosynthetic" bacterial grouping which also includes the enteric, vibrio, pseudomonad, and other familiar eubacterial groups in addition to the purple photosynthetic bacteria. The genus Thiobacillus is not an evolutionarily coherent grouping but rather spans the full breadth of the purple photosynthetic bacteria. PMID:3924899

  8. Reconstruction and applications of consensus yeast metabolic network based on RNA sequencing.

    PubMed

    Zhao, Yuqi; Wang, Yanjie; Zou, Lei; Huang, Jingfei

    2016-04-01

    One practical application of genome-scale metabolic reconstructions is to interrogate multispecies relationships. Here, we report a consensus metabolic model in four yeast species (Saccharomyces cerevisiae, S. paradoxus, S. mikatae, and S. bayanus) by integrating metabolic network simulations with RNA sequencing (RNA-seq) datasets. We generated high-resolution transcriptome maps of four yeast species through de novo assembly and genome-guided approaches. The transcriptomes were annotated and applied to build the consensus metabolic network, which was verified using independent RNA-seq experiments. The expression profiles reveal that the genes involved in amino acid and lipid metabolism are highly coexpressed. The diverse phenotypic characteristics, such as cellular growth and gene deletions, can be simulated using the metabolic model. We also explored the applications of the consensus model in metabolic engineering using yeast-specific reactions and biofuel production as examples. Similar strategies will benefit communities studying genome-scale metabolic networks of other organisms. PMID:27239440

  9. Long Noncoding RNA and mRNA Expression Profiles in the Thyroid Gland of Two Phenotypically Extreme Pig Breeds Using Ribo-Zero RNA Sequencing.

    PubMed

    Shen, Yifei; Mao, Haiguang; Huang, Minjie; Chen, Lixing; Chen, Jiucheng; Cai, Zhaowei; Wang, Ying; Xu, Ningying

    2016-01-01

    The thyroid gland is an important endocrine organ modulating development, growth, and metabolism, mainly by controlling the synthesis and secretion of thyroid hormones (THs). However, little is known about the pig thyroid transcriptome. Long non-coding RNAs (lncRNAs) regulate gene expression and play critical roles in many cellular processes. Yorkshire pigs have a higher growth rate but lower fat deposition than Jinhua pigs, and thus these breeds are ideal models for studying growth and lipid metabolism. This study revealed higher levels of THs in the serum of Yorkshire pigs than in the serum of Jinhua pigs. By using Ribo-zero RNA sequencing, which can capture both polyA and non-polyA transcripts, the thyroid transcriptomes of both breeds were analyzed and 22,435 known mRNAs were found to be expressed in the pig thyroid. In addition, 1189 novel mRNAs and 1018 candidate lncRNA transcripts were detected. Multiple TH-synthesis-related genes were identified among the 455 differentially-expressed known mRNAs, 37 novel mRNAs, and 52 lncRNA transcripts. Bioinformatics analysis revealed that differentially-expressed genes were enriched in the microtubule-based process, which contributes to TH secretion. Moreover, integrating analysis predicted 13 potential lncRNA-mRNA gene pairs. These data expand the repertoire of porcine lncRNAs and mRNAs and contribute to understanding the possible molecular mechanisms involved in animal growth and lipid metabolism. PMID:27409639

  10. Structure and Genome Organization of Cherry Virus A (Capillovirus, Betaflexiviridae) from China Using Small RNA Sequencing

    PubMed Central

    Wang, Jiawei; Zhai, Ying; Liu, Weizhen; Dhingra, Amit

    2016-01-01

    Cherry virus A (CVA) (Capillovirus, Betaflexiviridae) is widely present in cherry-growing areas. We obtained the complete genome of a CVA isolate (CVA-TA) using small RNA deep sequencing, followed by overlapping reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE). The newly identified 5′-untranslated region (5′-UTR) from CVA-TA may form additional hairpin and loop structures to stabilize the CVA genome. PMID:27174277

  11. Metatranscriptome of marine bacterioplankton during winter time in the North Sea assessed by total RNA sequencing.

    PubMed

    Kopf, Anna; Kostadinov, Ivaylo; Wichels, Antje; Quast, Christian; Glöckner, Frank Oliver

    2015-02-01

    Marine metatranscriptome data was generated as part of a study investigating the bacterioplankton communities towards the end of a diatom-dominated spring phytoplankton bloom. This genomic resource article reports a metatranscriptomic dataset from amidst the winter time prior to the occurrence of the spring diatom bloom. Up to 58% of all sequences could be assigned to predicted genes. Taxonomic analysis based on expressed 16S ribosomal RNA genes identified Alphaproteobacteria and Gammaproteobacteria as the most active community members. PMID:25479944

  12. Characterization of the genus Bifidobacterium by automated ribotyping and 16S rRNA gene sequences.

    PubMed

    Sakata, Shinji; Ryu, Chun Sun; Kitahara, Maki; Sakamoto, Mitsuo; Hayashi, Hidenori; Fukuyama, Masafumi; Benno, Yoshimi

    2006-01-01

    In order to characterize the genus Bifidobacterium, ribopatterns and approximately 500 bp (Escherichia coli positions 27 to 520) of 16S rRNA gene sequences of 28 type strains and 64 reference strains of the genus Bifidobacterium were determined. Ribopatterns obtained from Bifidobacterium strains were divided into nine clusters (clusters I-IX) with a similarity of 60%. Cluster V, containing 17 species, was further subdivided into 22 subclusters with a similarity of 90%. In the genus Bifidobacterium, four groups were shown according to Miyake et al.: (i) the Bifidobacterium longum infantis-longum-suis type group, (ii) the B. catenulatum-pseudocatenulatum group, (iii) the B. gallinarum-saeculare-pullorum group, and (iv) the B. coryneforme-indicum group, which showed higher than 97% similarity of the 16S rRNA gene sequences in each group. Using ribotyping analysis, unique ribopatterns were obtained from these species, and they could be separated by cluster analysis. Ribopatterns of six B. adolescentis strains were separated into different clusters, and also showed diversity in 16S rRNA gene sequences. B. adolescentis consisted of heterogeneous strains. The nine strains of B. pseudolongum subsp. pseudolongum were divided into five subclusters. Each type strain of B. pseudolongum subsp. pseudolongum and B. pseudolongum subsp. globosum and two intermediate groups, which were suggested by Yaeshima et al., consisted of individual clusters. B. animalis subsp. animalis and B. animalis subsp. lactis could not be separated by ribotyping using Eco RI. We conclude that ribotyping is able to provide another characteristic of Bifidobacterium strains in addition to 16S rRNA gene sequence phylogenetic analysis, and this information suggests that ribotyping analysis is a useful tool for the characterization of Bifidobacterium species in combination with other techniques for taxonomic characterization. PMID:16428867

  13. Structure and Genome Organization of Cherry Virus A (Capillovirus, Betaflexiviridae) from China Using Small RNA Sequencing.

    PubMed

    Wang, Jiawei; Zhai, Ying; Liu, Weizhen; Dhingra, Amit; Pappu, Hanu R; Liu, Qingzhong

    2016-01-01

    Cherry virus A (CVA) (Capillovirus, Betaflexiviridae) is widely present in cherry-growing areas. We obtained the complete genome of a CVA isolate (CVA-TA) using small RNA deep sequencing, followed by overlapping reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE). The newly identified 5'-untranslated region (5'-UTR) from CVA-TA may form additional hairpin and loop structures to stabilize the CVA genome. PMID:27174277

  14. Higher order asymptotics for negative binomial regression inferences from RNA-sequencing data.

    PubMed

    Di, Yanming; Emerson, Sarah C; Schafer, Daniel W; Kimbrel, Jeffrey A; Chang, Jeff H

    2013-03-01

    RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments. PMID:23502340
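
    As background for the likelihood ratio test discussed in this abstract, here is a minimal sketch of the unadjusted large-sample test for a two-group comparison under a negative binomial model. It is not the HOA-adjusted test. It assumes a known dispersion `phi` (variance = mu + phi * mu^2), equal library sizes, and a chi-square reference distribution with one degree of freedom; all function names are hypothetical.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log NB probability of count y with mean mu and dispersion phi."""
    if mu == 0:
        return 0.0 if y == 0 else float("-inf")
    r = 1.0 / phi  # NB "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def nb_loglik(counts, mu, phi):
    """Log-likelihood of independent counts sharing one mean."""
    return sum(nb_logpmf(y, mu, phi) for y in counts)


def lr_test_two_groups(group_a, group_b, phi):
    """Unadjusted likelihood ratio test of equal means in two groups.

    With phi fixed, the maximum-likelihood mean of each group is its sample
    mean. Returns (statistic, p_value), where the p-value uses the chi-square
    distribution with 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2)).
    """
    def mean(xs):
        return sum(xs) / len(xs)

    pooled = list(group_a) + list(group_b)
    ll_null = nb_loglik(pooled, mean(pooled), phi)
    ll_full = (nb_loglik(group_a, mean(group_a), phi)
               + nb_loglik(group_b, mean(group_b), phi))
    stat = max(0.0, 2.0 * (ll_full - ll_null))  # clamp rounding noise
    return stat, math.erfc(math.sqrt(stat / 2.0))


stat_same, p_same = lr_test_two_groups([10, 12, 11], [10, 12, 11], phi=0.1)
stat_diff, p_diff = lr_test_two_groups([10, 12, 11], [100, 120, 110], phi=0.1)
```

    With only three replicates per group, the chi-square approximation used here is exactly the kind of large-sample result whose adequacy the abstract questions.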

  15. The four ingredients of single-sequence RNA secondary structure prediction. A unifying perspective

    PubMed Central

    Rivas, Elena

    2013-01-01

    Any method for RNA secondary structure prediction is determined by four ingredients. The architecture is the choice of features implemented by the model (such as stacked basepairs, loop length distributions, etc.). The architecture determines the number of parameters in the model. The scoring scheme is the nature of those parameters (whether thermodynamic, probabilistic, or weights). The parameterization stands for the specific values assigned to the parameters. These three ingredients are referred to as “the model.” The fourth ingredient is the folding algorithms used to predict plausible secondary structures given the model and the sequence of a structural RNA. Here, I make several unifying observations drawn from looking at more than 40 years of methods for RNA secondary structure prediction in the light of this classification. As a final observation, there seems to be a performance ceiling that affects all methods with complex architectures, a ceiling that impacts all scoring schemes with remarkable similarity. This suggests that modeling RNA secondary structure by using intrinsic sequence-based plausible “foldability” will require the incorporation of other forms of information in order to constrain the folding space and to improve prediction accuracy. This could give an advantage to probabilistic scoring systems since a probabilistic framework is a natural platform to incorporate different sources of information into one single inference problem. PMID:23695796
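
    The four ingredients can be made concrete with a deliberately minimal example. The sketch below is a Nussinov-style base-pair maximization, chosen here only for illustration and not taken from the article: the architecture is nested base pairs with a minimum hairpin loop, the scoring scheme is a plain weight, the parameterization is +1 per allowed pair, and the folding algorithm is dynamic programming with a traceback. The function names and the `min_loop` default are assumptions.

```python
def can_pair(a, b):
    """Architecture: only Watson-Crick and G-U wobble pairs are allowed."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})

def nussinov(seq, min_loop=3):
    """Return (max number of nested pairs, dot-bracket structure).
    Parameterization: every allowed pair scores +1."""
    n = len(seq)
    if n == 0:
        return 0, ""
    # best[i][j] = max pairs in seq[i..j]; short spans stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    # Traceback: recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if can_pair(seq[k], seq[j]):
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return best[0][n - 1], "".join(structure)

pairs, dot_bracket = nussinov("GGGAAAUCCC")
print(pairs, dot_bracket)
```

    Swapping the +1 weights for thermodynamic or probabilistic parameters changes the scoring scheme and parameterization while the folding algorithm keeps the same overall shape, which is the separation the abstract describes.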

  16. Exploration of sequence space as the basis of viral RNA genome segmentation

    PubMed Central

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-01-01

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the relative fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Thus, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration. PMID:24757055

  17. RNA-Mediated Gene Duplication and Retroposons: Retrogenes, LINEs, SINEs, and Sequence Specificity

    PubMed Central

    2013-01-01

    A substantial number of “retrogenes” that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3′-end sequences of various SINEs originated from a corresponding LINE. As the 3′-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template. However, the 3′-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3′-poly(A) repeats. Since the 3′-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution. PMID:23984183

  18. Sequence-Based Analysis Uncovers an Abundance of Non-Coding RNA in the Total Transcriptome of Mycobacterium tuberculosis

    PubMed Central

    Arnvig, Kristine B.; Comas, Iñaki; Thomson, Nicholas R.; Houghton, Joanna; Boshoff, Helena I.; Croucher, Nicholas J.; Rose, Graham; Perkins, Timothy T.; Parkhill, Julian; Dougan, Gordon; Young, Douglas B.

    2011-01-01

    RNA sequencing provides a new perspective on the genome of Mycobacterium tuberculosis by revealing an extensive presence of non-coding RNA, including long 5’ and 3’ untranslated regions, antisense transcripts, and intergenic small RNA (sRNA) molecules. More than a quarter of all sequence reads mapping outside of ribosomal RNA genes represent non-coding RNA, and the density of reads mapping to intergenic regions was more than two-fold higher than that mapping to annotated coding sequences. Selected sRNAs were found at increased abundance in stationary phase cultures and accumulated to remarkably high levels in the lungs of chronically infected mice, indicating a potential contribution to pathogenesis. The ability of tubercle bacilli to adapt to changing environments within the host is critical to their ability to cause disease and to persist during drug treatment; it is likely that novel post-transcriptional regulatory networks will play an important role in these adaptive responses. PMID:22072964

  19. Rtools: a web server for various secondary structural analyses on single RNA sequences.

    PubMed

    Hamada, Michiaki; Ono, Yukiteru; Kiryu, Hisanori; Sato, Kengo; Kato, Yuki; Fukunaga, Tsukasa; Mori, Ryota; Asai, Kiyoshi

    2016-07-01

    The secondary structures, as well as the nucleotide sequences, are important features of RNA molecules for characterizing their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools that compensate for the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, which integrates software tools, CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD. PMID:27131356

  20. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  1. Primer and platform effects on 16S rRNA tag sequencing

    PubMed Central

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-01-01

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. Beta diversity metrics are surprisingly robust to both primer and sequencing platform biases. PMID:26300854
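
    To make "beta diversity metric" concrete, the sketch below computes the Bray-Curtis dissimilarity between two samples after converting counts to relative abundances. The abstract does not say which metrics were used, so Bray-Curtis is an assumption chosen for illustration, and the taxon names, counts and function names are made up.

```python
def relative_abundance(counts):
    """Convert a {taxon: read count} dict to proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: c / total for taxon, c in counts.items()}

def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: 0 = identical profiles, 1 = no shared taxa."""
    a = relative_abundance(sample_a)
    b = relative_abundance(sample_b)
    taxa = set(a) | set(b)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    diff = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    return diff / total

# Hypothetical profiles of one community from two protocols.
protocol_1 = {"Bacteroides": 400, "Pseudomonas": 350, "Bacillus": 250}
protocol_2 = {"Bacteroides": 4500, "Pseudomonas": 3000, "Bacillus": 2500}
print(round(bray_curtis(protocol_1, protocol_2), 3))
```

    Normalizing to proportions first keeps the very different read depths of the two platforms from dominating the comparison.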

  2. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  3. miRMOD: a tool for identification and analysis of 5′ and 3′ miRNA modifications in Next Generation Sequencing small RNA data

    PubMed Central

    Mukherjee, Sunil K.

    2015-01-01

    In the past decade, the microRNAs (miRNAs) have emerged to be important regulators of gene expression across various species. Several studies have confirmed different types of post-transcriptional modifications at terminal ends of miRNAs. The reports indicate that miRNA modifications are conserved and functionally significant as it may affect miRNA stability and ability to bind mRNA targets, hence affecting target gene repression. Next Generation Sequencing (NGS) of the small RNA (sRNA) provides an efficient and reliable method to explore miRNA modifications. The need for dedicated software, especially for users with little knowledge of computers, to determine and analyze miRNA modifications in sRNA NGS data, motivated us to develop miRMOD. miRMOD is a user-friendly, Microsoft Windows and Graphical User Interface (GUI) based tool for identification and analysis of 5′ and 3′ miRNA modifications (non-templated nucleotide additions and trimming) in sRNA NGS data. In addition to identification of miRNA modifications, the tool also predicts and compares the targets of query and modified miRNAs. In order to compare binding affinities for the same target, miRMOD utilizes minimum free energies of the miRNA:target and modified-miRNA:target interactions. Comparisons of the binding energies may guide experimental exploration of miRNA post-transcriptional modifications. The tool is available as a stand-alone package to overcome large data transfer problems commonly faced in web-based high-throughput (HT) sequencing data analysis tools. miRMOD package is freely available at http://bioinfo.icgeb.res.in/miRMOD. PMID:26623179
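
    The kind of 3′ modification the abstract describes (trimming and non-templated nucleotide additions) can be illustrated with a small comparison of a read against its reference. This is a minimal sketch under stated assumptions, not miRMOD's actual algorithm: the function name, the example sequences and the downstream flank are hypothetical, the read is assumed to share the reference 5′ start, and the first mismatch is treated as the start of a non-templated tail.

```python
def classify_3prime(read, mature, downstream_flank):
    """Compare a read with a reference miRNA that shares its 5' start.

    mature           : annotated mature miRNA sequence
    downstream_flank : precursor sequence immediately 3' of the mature miRNA
    Returns counts of trimmed and templated-extension nucleotides and the
    non-templated tail (everything after the first mismatch to the template).
    """
    template = mature + downstream_flank
    matched = 0
    for r, t in zip(read, template):
        if r != t:
            break
        matched += 1
    return {
        "trimmed": max(0, len(mature) - matched),
        "templated_extension": max(0, matched - len(mature)),
        "non_templated_tail": read[matched:],
    }

# Illustrative sequences only; the flank is invented for this example.
mature = "UGAGGUAGUAGGUUGUAUAGUU"
flank = "UUAGGG"
read = mature[:-2] + "AA"  # two 3' bases trimmed, then "AA" added
print(classify_3prime(read, mature, flank))
```

    A real tool would also have to handle 5′ variation, internal mismatches from sequencing errors, and added nucleotides that happen to match the template, none of which this sketch attempts.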

  4. Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

    PubMed

    Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

    2016-01-01

    Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. Taken together, this article highlights

  5. Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray.

    PubMed

    Murakami, Yoshiki; Tanahashi, Toshihito; Okada, Rina; Toyoda, Hidenori; Kumada, Takashi; Enomoto, Masaru; Tamori, Akihiro; Kawada, Norifumi; Taguchi, Y-h; Azuma, Takeshi

    2014-01-01

    MicroRNA (miRNA) expression profiling has proven useful in diagnosing and understanding the development and progression of several diseases. Microarray is the standard method for analyzing miRNA expression profiles; however, it has several disadvantages, including its limited detection of miRNAs. In recent years, advances in genome sequencing have led to the development of next-generation sequencing (NGS) technologies, which significantly advance genome sequencing speed and discovery. In this study, we compared the expression profiles obtained by next generation sequencing (NGS) with the profiles created using microarray to assess if NGS could produce a more accurate and complete miRNA profile. Total RNA from 14 hepatocellular carcinoma tumors (HCC) and 6 matched non-tumor control tissues were sequenced with Illumina MiSeq 50-bp single-end reads. Micro RNA expression profiles were estimated using miRDeep2 software. As a comparison, miRNA expression profiles for 11 out of 14 HCCs were also established by microarray (Agilent human microRNA microarray). The average total sequencing exceeded 2.2 million reads per sample and of those reads, approximately 57% mapped to the human genome. The average correlation for miRNA expression between microarray and NGS and subtraction were 0.613 and 0.587, respectively, while miRNA expression between technical replicates was 0.976. The diagnostic accuracy of HCC, p-value, and AUC were 90.0%, 7.22×10⁻⁴, and 0.92, respectively. In summary, NGS created an miRNA expression profile that was reproducible and comparable to that produced by microarray. Moreover, NGS discovered novel miRNAs that were otherwise undetectable by microarray. We believe that miRNA expression profiling by NGS can be a useful diagnostic tool applicable to multiple fields of medicine. PMID:25215888

  6. Molecular cloning of five individual stage- and tissue-specific mRNA sequences from sea urchin pluteus embryos.

    PubMed Central

    Fregien, N; Dolecki, G J; Mandel, M; Humphreys, T

    1983-01-01

    Five developmentally regulated sea urchin mRNA sequences which increase in abundance between the blastula and pluteus stages of development were isolated by molecular cloning of cDNA. The regulated sequences all appeared in moderately abundant mRNA molecules of pluteus cells and represented 4% of the clones tested. There were no regulated sequences detected in the 40% of the clones which hybridized to the most abundant mRNA, and the screening procedures were inadequate to detect possible regulation in the 20 to 30% of the clones presumably derived from rare-class mRNA. The reaction of 32P[cDNA] from blastula and pluteus mRNA to dots of the cloned DNAs on nitrocellulose filters indicated that the mRNAs complementary to the different cloned pluteus-specific sequences were between 3- and 47-fold more prevalent at the pluteus stage than at the blastula stage. Polyadenylated RNA from different developmental stages was transferred from electrophoretic gels to nitrocellulose filters and reacted to the different cloned sequences. The regulated mRNAs were undetectable in the RNA of 3-h embryos, became evident at the hatching blastula stage, and reached a maximum in abundance by the gastrula or pluteus stage. Certain of the clones reacted to two sizes of mRNA which did not vary coordinately with development. Transfers of RNA isolated from each of the three cell layers of pluteus embryos that were reacted to the cloned sequences revealed that two of the sequences were found in the mRNA of all three layers, two were ectoderm specific, and one was endoderm specific. Four of the regulated sequences were complementary to one or two major bands and one to at least 50 bands on Southern transfers of restriction endonuclease-digested total sea urchin DNA. PMID:6688291

  7. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    DOEpatents

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  8. Close association between paralogous multiple isomiRs and paralogous/orthologues miRNA sequences implicates dominant sequence selection across various animal species.

    PubMed

    Guo, Li; Zhao, Yang; Zhang, Hui; Yang, Sheng; Chen, Feng

    2013-09-25

    MicroRNAs (miRNAs) are crucial negative regulators of gene expression at the post-transcriptional level. Next-generation sequencing technologies have identified a series of miRNA variants (named isomiRs). In this study, paralogous isomiR assemblies (from the miRNA locus) were systematically analyzed based on data acquired from deep sequencing data sets. Evolutionary analysis of paralogous (members in miRNA gene family in a specific species) and orthologous (across different animal species) miRNAs was also performed. The sequence diversity of paralogous isomiRs was found to be similar to the diversity of paralogous and orthologous miRNAs. Additionally, both isomiRs and paralogous/orthologous miRNAs were implicated in 5' and 3' ends (especially 3' ends), nucleotide substitutions, and insertions and deletions. Generally, multiple isomiRs can be produced from a single miRNA locus, but most of them had lower enrichment levels, and only several dominant isomiR sequences were detected. These dominant isomiR groups were always stable, and one of them would be selected as the most abundant miRNA sequence in specific animal species. Some isomiRs might be consistent with miRNA sequences in some species but not in others. Homologous miRNAs were often detected in similar isomiR repertoires, and showed similar expression patterns, while dominant isomiRs showed complex evolutionary patterns from miRNA sequences across the animal kingdom. These results indicate that the phenomenon of multiple isomiRs is not a random event, but rather the result of evolutionary pressures. The existence of multiple isomiRs enables different species to express advantageous sequences in different environments. Thus, dominant sequences emerge in response to functional and evolutionary pressures, allowing an organism to adapt to complex intra- and extra-cellular events. PMID:23856130

  9. Gene Profiling of Bone around Orthodontic Mini-Implants by RNA-Sequencing Analysis

    PubMed Central

    Nahm, Kyung-Yen; Heo, Jung Sun; Lee, Jae-Hyung; Lee, Dong-Yeol; Chung, Kyu-Rhim; Ahn, Hyo-Won; Kim, Seong-Hun

    2015-01-01

    This study aimed to evaluate the genes that were expressed in the healing bones around SLA-treated titanium orthodontic mini-implants in a beagle at early (1-week) and late (4-week) stages with RNA-sequencing (RNA-Seq). Samples from sites of surgical defects were used as controls. Total RNA was extracted from the tissue around the implants, and an RNA-Seq analysis was performed with Illumina TruSeq. In the 1-week group, genes in the gene ontology (GO) categories of cell growth and the extracellular matrix (ECM) were upregulated, while genes in the categories of the oxidation-reduction process, intermediate filaments, and structural molecule activity were downregulated. In the 4-week group, the genes upregulated included ECM binding, stem cell fate specification, and intramembranous ossification, while genes in the oxidation-reduction process category were downregulated. GO analysis revealed an upregulation of genes that were related to significant mechanisms, including those with roles in cell proliferation, the ECM, growth factors, and osteogenic-related pathways, which are associated with bone formation. From these results, implant-induced bone formation progressed considerably during the times examined in this study. The upregulation or downregulation of selected genes was confirmed with real-time reverse transcription polymerase chain reaction. The RNA-Seq strategy was useful for defining the biological responses to orthodontic mini-implants and identifying the specific genetic networks for targeted evaluations of successful peri-implant bone remodeling. PMID:25759820

  10. High quality RNA extraction from Maqui berry for its application in next-generation sequencing.

    PubMed

    Sánchez, Carolina; Villacreses, Javier; Blanc, Noelle; Espinoza, Loreto; Martinez, Camila; Pastor, Gabriela; Manque, Patricio; Undurraga, Soledad F; Polanco, Victor

    2016-01-01

    Maqui berry (Aristotelia chilensis) is a native Chilean species that produces berries that are exceptionally rich in anthocyanins and natural antioxidants. These natural compounds provide an array of health benefits for humans, making them very desirable in a fruit. At the same time, these substances also interfere with nucleic acid preparations, making RNA extraction from Maqui berry a major challenge. Our group established a method for extracting high-quality RNA (good purity, good integrity and higher yield) from Maqui berry. This procedure is based on the adapted CTAB method using high concentrations of PVP (4 %) and β-mercaptoethanol (4 %) and spermidine in the extraction buffer. These reagents help to remove contaminants such as polysaccharides, proteins and phenols, and also prevent the oxidation of phenolic compounds. The high quality of the RNA isolated through this method allowed its successful use in molecular applications for this endemic Chilean fruit, such as differential expression analysis of RNA-Seq data using next generation sequencing (NGS). Furthermore, we consider that our method could potentially be used for other plant species with extremely high levels of antioxidants and anthocyanins. PMID:27536526

  11. Genome-guided transcript assembly from integrative analysis of RNA sequence data

    PubMed Central

    Boley, Nathan; Stoiber, Marcus H.; Booth, Benjamin W.; Wan, Kenneth H.; Hoskins, Roger A.; Bickel, Peter J.; Celniker, Susan E.; Brown, James B.

    2014-01-01

    The identification of full length transcripts entirely from short-read RNA sequencing data (RNA-seq) remains a challenge in genome annotation pipelines. Here we describe an automated pipeline for genome annotation that integrates RNA-seq and gene-boundary data sets, which we call generalized RNA integration tool, or GRIT. By applying GRIT to Drosophila melanogaster short-read RNA-seq, cap analysis of gene expression (CAGE) and poly(A)-site-seq data collected for the modENCODE project, we recover the vast majority of previously annotated transcripts and double the total number of transcripts cataloged. We find that 20% of protein coding genes encode multiple protein-localization signals, and that, in 20 day old adult fly heads, genes with multiple poly-adenylation sites are more common than genes with alternate splicing or alternate promoters. When compared to the most widely used transcript assembly tools, GRIT recovers a larger fraction of annotated transcripts at higher precision. GRIT will enable the automated generation of high-quality genome annotations without necessitating extensive manual annotation. PMID:24633242

  12. Analyses of Long Non-Coding RNA and mRNA profiling using RNA sequencing during the pre-implantation phases in pig endometrium.

    PubMed

    Wang, Yueying; Xue, Songyi; Liu, Xiaoran; Liu, Huan; Hu, Tao; Qiu, Xiaotian; Zhang, Jinlong; Lei, Minggang

    2016-01-01

    Establishment of implantation in pig is accompanied by a coordinated interaction between the maternal uterine endometrium and conceptus development. We investigated the expression profiles of endometrial tissue on Days 9, 12 and 15 of pregnancy and on Day 12 of non-pregnancy in Yorkshire, and performed a comprehensive analysis of long non-coding RNAs (lncRNAs) in endometrial tissue samples by using RNA sequencing. As a result, 2805 novel lncRNAs, 2,376 (301 lncRNA and 2075 mRNA) differentially expressed genes (DEGs) and 2149 novel transcripts were obtained by pairwise comparison. In agreement with previous reports, lncRNAs shared similar characteristics, such as shorter in length, lower in exon number, lower at expression level and less conserved than protein coding transcripts. Bioinformatics analysis showed that DEGs were involved in protein binding, cellular process, immune system process and enriched in focal adhesion, Jak-STAT, FoxO and MAPK signaling pathway. We also found that lncRNAs TCONS_01729386 and TCONS_01325501 may play a vital role in embryo pre-implantation. Furthermore, the expression of FGF7, NMB, COL5A3, S100A8 and PPP1R3D genes were significantly up-regulated at the time of maternal recognition of pregnancy (Day 12 of pregnancy). Our results first identified the characterization and expression profile of lncRNAs in pig endometrium during pre-implantation phases. PMID:26822553

  13. Analyses of Long Non-Coding RNA and mRNA profiling using RNA sequencing during the pre-implantation phases in pig endometrium

    PubMed Central

    Wang, Yueying; Xue, Songyi; Liu, Xiaoran; Liu, Huan; Hu, Tao; Qiu, Xiaotian; Zhang, Jinlong; Lei, Minggang

    2016-01-01

    Establishment of implantation in pig is accompanied by a coordinated interaction between the maternal uterine endometrium and conceptus development. We investigated the expression profiles of endometrial tissue on Days 9, 12 and 15 of pregnancy and on Day 12 of non-pregnancy in Yorkshire, and performed a comprehensive analysis of long non-coding RNAs (lncRNAs) in endometrial tissue samples by using RNA sequencing. As a result, 2805 novel lncRNAs, 2,376 (301 lncRNA and 2075 mRNA) differentially expressed genes (DEGs) and 2149 novel transcripts were obtained by pairwise comparison. In agreement with previous reports, lncRNAs shared similar characteristics, such as shorter in length, lower in exon number, lower at expression level and less conserved than protein coding transcripts. Bioinformatics analysis showed that DEGs were involved in protein binding, cellular process, immune system process and enriched in focal adhesion, Jak-STAT, FoxO and MAPK signaling pathway. We also found that lncRNAs TCONS_01729386 and TCONS_01325501 may play a vital role in embryo pre-implantation. Furthermore, the expression of FGF7, NMB, COL5A3, S100A8 and PPP1R3D genes were significantly up-regulated at the time of maternal recognition of pregnancy (Day 12 of pregnancy). Our results first identified the characterization and expression profile of lncRNAs in pig endometrium during pre-implantation phases. PMID:26822553

  14. Molecular phylogeny of labyrinthulids and thraustochytrids based on the sequencing of 18S ribosomal RNA gene.

    PubMed

    Honda, D; Yokochi, T; Nakahara, T; Raghukumar, S; Nakagiri, A; Schaumann, K; Higashihara, T

    1999-01-01

    Labyrinthulids and thraustochytrids are unicellular heterotrophs, formerly considered as fungi, but presently are recognized as members in the stramenopiles of the kingdom Protista sensu lato. We determined the 18S ribosomal RNA gene sequences of 14 strains from different species of the six genera and analyzed the molecular phylogenetic relationships. The results conflict with the current classification based on morphology, at the genus and species levels. These organisms are separated, based on signature sequences and unique inserted sequences, into two major groups, which were named the labyrinthulid phylogenetic group and the thraustochytrid phylogenetic group. Although these groupings are in disagreement with many conventional taxonomic characters, they correlated better with the sugar composition of the cell wall. Thus, the currently used taxonomic criteria need serious reconsideration. PMID:10568038

  15. Preliminary study on mitochondrial 16S rRNA gene sequences and phylogeny of flatfishes (Pleuronectiformes)

    NASA Astrophysics Data System (ADS)

    You, Feng; Liu, Jing; Zhang, Peijun; Xiang, Jianhai

    2005-09-01

    A 605 bp section of the mitochondrial 16S rRNA gene from Paralichthys olivaceus, Pseudorhombus cinnamomeus, Psetta maxima and Kareius bicoloratus, which represent 3 families of the order Pleuronectiformes, was amplified by PCR and sequenced to examine the molecular systematics of Pleuronectiformes, and was compared with the related gene sequences of 6 other flatfish downloaded from GenBank. Phylogenetic analysis based on genetic distances among the related gene sequences of the 10 flatfish showed that this method was ideal for exploring the relationships between species, genera and families. Phylogenetic trees constructed with the neighbor-joining, maximum parsimony and maximum likelihood methods accorded with the general pattern of Pleuronectiformes evolution, but they also produced some confusion. In contrast to data from morphological characters, P. olivaceus clustered with K. bicoloratus, whereas P. cinnamomeus did not cluster with P. olivaceus, which is worth further study.

  16. A variant of Plasmodium ovale; analysis of its 18S ribosomal RNA gene sequence.

    PubMed

    Miyake, H; Suwa, S; Kimura, M; Wataya, Y

    1997-01-01

    We report here a new variant of a human malaria parasite, found by comparing diagnostic results obtained with a new DNA diagnostic method named microtiter plate-hybridization (MPH) and with the traditional microscopic method. A total of five cases of malaria were diagnosed as microscopy-positive but MPH-negative; one case was found during epidemiological research in Vietnam and four cases were imported malaria in Japan. Although the parasites were morphologically quite similar to typical P. ovale under microscopy, sequence analysis of a PCR-amplified DNA fragment revealed that their 18S ribosomal RNA gene sequence differed from the published sequence of P. ovale. The combination of MPH and microscopic examination provides a new method for detecting a new type of malaria parasite that is difficult to distinguish morphologically. PMID:9586115

  17. Using deep RNA sequencing for the structural annotation of the laccaria bicolor mycorrhizal transcriptome.

    SciTech Connect

    Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

    2010-07-06

    Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. 
The method described here can be applied to improve gene structural annotation in other species, provided that there

  18. Selection and Characterization of Pre-mRNA Splicing Enhancers: Identification of Novel SR Protein-Specific Enhancer Sequences

    PubMed Central

    Schaal, Thomas D.; Maniatis, Tom

    1999-01-01

    Splicing enhancers are RNA sequences required for accurate splice site recognition and the control of alternative splicing. In this study, we used an in vitro selection procedure to identify and characterize novel RNA sequences capable of functioning as pre-mRNA splicing enhancers. Randomized 18-nucleotide RNA sequences were inserted downstream from a Drosophila doublesex pre-mRNA enhancer-dependent splicing substrate. Functional splicing enhancers were then selected by multiple rounds of in vitro splicing in nuclear extracts, reverse transcription, and selective PCR amplification of the spliced products. Characterization of the selected splicing enhancers revealed a highly heterogeneous population of sequences, but we identified six classes of recurring degenerate sequence motifs five to seven nucleotides in length including novel splicing enhancer sequence motifs. Analysis of selected splicing enhancer elements and other enhancers in S100 complementation assays led to the identification of individual enhancers capable of being activated by specific serine/arginine (SR)-rich splicing factors (SC35, 9G8, and SF2/ASF). In addition, a potent splicing enhancer sequence isolated in the selection specifically binds a 20-kDa SR protein. This enhancer sequence has a high level of sequence homology with a recently identified RNA-protein adduct that can be immunoprecipitated with an SRp20-specific antibody. We conclude that distinct classes of selected enhancers are activated by specific SR proteins, but there is considerable sequence degeneracy within each class. The results presented here, in conjunction with previous studies, reveal a remarkably broad spectrum of RNA sequences capable of binding specific SR proteins and/or functioning as SR-specific splicing enhancers. PMID:10022858

  19. MicroRNA-Sequence Profiling Reveals Novel Osmoregulatory MicroRNA Expression Patterns in Catadromous Eel Anguilla marmorata

    PubMed Central

    Li, Peng; Yin, Shaowu; Wang, Li; Jia, Yihe; Shu, Xinhua

    2015-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs that regulate gene expression by post-transcriptional repression of mRNAs. Recently, several miRNAs have been confirmed to directly or indirectly execute osmoregulatory functions in fish via translational control. In order to clarify whether miRNAs play relevant roles in the osmoregulation of Anguilla marmorata, three sRNA libraries of A. marmorata during adjustment to three different salinities were sequenced by Illumina sRNA deep sequencing methods. In total, 11,339,168, 11,958,406 and 12,568,964 clear reads were obtained from the 3 libraries, respectively. Meanwhile, 34 conserved miRNAs and 613 novel miRNAs were identified using the sequence data. MiR-10b-5p, miR-181a, miR-26a-5p, miR-30d and miR-99a-5p were dominantly expressed in eels at three salinities. In total, 29 mature miRNAs were significantly up-regulated, while 72 mature miRNAs were significantly down-regulated in brackish water (10‰ salinity) compared with fresh water (0‰ salinity); 24 mature miRNAs were significantly up-regulated, while 54 mature miRNAs were significantly down-regulated in sea water (25‰ salinity) compared with fresh water. Similarly, 24 mature miRNAs were significantly up-regulated, while 45 mature miRNAs were significantly down-regulated in sea water compared with brackish water. The expression patterns of 12 dominantly expressed miRNAs were analyzed at different time points when the eels were transferred from fresh water to brackish water or to sea water. These miRNAs showed differential expression patterns in eels at distinct salinities. Interestingly, miR-122, miR-140-3p and miR-10b-5p demonstrated osmoregulatory effects in certain salinities. In addition, the identification and characterization of differentially expressed miRNAs at different salinities can clarify the osmoregulatory roles of miRNAs, which will inform future studies on osmoregulation in fish. PMID:26301415

  20. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus.

    PubMed

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-05-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic "virion factory," we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various times post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5' and 3' extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus "virophage." These results, validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent), will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system. PMID:20360389

  1. A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies

    PubMed Central

    Kim, Tae-Min; Park, Peter J

    2014-01-01

    Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA mismatch repair system. These somatic mutations can result in inactivation of tumor suppressor genes or disrupt other non-coding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate a genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodological aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events to chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets. PMID:25371413

  2. SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data.

    PubMed

    Ouyang, Zhengqing; Snyder, Michael P; Chang, Howard Y

    2013-02-01

    We present an integrative approach, SeqFold, that combines high-throughput RNA structure profiling data with computational prediction for genome-scale reconstruction of RNA secondary structures. SeqFold transforms experimental RNA structure information into a structure preference profile (SPP) and uses it to select stable RNA structure candidates representing the structure ensemble. Under a high-dimensional classification framework, SeqFold efficiently matches a given SPP to the most likely cluster of structures sampled from the Boltzmann-weighted ensemble. SeqFold is able to incorporate diverse types of RNA structure profiling data, including parallel analysis of RNA structure (PARS), selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), fragmentation sequencing (FragSeq) data generated by deep sequencing, and conventional SHAPE data. Using the known structures of a wide range of mRNAs and noncoding RNAs as benchmarks, we demonstrate that SeqFold outperforms or matches existing approaches in accuracy and is more robust to noise in experimental data. Application of SeqFold to reconstruct the secondary structures of the yeast transcriptome reveals the diverse impact of RNA secondary structure on gene regulation, including translation efficiency, transcription initiation, and protein-RNA interactions. SeqFold can be easily adapted to incorporate any new types of high-throughput RNA structure profiling data and is widely applicable to analyze RNA structures in any transcriptome. PMID:23064747

  3. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain.

    PubMed

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein-nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5' TOPs (5' terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations. PMID:24824036

  4. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain

    PubMed Central

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein–nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5′ TOPs (5′ terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations. PMID:24824036

  5. Details of gastropod phylogeny inferred from 18S rRNA sequences.

    PubMed

    Winnepenninckx, B; Steiner, G; Backeljau, T; De Wachter, R

    1998-02-01

    Some generally accepted viewpoints on the phylogenetic relationships within the molluscan class Gastropoda are reassessed by comparing complete 18S rRNA sequences. Phylogenetic analyses were performed using the neighbor-joining and maximum parsimony methods. The previously suggested basal position of Archaeogastropoda, including Neritimorpha and Vetigastropoda, in the gastropod clade is confirmed. The present study also provides new molecular evidence for the monophyly of both Caenogastropoda and Euthyneura (Pulmonata and Opisthobranchia), making Prosobranchia paraphyletic. The relationships within Caenogastropoda and Euthyneura data turn out to be very unstable on the basis of the present 18S rRNA sequences. The present 18S rRNA data question, but are insufficient to decide on, muricacean (Neogastropoda), neotaenioglossan, pulmonate, or stylommatophoran monophyly. The analyses also focus on two systellommatophoran families, namely, Veronicellidae and Onchidiidae. It is suggested that Systellommatophora are not a monophyletic unit but, due to the lack of stability in the euthyneuran clade, their affinity to either Opisthobranchia or Pulmonata could not be determined. PMID:9479694

  6. RNA sequencing reveals retinal transcriptome changes in STZ-induced diabetic rats

    PubMed Central

    LIU, YUAN-JIE; LIAN, ZHI-YUN; LIU, GENG; ZHOU, HONG-YING; YANG, HUI-JUN

    2016-01-01

    The present study aimed to investigate changes in retinal gene expression in streptozotocin (STZ)-induced diabetic rats using next-generation sequencing, utilize transcriptome signatures to investigate the molecular mechanisms of diabetic retinopathy (DR), and identify novel strategies for the treatment of DR. Diabetes was chemically induced in 10-week-old male Sprague-Dawley rats using STZ. Flash-electroretinography (F-ERG) was performed to evaluate the visual function of the rats. The retinas of the rats were removed to perform high throughput RNA sequence (RNA-seq) analysis. The a-wave, b-wave, oscillatory potential 1 (OP1), OP2 and ∑OP amplitudes were significantly reduced in the diabetic group, compared with those of the control group (P<0.05). Furthermore, the implicit b-wave duration 16 weeks post-STZ induction were significantly longer in the diabetic rats, compared with the control rats (P<0.001). A total of 868 genes were identified, of which 565 were upregulated and 303 were downregulated. Among the differentially expressed genes (DEGs), 94 apoptotic genes and apoptosis regulatory genes, and 19 inflammatory genes were detected. The results of the KEGG pathway significant enrichment analysis revealed enrichment in cell adhesion molecules, complement and coagulation cascades, and antigen processing and presentation. Diabetes alters several transcripts in the retina, and RNA-seq provides novel insights into the molecular mechanisms underlying DR. PMID:26781437

  7. Comparative Dynamics and Sequence Dependence of DNA and RNA Binding to Single Walled Carbon Nanotubes

    PubMed Central

    Landry, Markita P.; Vuković, Lela; Kruss, Sebastian; Bisker, Gili; Landry, Alexandra M.; Islam, Shahrin; Jain, Rishabh; Schulten, Klaus; Strano, Michael S.

    2015-01-01

    Noncovalent polymer-single walled carbon nanotube (SWCNT) conjugates have gained recent interest due to their prevalent use as electrochemical and optical sensors, SWCNT-based therapeutics, and for SWCNT separation. However, little is known about the effects of polymer-SWCNT molecular interactions on functional properties of these conjugates. In this work, we show that SWCNT complexed with related polynucleotide polymers (DNA, RNA) have dramatically different fluorescence stability. Surprisingly, we find a difference of nearly 2500-fold in fluorescence emission between the most fluorescently stable DNA-SWCNT complex, C30 DNA-SWCNT, compared to the least fluorescently stable complex, (AT)7A-(GU)7G DNA-RNA hybrid-SWCNT. We further reveal the existence of three regimes in which SWCNT fluorescence varies nonmonotonically with SWCNT concentration. We utilize molecular dynamics simulations to elucidate the conformation and atomic details of SWCNT-corona phase interactions. Our results show that variations in polynucleotide sequence or sugar backbone can lead to large changes in the conformational stability of the polymer SWCNT corona and the SWCNT optical response. Finally, we demonstrate the effect of the coronae on the response of a recently developed dopamine nanosensor, based on (GT)15 DNA- and (GU)15 RNA-SWCNT complexes. Our results clarify several features of the sequence dependence of corona phases produced by polynucleotides adsorbed to single walled carbon nanotubes, and the implications for molecular recognition in such phases. PMID:26005509

  8. SNPlice: variants that modulate Intron retention from RNA-sequencing data

    PubMed Central

    Movassagh, Mercedeh; Kowsari, Kamran; Seyfi, Ali; Kokkinaki, Maria; Edwards, Nathan J.; Golestaneh, Nady; Horvath, Anelia

    2015-01-01

    Rationale: The growing recognition of the importance of splicing, together with rapidly accumulating RNA-sequencing data, demand robust high-throughput approaches, which efficiently analyze experimentally derived whole-transcriptome splice profiles. Results: We have developed a computational approach, called SNPlice, for identifying cis-acting, splice-modulating variants from RNA-seq datasets. SNPlice mines RNA-seq datasets to find reads that span single-nucleotide variant (SNV) loci and nearby splice junctions, assessing the co-occurrence of variants and molecules that remain unspliced at nearby exon–intron boundaries. Hence, SNPlice highlights variants preferentially occurring on intron-containing molecules, possibly resulting from altered splicing. To illustrate co-occurrence of variant nucleotide and exon–intron boundary, allele-specific sequencing was used. SNPlice results are generally consistent with splice-prediction tools, but also indicate splice-modulating elements missed by other algorithms. SNPlice can be applied to identify variants that correlate with unexpected splicing events, and to measure the splice-modulating potential of canonical splice-site SNVs. Availability and implementation: SNPlice is freely available for download from https://code.google.com/p/snplice/ as a self-contained binary package for 64-bit Linux computers and as python source-code. Contact: pmudvari@gwu.edu or horvatha@gwu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25481010
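    The co-occurrence idea in the abstract above can be illustrated with a small tally. This is a minimal sketch, not the SNPlice implementation: it assumes each read has already been reduced to a pair (allele observed at the SNV locus, whether the read remains unspliced at the nearby exon-intron boundary), and the names `cooccurrence_table` and `unspliced_fraction` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def cooccurrence_table(
    reads: Iterable[Tuple[str, bool]], ref: str, alt: str
) -> Dict[str, int]:
    """Count reads by (allele, unspliced) for one SNV and one nearby junction.

    Each read is (allele at the SNV locus, True if the read retains the intron).
    Reads carrying neither the reference nor the variant allele are ignored.
    """
    counts: Counter = Counter()
    for allele, unspliced in reads:
        if allele in (ref, alt):
            counts[(allele, bool(unspliced))] += 1
    return {
        "alt_unspliced": counts[(alt, True)],
        "alt_spliced": counts[(alt, False)],
        "ref_unspliced": counts[(ref, True)],
        "ref_spliced": counts[(ref, False)],
    }


def unspliced_fraction(table: Dict[str, int], allele: str) -> float:
    """Fraction of reads with the given allele ('alt' or 'ref') that are unspliced."""
    unspliced = table[f"{allele}_unspliced"]
    total = unspliced + table[f"{allele}_spliced"]
    if total == 0:
        raise ValueError(f"no reads carry the {allele} allele")
    return unspliced / total


# Toy data (invented for illustration): variant allele A, reference allele G.
reads = [("A", True)] * 3 + [("A", False)] + [("G", True)] + [("G", False)] * 3
table = cooccurrence_table(reads, ref="G", alt="A")
alt_frac = unspliced_fraction(table, "alt")
ref_frac = unspliced_fraction(table, "ref")
```

    A variant whose reads are unspliced much more often than reference reads would be a candidate splice-modulating variant under this simplified view; a real analysis would also need a significance test and read-level quality filters.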

  9. Sequences implicated in the processing of Thermus thermophilus HB8 23S rRNA.

    PubMed Central

    Hartmann, R K; Ulbrich, N; Erdmann, V A

    1987-01-01

    Nuclease S1 mapping analyses were performed in order to detect processing intermediates of pre-23S rRNA from Thermus thermophilus HB8. Two processing sites were identified downstream of the start of transcription, and several consecutive cleavage sites are associated with the mature 5'-end. In the 3'-flanking region, one "primary" site and two cleavages which generate short-lived intermediates were detected. A series of successive intermediates in the region of the mature 3'-end implies the existence, in analogy to Escherichia coli, of a 3'-exonucleolytic activity. The data were correlated with potential secondary structures within the pre-23S rRNA, which exhibit various repeated sequence elements. M13 sequencing data support the existence of one secondary structural element associated with the strong "primary" cleavage site in the 3'-flanking region. In T. thermophilus we can exclude the formation of an extended base-paired and precursor-specific stem enclosing the 23S rRNA which is inferred to mediate recognition by RNase III in E. coli. PMID:3313273

  10. Combined heat shock protein 90 and ribosomal RNA sequence phylogeny supports multiple replacements of dinoflagellate plastids.

    PubMed

    Shalchian-Tabrizi, Kamran; Minge, Marianne A; Cavalier-Smith, Tom; Nedreklepp, Joachim M; Klaveness, Dag; Jakobsen, Kjetill S

    2006-01-01

    Dinoflagellates harbour diverse plastids obtained from several algal groups, including haptophytes, diatoms, cryptophytes, and prasinophytes. Their major plastid type with the accessory pigment peridinin is found in the vast majority of photosynthetic species. Some species of dinoflagellates have other aberrantly pigmented plastids. We sequenced the nuclear small subunit (SSU) ribosomal RNA (rRNA) gene of the "green" dinoflagellate Gymnodinium chlorophorum and show that it is sister to Lepidodinium viride, indicating that their common ancestor obtained the prasinophyte (or other green alga) plastid in one event. As the placement of dinoflagellate species that acquired green algal or haptophyte plastids is unclear from small and large subunit (LSU) rRNA trees, we tested the usefulness of the heat shock protein (Hsp) 90 gene for dinoflagellate phylogeny by sequencing it from four species with aberrant plastids (G. chlorophorum, Karlodinium micrum, Karenia brevis, and Karenia mikimotoi) plus Alexandrium tamarense, and constructing phylogenetic trees for Hsp90 and rRNAs, separately and together. Analyses of the Hsp90 and concatenated data suggest an ancestral origin of the peridinin-containing plastid, and two independent replacements of the peridinin plastid soon after the early radiation of the dinoflagellates. Thus, the Hsp90 gene seems to be a promising phylogenetic marker for dinoflagellate phylogeny. PMID:16677346

  11. The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans.

    PubMed

    Tomioka, N; Sugiura, M

    1983-01-01

    The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans, has been determined. Its coding region is estimated to be 1,487 base pairs long, which is nearly identical to those reported for chloroplast 16S rRNA genes and is about 4% shorter than that of the Escherichia coli gene. The 16S rRNA sequence of A. nidulans has 83% homology with that of tobacco chloroplast and 74% homology with that of E. coli. Possible stem and loop structures of A. nidulans 16S rRNA sequences resemble more closely those of chloroplast 16S rRNAs than those of E. coli 16S rRNA. These observations support the endosymbiotic theory of chloroplast origin. PMID:6412038

  12. Last glacial climate instability documented by coarse-grained sediments within the loess sequence, at Fanjiaping, Lanzhou, China

    NASA Astrophysics Data System (ADS)

    Jiang, H.; Wang, P.; Thompson, J.; Ding, Z.; Lu, Y.

    2009-12-01

    OSL dating, grain-size analysis and magnetic susceptibility measurements were conducted on the Fanjiaping loess section, the western Chinese Loess Plateau. The results confirm that last glacial high-frequency climatic shifts were documented in mid-latitude continental archives. Comparison and analysis of grain-size distribution features indicated that the sequence was generally dominated by silt, indicating wind-blown dust sedimentation. Nevertheless, the lower part of the sequence was marked by high fluctuations of coarse-grained sand content with horizontal bedding and washing-refilling structures, implying a fluvial process. These coarse-grain sedimentations were probably a response to brief intensifications of Asian summer monsoon at the start of MIS 4 and during the early to middle stage of MIS 3. During the stage 5-4 transition, the relatively high sea-surface temperatures of high latitudes probably led to a small meridional temperature gradient, which may have helped the Asian summer monsoon to penetrate northward, bringing more precipitation to Lanzhou. The early to middle stage of MIS 3 was a relatively warm and humid climate not only regionally in the low-latitude Yunnan Province and the mid-latitude Lanzhou, but also in the high-latitude East Siberian Arctic. During this period, instability of the ice sheets around the Arctic Ocean enhanced significantly and enormous icebergs were discharged, coincident with the moderate summer insolation. Because modeling results indicate that the boreal forest warmed both winter and summer air temperatures (relative to bare ground or tundra vegetation) during the last glacial period, the interstadial events observed in the middle to high latitude of Northern Hemisphere may be caused by a combination of enhanced insolation, ice sheet instability and the boreal forest recovery. This study thus provides new significant information about the response of terrestrial loessic palaeoenvironments to millennial

  13. A search for conserved sequences in coding regions reveals that the let-7 microRNA targets Dicer within its coding sequence

    PubMed Central

    Forman, Joshua J.; Legesse-Miller, Aster; Coller, Hilary A.

    2008-01-01

    Recognition sites for microRNAs (miRNAs) have been reported to be located in the 3′ untranslated regions of transcripts. In a computational screen for highly conserved motifs within coding regions, we found an excess of sequences conserved at the nucleotide level within coding regions in the human genome, the highest scoring of which are enriched for miRNA target sequences. To validate our results, we experimentally demonstrated that the let-7 miRNA directly targets the miRNA-processing enzyme Dicer within its coding sequence, thus establishing a mechanism for a miRNA/Dicer autoregulatory negative feedback loop. We also found computational evidence to suggest that miRNA target sites in coding regions and 3′ UTRs may differ in mechanism. This work demonstrates that miRNAs can directly target transcripts within their coding region in animals, and it suggests that a complete search for the regulatory targets of miRNAs should be expanded to include genes with recognition sites within their coding regions. As more genomes are sequenced, the methodological approach that we used for identifying motifs with high sequence conservation will be increasingly valuable for detecting functional sequence motifs within coding regions. PMID:18812516
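    As an illustration of what a miRNA recognition site inside a coding sequence looks like, the sketch below scans a transcript for perfect matches to the let-7 seed. It is not the conservation-based screen described in the abstract: it assumes the common convention that the seed is miRNA nucleotides 2 to 8, and the let-7a sequence and the toy transcript are supplied here for illustration rather than taken from the record.

```python
from typing import List

# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Mature let-7a sequence (5' to 3'); an input assumed for this example.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"


def seed_site(mirna: str) -> str:
    """Return the target-site sequence complementary to miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # The target pairs antiparallel to the seed, so reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(transcript: str, mirna: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in the transcript."""
    site = seed_site(mirna)
    rna = transcript.upper().replace("T", "U")
    return [
        i
        for i in range(len(rna) - len(site) + 1)
        if rna[i : i + len(site)] == site
    ]


# Toy transcript (invented); DNA-style T is accepted and read as U.
toy = "AUGGCUACCUCAAGCTACCTCUGA"
site = seed_site(LET7A)
hits = find_seed_sites(toy, LET7A)
```

    Restricting the scan to annotated coding coordinates, rather than 3' UTRs only, is the extension the abstract argues for; a seed match alone is weak evidence without conservation or experimental support.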

  14. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity

    NASA Technical Reports Server (NTRS)

    Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr

    1992-01-01

    16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.
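
    The ">99.5% sequence identity" figure above is a pairwise percent-identity measure. A minimal sketch of such a calculation follows, assuming two already-aligned sequences of equal length with "-" marking gaps. The function name and the choice to skip gapped columns are illustrative assumptions; the record does not say how the authors treated gaps.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy sequences, not real 16S rRNA data.
    print(percent_identity("ACGUACGU", "ACGUACGA"))
```

    As the abstract stresses, a value near 100% from such a comparison does not by itself establish that two strains belong to the same species.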

  15. 16S-23S ribosomal RNA spacer regions of Acetobacter europaeus and A. xylinum, tRNA genes and antitermination sequences.

    PubMed

    Sievers, M; Alonso, L; Gianotti, S; Boesch, C; Teuber, M

    1996-08-15

    The 16S-23S ribosomal RNA spacer regions of Acetobacter europaeus DSM 6160, A. xylinum NCIB 11664 and A. xylinum CL27 were amplified by PCR. Specific PCR products were obtained from each strain and their nucleotide sequences determined. The spacer region of A. europaeus comprises 768 nucleotides (nt), that of A. xylinum 778 nt and that of A. xylinum CL27 759 nt. Genes encoding tRNAIle and tRNAAla were identified. Putative antitermination sequences were found between the tRNAAla sequence and the 5'-terminus of the 23S rRNA coding sequence. The boxA element has the nucleotide sequence TGCTCTTTGATA. Based on hybridization data of digested chromosomal DNA with spacer-specific probes, the copy number of the rrn operons on the chromosome of Acetobacter strains is estimated to be four. PMID:8759788

  16. Determination of the Specificity Landscape for Ribonuclease P Processing of Precursor tRNA 5' Leader Sequences.

    PubMed

    Niland, Courtney N; Zhao, Jing; Lin, Hsuan-Chun; Anderson, David R; Jankowsky, Eckhard; Harris, Michael E

    2016-08-19

    Maturation of tRNA depends on a single endonuclease, ribonuclease P (RNase P), to remove highly variable 5' leader sequences from precursor tRNA transcripts. Here, we use high-throughput enzymology to report multiple-turnover and single-turnover kinetics for Escherichia coli RNase P processing of all possible 5' leader sequences, including nucleotides contacting both the RNA and protein subunits of RNase P. The results reveal that the identity of N(-2) and N(-3) relative to the cleavage site at N(1) primarily controls alternative substrate selection and acts at the level of association, not the cleavage step. As a consequence, the specificity for N(-1), which contacts the active site and contributes to catalysis, is suppressed. This study demonstrates high-throughput RNA enzymology as a means to globally determine RNA specificity landscapes and reveals the mechanism of substrate discrimination by a widespread and essential RNA-processing enzyme. PMID:27336323

  17. RNA sequencing as a powerful tool in searching for genes influencing health and performance traits of horses.

    PubMed

    Stefaniuk, Monika; Ropka-Molik, Katarzyna

    2016-05-01

    RNA sequencing (RNA-seq) by next-generation technology is a powerful tool which creates new possibilities in whole-transcriptome analysis. In recent years, with the use of the RNA-seq method, several studies expanded transcriptional gene profiles to understand interactions between genotype and phenotype, contributing substantially to the field of equine biology. To date, in horses, massively parallel sequencing of cDNA has been successfully used to identify and quantify mRNA levels in several normal tissues, as well as to annotate genes. Moreover, the RNA-seq method has been applied to identify the genetic basis of several diseases or to investigate the adaptation of the organism to training conditions. The use of the RNA-seq approach has also confirmed that horses can be useful as a large animal model for human disease, especially in the field of immune response. The presented review summarizes the achievements of profiling gene expression in horses (Equus caballus). PMID:26446669

  18. The non-coding RNA composition of the mitotic chromosome by 5′-tag sequencing

    PubMed Central

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M.; Shao, Zhifeng

    2016-01-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5′-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  19. RNA transcript sequencing reveals inorganic sulfur compound oxidation pathways in the acidophile Acidithiobacillus ferrivorans.

    PubMed

    Christel, Stephan; Fridlund, Jimmy; Buetti-Dinh, Antoine; Buck, Moritz; Watkin, Elizabeth L; Dopson, Mark

    2016-04-01

    Acidithiobacillus ferrivorans is an acidophile implicated in low-temperature biomining for the recovery of metals from sulfide minerals. Acidithiobacillus ferrivorans obtains its energy from the oxidation of inorganic sulfur compounds, and genes encoding several alternative pathways have been identified. Next-generation sequencing of At. ferrivorans RNA transcripts identified the genes coding for metabolic and electron transport proteins for energy conservation from tetrathionate as electron donor. RNA transcripts suggested that tetrathionate was hydrolyzed by the tetH1 gene product to form thiosulfate, elemental sulfur and sulfate. Despite two of the genes being truncated, RNA transcripts for the SoxXYZAB complex had higher levels than for thiosulfate quinone oxidoreductase (doxDA genes). However, a lack of heme-binding sites in soxX suggested that DoxDA was responsible for thiosulfate metabolism. Higher RNA transcript counts also suggested that elemental sulfur was metabolized by heterodisulfide reductase (hdr genes) rather than sulfur oxygenase reductase (sor). The sulfite produced as a product of heterodisulfide reductase was suggested to be oxidized by a pathway involving the sat gene product or abiotically react with elemental sulfur to form thiosulfate. Finally, several electron transport complexes were involved in energy conservation. This study has elucidated the previously unknown At. ferrivorans tetrathionate metabolic pathway that is important in biomining. PMID:26956550

  20. The non-coding RNA composition of the mitotic chromosome by 5'-tag sequencing.

    PubMed

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M; Shao, Zhifeng

    2016-06-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5'-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  1. U2AF1 Mutations Alter Sequence Specificity of pre-mRNA Binding and Splicing

    PubMed Central

    Okeyo-Owuor, Theresa; White, Brian S.; Chatrikhi, Rakesh; Mohan, Dipika R.; Kim, Sanghyun; Griffith, Malachi; Ding, Li; Ketkar-Kulkarni, Shamika; Hundal, Jasreet; Laird, Kholiswa M.; Kielkopf, Clara L.; Ley, Timothy J.; Walter, Matthew J.; Graubert, Timothy A.

    2014-01-01

    We previously identified missense mutations in the U2AF1 splicing factor affecting codons S34 (S34F and S34Y) or Q157 (Q157R and Q157P) in 11% of patients with de novo myelodysplastic syndromes (MDS). Although the role of U2AF1 as an accessory factor in the U2 snRNP is well established, it is not yet clear how mutations affect splicing or contribute to MDS pathophysiology. We analyzed splice junctions in RNA-seq data generated from transfected CD34+ hematopoietic cells and found significant differences in the abundance of known and novel junctions in samples expressing mutant U2AF1 (S34F). For selected transcripts, splicing alterations detected by RNA-seq were confirmed by analysis of primary de novo MDS patient samples. These effects were not due to impaired U2AF1 (S34F) localization as it co-localized normally with U2AF2 within nuclear speckles. We further found evidence in the RNA-seq data for decreased affinity of U2AF1 (S34F) for uridine (relative to cytidine) at the e-3 position immediately upstream of the splice acceptor site and corroborated this finding using affinity binding assays. These data suggest that the S34F mutation alters U2AF1 function in the context of specific RNA sequences, leading to aberrant alternative splicing of target genes, some of which may be relevant for MDS pathogenesis. PMID:25311244

  2. Sequence variation within the rRNA gene loci of 12 Drosophila species

    PubMed Central

    Stage, Deborah E.; Eickbush, Thomas H.

    2007-01-01

    Concerted evolution maintains at near identity the hundreds of tandemly arrayed ribosomal RNA (rRNA) genes and their spacers present in any eukaryote. Few comprehensive attempts have been made to directly measure the identity between the rDNA units. We used the original sequencing reads (trace archives) available through the whole-genome shotgun sequencing projects of 12 Drosophila species to locate the sequence variants within the 7.8–8.2 kb transcribed portions of the rDNA units. Three to 18 variants were identified in >3% of the total rDNA units from 11 species. Species where the rDNA units are present on multiple chromosomes exhibited only minor increases in sequence variation. Variants were 10–20 times more abundant in the noncoding compared with the coding regions of the rDNA unit. Within the coding regions, variants were three to eight times more abundant in the expansion compared with the conserved core regions. The distribution of variants was largely consistent with models of concerted evolution in which there is uniform recombination across the transcribed portion of the unit with the frequency of standing variants dependent upon the selection pressure to preserve that sequence. However, the 28S gene was found to contain fewer variants than the 18S gene despite evolving 2.5-fold faster. We postulate that the lower number of variants in the 28S gene is due to localized gene conversion or DNA repair triggered by the activity of retrotransposable elements that are specialized for insertion into the 28S genes of these species. PMID:17989256

  3. Construction of the mycoplasma evolutionary tree from 5S rRNA sequence data.

    PubMed Central

    Rogers, M J; Simmons, J; Walker, R T; Weisburg, W G; Woese, C R; Tanner, R S; Robinson, I M; Stahl, D A; Olsen, G; Leach, R H

    1985-01-01

    The 5S rRNA sequences of eubacteria and mycoplasmas have been analyzed and a phylogenetic tree constructed. We determined the sequences of 5S rRNA from Clostridium innocuum, Acholeplasma laidlawii, Acholeplasma modicum, Anaeroplasma bactoclasticum, Anaeroplasma abactoclasticum, Ureaplasma urealyticum, Mycoplasma mycoides mycoides, Mycoplasma pneumoniae, and Mycoplasma gallisepticum. Analysis of these and published sequences shows that mycoplasmas form a coherent phylogenetic group that, with C. innocuum, arose as a branch of the low G+C Gram-positive tree, near the lactobacilli and streptococci. The initial event in mycoplasma phylogeny was formation of the Acholeplasma branch; hence, loss of cell wall probably occurred at the time of genome reduction to approximately 1000 MDa. A subsequent branch produced the Spiroplasma. This branch appears to have been the origin of sterol-requiring mycoplasmas. During development of the Spiroplasma branch there were several independent genome reductions, each to approximately 500 MDa, resulting in Mycoplasma and Ureaplasma species. Mycoplasmas, particularly species with the smallest genomes, have high mutation rates, suggesting that they are in a state of rapid evolution. PMID:2579388

  4. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing.

    PubMed

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and screened for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts like field grown crops, leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant microbial interactions. Strains QAU-63 and QAU-68 have sequence similarities of 97 and 95% and might be declared novel after further taxonomic characterization. PMID:25477935

  5. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    PubMed Central

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and screened for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts like field grown crops, leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant microbial interactions. Strains QAU-63 and QAU-68 have sequence similarities of 97 and 95% and might be declared novel after further taxonomic characterization. PMID:25477935

  6. Identification of sRNA interacting with a transcript of interest using MS2-affinity purification coupled with RNA sequencing (MAPS) technology

    PubMed Central

    Lalaouna, David; Massé, Eric

    2015-01-01

    RNA sequencing (RNAseq) technology recently allowed the identification of thousands of small RNAs (sRNAs) within the prokaryotic kingdom. However, drawing the comprehensive interaction map of a sRNA remains a challenging task. To address this problem, we recently developed a method called MAPS (MS2 affinity purification coupled with RNA sequencing) to characterize the full targetome of specific sRNAs. This method enabled the identification of target RNAs interacting with sRNAs, regardless of the type of regulation (positive or negative), type of targets (mRNA, tRNA, sRNA) or their abundance. We also demonstrated that we can use this technology to perform a reverse MAPS experiment, where an RNA fragment of interest is used as bait to identify interacting sRNAs. Here, we demonstrated that RybB and MicF sRNAs co-purified with internal transcribed spacers (ITS) of metZ–metW–metV tRNA transcript, confirming results obtained with MS2-RybB MAPS. Both raw and analyzed RNAseq data are available in GEO database (GSE66517). PMID:26484242

  7. Detection of alternative splice and gene duplication by RNA sequencing in Japanese flounder, Paralichthys olivaceus.

    PubMed

    Wang, Wenji; Wang, Jing; You, Feng; Ma, Liman; Yang, Xiao; Gao, Jinning; He, Yan; Qi, Jie; Yu, Haiyang; Wang, Zhigang; Wang, Xubo; Wu, Zhihao; Zhang, Quanqi

    2014-12-01

    Japanese flounder (Paralichthys olivaceus) is one of the economically important fish in China. Sexual dimorphism, especially the different growth rates and body sizes of the two sexes, makes this fish a good model to investigate the mechanisms responsible for such dimorphism, both for fundamental questions in evolution and for applied topics in aquaculture. However, the lack of "omics" data has hindered progress. The recent advent of RNA-sequencing technology provides a robust tool to further study characteristics of genomes of nonmodel species. Here, we performed de novo transcriptome sequencing for a doubled haploid Japanese flounder individual using Illumina sequencing. A single lane of paired-end sequencing produced more than 27 million reads. These reads were assembled into 107,318 nonredundant transcripts, half of which (51,563; 48.1%) were annotated by blastx against a public protein database. A total of 1051 genes with potential alternative splicing were detected by Chrysalis, implemented in the Trinity software. Four of 10 randomly picked genes were verified by cloning and Sanger sequencing to contain alternative splicing. Notably, using a doubled haploid Japanese flounder individual allowed us to analyze gene duplicates. In total, 3940 "single-nucleotide polymorphisms" were detected from 1859 genes, which may have undergone gene duplication. This study lays the foundation for structural and functional genomics studies in Japanese flounder. PMID:25512620

  8. Detection of Alternative Splice and Gene Duplication by RNA Sequencing in Japanese Flounder, Paralichthys olivaceus

    PubMed Central

    Wang, Wenji; Wang, Jing; You, Feng; Ma, Liman; Yang, Xiao; Gao, Jinning; He, Yan; Qi, Jie; Yu, Haiyang; Wang, Zhigang; Wang, Xubo; Wu, Zhihao; Zhang, Quanqi

    2014-01-01

    Japanese flounder (Paralichthys olivaceus) is one of the economically important fish in China. Sexual dimorphism, especially the different growth rates and body sizes of the two sexes, makes this fish a good model to investigate the mechanisms responsible for such dimorphism, both for fundamental questions in evolution and for applied topics in aquaculture. However, the lack of “omics” data has hindered progress. The recent advent of RNA-sequencing technology provides a robust tool to further study characteristics of genomes of nonmodel species. Here, we performed de novo transcriptome sequencing for a doubled haploid Japanese flounder individual using Illumina sequencing. A single lane of paired-end sequencing produced more than 27 million reads. These reads were assembled into 107,318 nonredundant transcripts, half of which (51,563; 48.1%) were annotated by blastx against a public protein database. A total of 1051 genes with potential alternative splicing were detected by Chrysalis, implemented in the Trinity software. Four of 10 randomly picked genes were verified by cloning and Sanger sequencing to contain alternative splicing. Notably, using a doubled haploid Japanese flounder individual allowed us to analyze gene duplicates. In total, 3940 “single-nucleotide polymorphisms” were detected from 1859 genes, which may have undergone gene duplication. This study lays the foundation for structural and functional genomics studies in Japanese flounder. PMID:25512620

  9. Translational readthrough potential of natural termination codons in eucaryotes – The impact of RNA sequence

    PubMed Central

    Dabrowski, Maciej; Bukowy-Bieryllo, Zuzanna; Zietkiewicz, Ewa

    2015-01-01

    Termination of protein synthesis is not 100% efficient. A number of natural mechanisms that suppress translation termination exist. One of them is STOP codon readthrough, the process that enables the ribosome to pass through the termination codon in mRNA and continue translation to the next STOP codon in the same reading frame. The efficiency of translational readthrough depends on a variety of factors, including the identity of the termination codon, the surrounding mRNA sequence context, and the presence of stimulating compounds. Understanding the interplay between these factors provides the necessary background for the efficient application of the STOP codon suppression approach in the therapy of diseases caused by the presence of premature termination codons. PMID:26176195
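
    The readthrough process described above (translation continuing past one termination codon to the next in-frame one) can be illustrated with a small sketch. The function names and the toy mRNA are hypothetical, and the code only locates in-frame stop codons; it does not model readthrough efficiency, codon identity effects, or sequence context.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_in_frame_stops(mrna, start=0):
    """Return the positions of all stop codons in the reading frame set by `start`."""
    mrna = mrna.upper().replace("T", "U")
    return [i for i in range(start, len(mrna) - 2, 3)
            if mrna[i:i + 3] in STOP_CODONS]


def readthrough_extension(mrna, start=0):
    """Codons decoded beyond the normal terminus if the first stop is read through.

    The count includes the read-through stop codon itself and ends before
    the next in-frame stop. Returns None if no second in-frame stop exists.
    """
    stops = find_in_frame_stops(mrna, start)
    if len(stops) < 2:
        return None
    return (stops[1] - stops[0]) // 3


if __name__ == "__main__":
    toy_mrna = "AUGGCUUGACAGGGUUAA"  # AUG GCU UGA CAG GGU UAA (toy example)
    print(find_in_frame_stops(toy_mrna))
    print(readthrough_extension(toy_mrna))
```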

  10. Steroidogenic activity of a peptide specified by the reversed sequence of corticotropin mRNA.

    PubMed Central

    Clarke, B L; Blalock, J E

    1990-01-01

    The molecular recognition theory predicts that a reversed (3'→5') reading of an mRNA should yield a peptide that is structurally and functionally similar to that specified in the 5'→3' direction. We tested this idea by synthesizing a corticotropin (ACTH) analogue using a reverse reading of bovine mRNA for ACTH-(1-24). This peptide, designated ACTH-3'→5', had a similar hydropathic profile to native ACTH-5'→3' but had only 30% sequence homology and eight different charge substitutions. ACTH-3'→5' specifically bound to the surface of mouse Y-1 adrenal cells and to polyclonal anti-ACTH antibody. Additionally, ACTH-3'→5' stimulated cAMP synthesis and steroidogenesis in adrenal cells. These findings show that ACTH-3'→5' mimics the corticotropic properties of native ACTH, thereby further validating the molecular recognition theory. PMID:2175911
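
    The idea of reading an mRNA in reverse can be made concrete with a short sketch that translates a sequence with the standard genetic code in both directions and compares the two peptides position by position. The toy sequence, the function names, and the simple positional comparison are assumptions for illustration; the record does not give the bovine ACTH mRNA sequence or say how the authors aligned the two peptides.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna):
    """Translate an RNA string codon by codon; '*' marks a stop codon."""
    rna = rna.upper().replace("T", "U")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


def reverse_reading(rna):
    """Translate the same RNA read from its 3' end towards its 5' end."""
    return translate(rna[::-1])


def fraction_identical(pep_a, pep_b):
    """Fraction of positions with the same residue (positional comparison only)."""
    n = min(len(pep_a), len(pep_b))
    return sum(x == y for x, y in zip(pep_a, pep_b)) / n if n else 0.0


if __name__ == "__main__":
    toy = "AUGGCUAAGUGG"  # toy sequence, not ACTH mRNA
    forward, backward = translate(toy), reverse_reading(toy)
    print(forward, backward, fraction_identical(forward, backward))
```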

  11. Single-Cell RNA-Sequencing Reveals a Continuous Spectrum of Differentiation in Hematopoietic Cells

    PubMed Central

    Macaulay, Iain C.; Svensson, Valentine; Labalette, Charlotte; Ferreira, Lauren; Hamey, Fiona; Voet, Thierry; Teichmann, Sarah A.; Cvejic, Ana

    2016-01-01

    The transcriptional programs that govern hematopoiesis have been investigated primarily by population-level analysis of hematopoietic stem and progenitor cells, which cannot reveal the continuous nature of the differentiation process. Here we applied single-cell RNA-sequencing to a population of hematopoietic cells in zebrafish as they undergo thrombocyte lineage commitment. By reconstructing their developmental chronology computationally, we were able to place each cell along a continuum from stem cell to mature cell, refining the traditional lineage tree. The progression of cells along this continuum is characterized by a highly coordinated transcriptional program, displaying simultaneous suppression of genes involved in cell proliferation and ribosomal biogenesis as the expression of lineage specific genes increases. Within this program, there is substantial heterogeneity in the expression of the key lineage regulators. Overall, the total number of genes expressed, as well as the total mRNA content of the cell, decreases as the cells undergo lineage commitment. PMID:26804912

  12. RBRIdent: An algorithm for improved identification of RNA-binding residues in proteins from primary sequences.

    PubMed

    Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng

    2015-06-01

    Rapid and correct identification of RNA-binding residues based on the protein primary sequences is of great importance. In most prevalent machine-learning-based identification methods, however, either some features are inefficiently represented, or the redundancy between features is not effectively removed. Both problems may weaken the performance of a classifier system and raise its computational complexity. Here, we addressed the above problems and developed a better classifier (RBRIdent) to identify the RNA-binding residues. In an independent benchmark test, RBRIdent achieved an accuracy of 76.79%, Matthews correlation coefficient of 0.3819 and F-measure of 75.58%, remarkably outperforming all prevalent methods. These results suggest the necessity of proper feature description and the essential role of feature selection in this project. All source data and codes are freely available at http://166.111.152.91/RBRIdent. PMID:25846271
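
    The three benchmark figures quoted above (accuracy, Matthews correlation coefficient, F-measure) can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are made up for illustration, and the record does not state exactly how the authors computed their F-measure.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, F-measure and Matthews correlation coefficient from counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts: 40 true positives, 10 false positives,
    # 40 true negatives, 10 false negatives.
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

    Reporting the Matthews correlation coefficient alongside accuracy is useful when binding and non-binding residues are unevenly represented, because it uses all four cells of the confusion matrix.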

  13. Expression profiling of Drosophila mitochondrial genes via deep mRNA sequencing

    PubMed Central

    Torres, Tatiana Teixeira; Dolezal, Marlies; Schlötterer, Christian; Ottenwälder, Birgit

    2009-01-01

    Mitochondria play an essential role in several cellular processes. Nevertheless, very little is known about patterns of gene expression of genes encoded by the mitochondrial DNA (mtDNA). In this study, we used next-generation sequencing (NGS) for transcription profiling of genes encoded in the mitochondrial genome of Drosophila melanogaster and D. pseudoobscura. The analysis of males and females in both species indicated that the expression pattern was conserved between the two species, but differed significantly between both sexes. Interestingly, mRNA levels were not only different among genes encoded by separate transcription units, but also showed significant differences among genes located in the same transcription unit. Hence, mRNA abundance of genes encoded by mtDNA seems to be heavily modulated by post-transcriptional regulation. Finally, we also identified several transcripts with a noncanonical structure, suggesting that processing of mitochondrial transcripts may be more complex than previously assumed. PMID:19843606

  14. Depletion of Free 30S Ribosomal Subunits in Escherichia coli by Expression of RNA Containing Shine-Dalgarno-Like Sequences

    PubMed Central

    Mawn, Mary V.; Fournier, Maurille J.; Tirrell, David A.; Mason, Thomas L.

    2002-01-01

    We have constructed synthetic coding sequences for the expression of poly(α,l-glutamic acid) (PLGA) as fusion proteins with dihydrofolate reductase (DHFR) in Escherichia coli. These PLGA coding sequences use both GAA and GAG codons for glutamic acid and contain sequence elements (5′-GAGGAGG-3′) that resemble the consensus Shine-Dalgarno (SD) sequence found at translation initiation sites in bacterial mRNAs. An unusual feature of DHFR-PLGA expression is that accumulation of the protein is inversely related to the level of induction of its mRNA. Cellular protein synthesis was inhibited >95% by induction of constructs for either translatable or untranslatable PLGA RNAs. Induction of PLGA RNA resulted in the depletion of free 30S ribosomal subunits and the appearance of new complexes in the polyribosome region of the gradient. Unlike normal polyribosomes, these complexes were resistant to breakdown in the presence of puromycin. The novel complexes contained 16S rRNA, 23S rRNA, and PLGA RNA. We conclude that multiple noninitiator SD-like sequences in the PLGA RNA inhibit cellular protein synthesis by sequestering 30S small ribosomal subunits and 70S ribosomes in nonfunctional complexes on the PLGA mRNA. PMID:11751827
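
    To see how a glutamic acid coding sequence can contain Shine-Dalgarno-like elements, the following sketch counts overlapping occurrences of the 5′-GAGGAGG-3′ motif named in the abstract. The function name and the toy sequences are hypothetical; the actual PLGA constructs are not given in this record.

```python
def find_motif(seq, motif="GAGGAGG"):
    """Return start positions of all (possibly overlapping) motif occurrences."""
    seq = seq.upper().replace("U", "T")
    motif = motif.upper().replace("U", "T")
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i)
        i = seq.find(motif, i + 1)  # step by one so overlaps are found
    return positions


if __name__ == "__main__":
    # Toy glutamate-coding stretches (GAG and GAA both encode Glu).
    print(find_motif("GAGGAGGAGGAA"))  # runs of GAG codons create the motif
    print(find_motif("GAAGAAGAAGAA"))  # GAA-only coding avoids it
```

    In these toy cases the motif arises only where GAG codons are adjacent, which is consistent with the abstract's point that the SD-like elements come from the codon choice in the coding sequence itself.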

  15. RNAPattMatch: a web server for RNA sequence/structure motif detection based on pattern matching with flexible gaps

    PubMed Central

    Drory Retwitzer, Matan; Polishchuk, Maya; Churkin, Elena; Kifer, Ilona; Yakhini, Zohar; Barash, Danny

    2015-01-01

    Searching for RNA sequence-structure patterns is becoming an essential tool for RNA practitioners. Novel discoveries of regulatory non-coding RNAs in targeted organisms and the motivation to find them across a wide range of organisms have prompted the use of computational RNA pattern matching as an enhancement to sequence similarity. State-of-the-art programs differ by the flexibility of patterns allowed as queries and by their simplicity of use. In particular, no existing method is available as a user-friendly web server. A general program that searches for RNA sequence-structure patterns is RNA Structator. However, it is not available as a web server and does not provide the option to allow flexible gap pattern representation with an upper bound of the gap length being specified at any position in the sequence. Here, we introduce RNAPattMatch, a web-based application that is user friendly and makes sequence/structure RNA queries accessible to practitioners of various backgrounds and proficiency. It also extends RNA Structator and allows a more flexible variable gaps representation, in addition to analysis of results using energy minimization methods. RNAPattMatch service is available at http://www.cs.bgu.ac.il/rnapattmatch. A standalone version of the search tool is also available to download at the site. PMID:25940619
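
    The notion of a pattern with flexible gaps, each with an upper bound on its length, can be sketched with ordinary regular expressions. This is only a sequence-level illustration under stated assumptions: it is not the RNAPattMatch algorithm, it ignores secondary structure entirely, and the pattern format (a list of literal segments and (min, max) gap tuples) is invented for this example.

```python
import re


def build_regex(elements):
    """Compile a pattern made of literal segments and (min, max) gap tuples."""
    parts = []
    for element in elements:
        if isinstance(element, tuple):
            low, high = element
            parts.append("[ACGU]{%d,%d}" % (low, high))  # bounded gap
        else:
            parts.append(re.escape(element.upper()))
    # The lookahead lets matches that start at different positions overlap.
    return re.compile("(?=(" + "".join(parts) + "))")


def search(seq, elements):
    """Return (start, matched_text) for every start position that matches."""
    regex = build_regex(elements)
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]


if __name__ == "__main__":
    toy_seq = "AAGGACUUUGUCCAA"  # toy sequence
    print(search(toy_seq, ["GGAC", (0, 5), "GUCC"]))  # gap of up to 5 nt allowed
    print(search(toy_seq, ["GGAC", (0, 2), "GUCC"]))  # gap bound too small
```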

  16. Identification and characterization of an intervening sequence within the 23S ribosomal RNA genes of Edwardsiella ictaluri

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparison of the 23S rRNA gene sequences of Edwardsiella tarda and Edwardsiella ictaluri confirmed a close phylogenetic relationship between these two fish pathogen species and a distant relation with the 'core' members of the Enterobacteriaceae family. Analysis of the rrl gene for 23S rRNA in E. i...

  17. Ultra-Deep Sequencing Reveals the microRNA Expression Pattern of the Human Stomach

    PubMed Central

    Ribeiro-dos-Santos, Ândrea; Khayat, André S.; Silva, Artur; Alencar, Dayse O.; Lobato, Jessé; Luz, Larissa; Pinheiro, Daniel G.; Varuzza, Leonardo; Assumpção, Monica; Assumpção, Paulo; Santos, Sidney; Zanette, Dalila L.; Silva, Wilson A.; Burbano, Rommel; Darnet, Sylvain

    2010-01-01

    Background: While microRNAs (miRNAs) play important roles in tissue differentiation and in maintaining basal physiology, little is known about the miRNA expression levels in stomach tissue. Alterations in the miRNA profile can lead to cell deregulation, which can induce neoplasia. Methodology/Principal Findings: A small RNA library of stomach tissue was sequenced using high-throughput SOLiD sequencing technology. We obtained 261,274 quality reads with perfect matches to the human miRnome, and 42% of known miRNAs were identified. Digital Gene Expression profiling (DGE) was performed based on read abundance and showed that fifteen miRNAs were highly expressed in gastric tissue. Subsequently, the expression of these miRNAs was validated in 10 healthy individuals by RT-PCR, which showed a significant correlation of 83.97% (P<0.05). Six miRNAs showed a low variable pattern of expression (miR-29b, miR-29c, miR-19b, miR-31, miR-148a, miR-451) and could be considered part of the expression pattern of healthy gastric tissue. Conclusions/Significance: This study aimed to validate normal miRNA profiles of human gastric tissue to establish a reference profile for healthy individuals. Determining the regulatory processes acting in the stomach will be important in the fight against gastric cancer, which is the second-leading cause of cancer mortality worldwide. PMID:20949028

  18. Exome Sequencing Identifies Mitochondrial Alanyl-tRNA Synthetase Mutations in Infantile Mitochondrial Cardiomyopathy

    PubMed Central

    Götz, Alexandra; Tyynismaa, Henna; Euro, Liliya; Ellonen, Pekka; Hyötyläinen, Tuulia; Ojala, Tiina; Hämäläinen, Riikka H.; Tommiska, Johanna; Raivio, Taneli; Oresic, Matej; Karikoski, Riitta; Tammela, Outi; Simola, Kalle O.J.; Paetau, Anders; Tyni, Tiina; Suomalainen, Anu

    2011-01-01

    Infantile cardiomyopathies are devastating fatal disorders of the neonatal period or the first year of life. Mitochondrial dysfunction is a common cause of this group of diseases, but the underlying gene defects have been characterized in only a minority of cases, because tissue specificity of the manifestation hampers functional cloning and the heterogeneity of causative factors hinders collection of informative family materials. We sequenced the exome of a patient who died at the age of 10 months of hypertrophic mitochondrial cardiomyopathy with combined cardiac respiratory chain complex I and IV deficiency. Rigorous data analysis allowed us to identify a homozygous missense mutation in AARS2, which we showed to encode the mitochondrial alanyl-tRNA synthetase (mtAlaRS). Two siblings from another family, both of whom died perinatally of hypertrophic cardiomyopathy, had the same mutation, compound heterozygous with another missense mutation. Protein structure modeling of mtAlaRS suggested that one of the mutations affected a unique tRNA recognition site in the editing domain, leading to incorrect tRNA aminoacylation, whereas the second mutation severely disturbed the catalytic function, preventing tRNA aminoacylation. We show here that mutations in AARS2 cause perinatal or infantile cardiomyopathy with near-total combined mitochondrial respiratory chain deficiency in the heart. Our results indicate that exome sequencing is a powerful tool for identifying mutations in single patients and allows recognition of the genetic background in single-gene disorders of variable clinical manifestation and tissue-specific disease. Furthermore, we show that mitochondrial disorders extend to prenatal life and are an important cause of early infantile cardiac failure. PMID:21549344

  19. Accommodation of profound sequence differences at the interfaces of eubacterial RNA polymerase multi-protein assembly.

    PubMed

    Swapna, Lakshmipuram Seshadri; Rekha, Nambudiry; Srinivasan, Narayanaswamy

    2012-01-01

    Evolutionarily divergent proteins have been shown to change their interacting partners. RNA polymerase assembly is one of the rare cases which retains its component proteins in the course of evolution. This ubiquitous molecular assembly, involved in transcription, consists of four core subunits (alpha, beta, betaprime, and omega), which assemble to form the core enzyme. Remarkably, the orientation of the four subunits in the complex is conserved from prokaryotes to eukaryotes although their sequence similarity is low. We have studied how the sequence divergence of the core subunits of RNA polymerase is accommodated in the formation of the multi-molecular assembly, with special reference to eubacterial species. Analysis of domain composition and order of the core subunits in >85 eubacterial species indicates complete conservation. However, sequence analysis indicates that interface residues of alpha and omega subunits are more divergent than those of beta, betaprime, and sigma70 subunits. Although beta and betaprime are generally well-conserved, residues involved in interaction with divergent subunits are not conserved. Insertions/deletions are also observed near interacting regions even in the case of the most conserved subunits, beta and betaprime. Homology modelling of three divergent RNA polymerase complexes, from Helicobacter pylori, Mycoplasma pulmonis and Onion yellows phytoplasma, indicates that insertions/deletions can be accommodated near the interface as they generally occur at the periphery. Evaluation of the modeled interfaces indicates that they are physico-chemically similar to that of the template interfaces in Thermus thermophilus, indicating that nature has evolved to retain the obligate complex in spite of substantial substitutions and insertions/deletions. PMID:22359428

  20. Exploring the Polyadenylated RNA Virome of Sweet Potato through High-Throughput Sequencing

    PubMed Central

    Lai, Xian-Jun; Wang, Hai-Yan; Zhang, Yi-Zheng

    2014-01-01

    Background: Viral diseases are the second most significant biotic stress for sweet potato, with yield losses reaching 20% to 40%. Over 30 viruses have been reported to infect sweet potato around the world, and 11 of these have been detected in China. Most of these viruses were detected by traditional detection approaches that show disadvantages in detection throughput. Next-generation sequencing technology provides a novel, highly sensitive method for virus detection and diagnosis. Methodology/Principal Findings: We report the polyadenylated RNA virome of three sweet potato cultivars using a high throughput RNA sequencing approach. Transcripts of 15 different viruses were detected, 11 of which were detected in cultivar Xushu18, whilst 11 and 4 viruses were detected in Guangshu 87 and Jingshu 6, respectively. Four were detected in sweet potato for the first time, and 4 were found for the first time in China. The most prevalent virus was SPFMV, which constituted 88% of the total viral sequence reads. Virus transcripts with extremely low expression levels were also detected, such as transcripts of SPLCV, CMV and CymMV. Digital gene expression (DGE) and reverse transcription polymerase chain reaction (RT-PCR) analyses showed that the highest viral transcript expression levels were found in fibrous and tuberous roots, which suggests that these tissues should be optimum samples for virus detection. Conclusions/Significance: A total of 15 viruses were presumed to be present in three sweet potato cultivars growing in China. This is the first insight into the sweet potato polyadenylated RNA virome. These results can serve as a basis for further work to investigate whether some of the 'new' viruses infecting sweet potato are pathogenic. PMID:24901789

  1. Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing

    PubMed Central

    Creecy, James P.; Maddox, Scott M.; Grissom, Joe E.; Conkle, Trevor L.; Shadid, Tyler M.; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada

    2014-01-01

    ABSTRACT We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3′ transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5′ ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. PMID:25006232

  2. Transcriptomic Analysis of Petunia hybrida in Response to Salt Stress Using High Throughput RNA Sequencing

    PubMed Central

    Villarino, Gonzalo H.; Bombarely, Aureliano; Giovannoni, James J.; Scanlon, Michael J.; Mattson, Neil S.

    2014-01-01

    Salinity and drought stress are the primary cause of crop losses worldwide. In sodic saline soils sodium chloride (NaCl) disrupts normal plant growth and development. The complex interactions of plant systems with abiotic stress have made RNA sequencing a more holistic and appealing approach to study transcriptome level responses in a single cell and/or tissue. In this work, we determined the Petunia transcriptome response to NaCl stress by sequencing leaf samples and assembling 196 million Illumina reads with Trinity software. Using our reference transcriptome we identified more than 7,000 genes that were differentially expressed within 24 h of acute NaCl stress. The proposed transcriptome can also be used as an excellent tool for biological and bioinformatic analyses in the absence of an available Petunia genome, and it is available at the SOL Genomics Network (SGN) http://solgenomics.net. Genes related to regulation of reactive oxygen species, transport, and signal transductions as well as novel and undescribed transcripts were among those differentially expressed in response to salt stress. The candidate genes identified in this study can be applied as markers for breeding or to genetically engineer plants to enhance salt tolerance. Gene Ontology analyses indicated that most of the NaCl damage happened at 24 h inducing genotoxicity, affecting transport and organelles due to the high concentration of Na+ ions. Finally, we report a modification to the library preparation protocol whereby cDNA samples were bar-coded with non-HPLC purified primers, without affecting the quality and quantity of the RNA-seq data. The methodological improvement presented here could substantially reduce the cost of sample preparation for future high-throughput RNA sequencing experiments. PMID:24722556

  3. Comparative transcriptome analysis of epithelial and fiber cells in newborn mouse lenses with RNA sequencing

    PubMed Central

    Hoang, Thanh V.; Kumar, Praveen Kumar Raj; Sutharzan, Sreeskandarajan; Tsonis, Panagiotis A.; Liang, Chun

    2014-01-01

    Purpose: The ocular lens contains only two cell types: epithelial cells and fiber cells. The epithelial cells lining the anterior hemisphere have the capacity to continuously proliferate and differentiate into lens fiber cells that make up the large proportion of the lens mass. To understand the transcriptional changes that take place during the differentiation process, high-throughput RNA-Seq of newborn mouse lens epithelial cells and lens fiber cells was conducted to comprehensively compare the transcriptomes of these two cell types. Methods: RNA from three biologic replicate samples of epithelial and fiber cells from newborn FVB/N mouse lenses was isolated and sequenced to yield more than 24 million reads per sample. Sequence reads that passed quality filtering were mapped to the reference genome using Genomic Short-read Nucleotide Alignment Program (GSNAP). Transcript abundance and differential gene expression were estimated using the Cufflinks and DESeq packages, respectively. Gene Ontology enrichment was analyzed using GOseq. RNA-Seq results were compared with previously published microarray data. The differential expression of several biologically important genes was confirmed using reverse transcription (RT)-quantitative PCR (qPCR). Results: Here, we present the first application of RNA-Seq to understand the transcriptional changes underlying the differentiation of epithelial cells into fiber cells in the newborn mouse lens. In total, 6,022 protein-coding genes exhibited differential expression between lens epithelial cells and lens fiber cells. To our knowledge, this is the first study identifying the expression of 254 long intergenic non-coding RNAs (lincRNAs) in the lens, of which 86 lincRNAs displayed differential expression between the two cell types. We found that RNA-Seq identified more differentially expressed genes and correlated with RT-qPCR quantification better than previously published microarray data. Gene Ontology analysis showed that genes

  4. High-Resolution Analysis of Coronavirus Gene Expression by RNA Sequencing and Ribosome Profiling

    PubMed Central

    Jones, Joshua D.; Chung, Betty Y.-W.; Siddell, Stuart G.; Brierley, Ian

    2016-01-01

    Members of the family Coronaviridae have the largest genomes of all RNA viruses, typically in the region of 30 kilobases. Several coronaviruses, such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and Middle East respiratory syndrome-related coronavirus (MERS-CoV), are of medical importance, with high mortality rates and, in the case of SARS-CoV, significant pandemic potential. Other coronaviruses, such as Porcine epidemic diarrhea virus and Avian coronavirus, are important livestock pathogens. Ribosome profiling is a technique which exploits the capacity of the translating ribosome to protect around 30 nucleotides of mRNA from ribonuclease digestion. Ribosome-protected mRNA fragments are purified, subjected to deep sequencing and mapped back to the transcriptome to give a global “snap-shot” of translation. Parallel RNA sequencing allows normalization by transcript abundance. Here we apply ribosome profiling to cells infected with Murine coronavirus, mouse hepatitis virus, strain A59 (MHV-A59), a model coronavirus in the same genus as SARS-CoV and MERS-CoV. The data obtained allowed us to study the kinetics of virus transcription and translation with exquisite precision. We studied the timecourse of positive and negative-sense genomic and subgenomic viral RNA production and the relative translation efficiencies of the different virus ORFs. Virus mRNAs were not found to be translated more efficiently than host mRNAs; rather, virus translation dominates host translation at later time points due to high levels of virus transcripts. Triplet phasing of the profiling data allowed precise determination of translated reading frames and revealed several translated short open reading frames upstream of, or embedded within, known virus protein-coding regions. Ribosome pause sites were identified in the virus replicase polyprotein pp1a ORF and investigated experimentally. Contrary to expectations, ribosomes were not found to pause at the ribosomal

  5. High-Resolution Analysis of Coronavirus Gene Expression by RNA Sequencing and Ribosome Profiling.

    PubMed

    Irigoyen, Nerea; Firth, Andrew E; Jones, Joshua D; Chung, Betty Y-W; Siddell, Stuart G; Brierley, Ian

    2016-02-01

    Members of the family Coronaviridae have the largest genomes of all RNA viruses, typically in the region of 30 kilobases. Several coronaviruses, such as Severe acute respiratory syndrome-related coronavirus (SARS-CoV) and Middle East respiratory syndrome-related coronavirus (MERS-CoV), are of medical importance, with high mortality rates and, in the case of SARS-CoV, significant pandemic potential. Other coronaviruses, such as Porcine epidemic diarrhea virus and Avian coronavirus, are important livestock pathogens. Ribosome profiling is a technique which exploits the capacity of the translating ribosome to protect around 30 nucleotides of mRNA from ribonuclease digestion. Ribosome-protected mRNA fragments are purified, subjected to deep sequencing and mapped back to the transcriptome to give a global "snap-shot" of translation. Parallel RNA sequencing allows normalization by transcript abundance. Here we apply ribosome profiling to cells infected with Murine coronavirus, mouse hepatitis virus, strain A59 (MHV-A59), a model coronavirus in the same genus as SARS-CoV and MERS-CoV. The data obtained allowed us to study the kinetics of virus transcription and translation with exquisite precision. We studied the timecourse of positive and negative-sense genomic and subgenomic viral RNA production and the relative translation efficiencies of the different virus ORFs. Virus mRNAs were not found to be translated more efficiently than host mRNAs; rather, virus translation dominates host translation at later time points due to high levels of virus transcripts. Triplet phasing of the profiling data allowed precise determination of translated reading frames and revealed several translated short open reading frames upstream of, or embedded within, known virus protein-coding regions. Ribosome pause sites were identified in the virus replicase polyprotein pp1a ORF and investigated experimentally. Contrary to expectations, ribosomes were not found to pause at the ribosomal
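
    Triplet phasing, as used above to determine translated reading frames, amounts to asking which of the three codon positions the ribosome-protected fragments fall on. A minimal sketch follows, assuming the input is a list of read 5′-end coordinates on one transcript; the function name `frame_fractions` and the `offset` parameter (a fixed shift from the read 5′ end to the ribosomal site of interest) are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter
from typing import Iterable, List


def frame_fractions(read_starts: Iterable[int], orf_start: int, offset: int = 0) -> List[float]:
    """Fraction of reads assigned to frame 0, 1 and 2 relative to an ORF start.

    read_starts: 5'-end coordinates of mapped footprints (assumed input format).
    orf_start:   coordinate of the first nucleotide of the start codon.
    offset:      assumed constant shift applied to every read before phasing.
    """
    frames = Counter((pos + offset - orf_start) % 3 for pos in read_starts)
    total = sum(frames.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [frames[f] / total for f in range(3)]


# Toy coordinates for illustration only; they are not data from the study.
fractions = frame_fractions([100, 103, 106, 107], orf_start=100)
```

    A strong excess of reads in one frame over a region is consistent with translation in that frame, whereas roughly equal fractions suggest no phased translation signal.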

  6. Maxicircle DNA and edited mRNA sequences of closely related trypanosome species: implications of kRNA editing for evolution of maxicircle genomes.

    PubMed Central

    Read, L K; Fish, W R; Muthiani, A M; Stuart, K

    1993-01-01

    kRNA editing produces functional mRNAs by uridine insertion and deletion. We analyzed portions of the apocytochrome b and NADH dehydrogenase subunits 7 and 8 (ND7 and 8) genes and their edited mRNAs in Trypanosoma congolense and compared these to the corresponding sequences in T.brucei. We find that these genes are highly diverged between the two species, especially in the positions of thymidines and in nucleotide transitions. Editing eliminates differences in encoded uridines producing edited mRNAs that are identical except for the nucleotide substitutions. The resulting predicted proteins are identical since all nucleotide substitutions are silent. A T.congolense minicircle-encoded gRNA which can specify editing of ND8 mRNA was identified. This gRNA can basepair with both T.congolense and T.brucei ND8 mRNA despite nucleotide transitions due to the flexibility of G:U base-pairing. These results illustrate how editing affects the characteristics of maxicircle sequence divergence and allows protein sequence conservation despite a level of DNA sequence divergence which would be predicted to be intolerable in the absence of editing. PMID:8396763

  7. SimSeq: a nonparametric approach to simulation of RNA-sequence datasets

    PubMed Central

    Benidt, Sam; Nettleton, Dan

    2015-01-01

    Motivation: RNA sequencing analysis methods are often derived by relying on hypothetical parametric models for read counts that are not likely to be precisely satisfied in practice. Methods are often tested by analyzing data that have been simulated according to the assumed model. This testing strategy can result in an overly optimistic view of the performance of an RNA-seq analysis method. Results: We develop a data-based simulation algorithm for RNA-seq data. The vector of read counts simulated for a given experimental unit has a joint distribution that closely matches the distribution of a source RNA-seq dataset provided by the user. We conduct simulation experiments based on the negative binomial distribution and our proposed nonparametric simulation algorithm. We compare performance between the two simulation experiments over a small subset of statistical methods for RNA-seq analysis available in the literature. We use as a benchmark the ability of a method to control the false discovery rate. Not surprisingly, methods based on parametric modeling assumptions seem to perform better with respect to false discovery rate control when data are simulated from parametric models rather than using our more realistic nonparametric simulation strategy. Availability and implementation: The nonparametric simulation algorithm developed in this article is implemented in the R package SimSeq, which is freely available under the GNU General Public License (version 2 or later) from the Comprehensive R Archive Network (http://cran.rproject.org/). Contact: sgbenidt@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25725090
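
    The general idea of data-based simulation can be sketched as follows: whole experimental units (columns) are drawn from a real two-group source dataset, so the simulated counts inherit the joint distribution across genes, and differential expression is induced only for chosen genes by taking their second-group values from the other source group. This is a simplified illustration under stated assumptions, not the SimSeq package's procedure (for example, it omits any normalization adjustment); `simulate_from_source` and its parameters are hypothetical names.

```python
import random
from typing import List, Sequence, Set, Tuple


def simulate_from_source(
    counts: Sequence[Sequence[int]],   # rows = genes, columns = experimental units
    labels: Sequence[int],             # source group (0 or 1) of each column
    n_per_group: int,
    de_genes: Set[int],                # row indices that should be differentially expressed
    seed: int = 0,
) -> Tuple[List[List[int]], List[int]]:
    rng = random.Random(seed)
    g0 = [j for j, lab in enumerate(labels) if lab == 0]
    g1 = [j for j, lab in enumerate(labels) if lab == 1]
    if len(g0) < 2 * n_per_group or len(g1) < n_per_group:
        raise ValueError("source dataset has too few units for the requested design")

    # Draw units without replacement so that no source unit is reused.
    picked0 = rng.sample(g0, 2 * n_per_group)
    null_a, null_b = picked0[:n_per_group], picked0[n_per_group:]
    alt_b = rng.sample(g1, n_per_group)

    simulated = []
    for gene, row in enumerate(counts):
        # Null genes: both simulated groups come from source group 0.
        # DE genes: the second simulated group comes from source group 1.
        second = alt_b if gene in de_genes else null_b
        simulated.append([row[j] for j in null_a] + [row[j] for j in second])
    sim_labels = [0] * n_per_group + [1] * n_per_group
    return simulated, sim_labels


# Toy source matrix for illustration only.
toy_counts = [[1, 1, 1, 1, 100, 100], [5, 5, 5, 5, 50, 50]]
toy_labels = [0, 0, 0, 0, 1, 1]
sim, sim_labels = simulate_from_source(toy_counts, toy_labels, 2, {0}, seed=1)
```

    Because the set of truly differentially expressed genes is known by construction, a dataset simulated this way can be used to estimate how well an analysis method controls the false discovery rate.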

  8. RNA sequence reveals mouse retinal transcriptome changes early after axonal injury.

    PubMed

    Yasuda, Masayuki; Tanaka, Yuji; Ryu, Morin; Tsuda, Satoru; Nakazawa, Toru

    2014-01-01

    Glaucoma is an ocular disease characterized by progressive retinal ganglion cell (RGC) death caused by axonal injury. However, the underlying mechanisms involved in RGC death remain unclear. In this study, we investigated changes in the transcriptome profile following axonal injury in mice (C57BL/6) with RNA sequencing (RNA-seq) technology. The experiment group underwent an optic nerve crush (ONC) procedure to induce axonal injury in the right eye, and the control group underwent a sham procedure. Two days later, we extracted the retinas and performed RNA-seq and a pathway analysis. We identified 177 differentially expressed genes with RNA-seq, notably the endoplasmic reticulum (ER) stress-related genes Atf3, Atf4, Atf5, Chac1, Chop, Egr1 and Trb3, which were significantly upregulated. The pathway analysis revealed that ATF4 was the most significant upstream regulator. The antioxidative response-related genes Hmox1 and Srxn1, as well as the immune response-related genes C1qa, C1qb and C1qc, were also significantly upregulated. To our knowledge, this is the first reported RNA-seq investigation of the retinal transcriptome and molecular pathways in the early stages after axonal injury. Our results indicated that ER stress plays a key role under these conditions. Furthermore, the antioxidative defense and immune responses occurred concurrently in the early stages after axonal injury. We believe that our study will lead to a better understanding of and insight into the molecular mechanisms underlying RGC death after axonal injury. PMID:24676137

  9. RNA Sequence Reveals Mouse Retinal Transcriptome Changes Early after Axonal Injury

    PubMed Central

    Yasuda, Masayuki; Tanaka, Yuji; Ryu, Morin; Tsuda, Satoru; Nakazawa, Toru

    2014-01-01

    Glaucoma is an ocular disease characterized by progressive retinal ganglion cell (RGC) death caused by axonal injury. However, the underlying mechanisms involved in RGC death remain unclear. In this study, we investigated changes in the transcriptome profile following axonal injury in mice (C57BL/6) with RNA sequencing (RNA-seq) technology. The experiment group underwent an optic nerve crush (ONC) procedure to induce axonal injury in the right eye, and the control group underwent a sham procedure. Two days later, we extracted the retinas and performed RNA-seq and a pathway analysis. We identified 177 differentially expressed genes with RNA-seq, notably the endoplasmic reticulum (ER) stress-related genes Atf3, Atf4, Atf5, Chac1, Chop, Egr1 and Trb3, which were significantly upregulated. The pathway analysis revealed that ATF4 was the most significant upstream regulator. The antioxidative response-related genes Hmox1 and Srxn1, as well as the immune response-related genes C1qa, C1qb and C1qc, were also significantly upregulated. To our knowledge, this is the first reported RNA-seq investigation of the retinal transcriptome and molecular pathways in the early stages after axonal injury. Our results indicated that ER stress plays a key role under these conditions. Furthermore, the antioxidative defense and immune responses occurred concurrently in the early stages after axonal injury. We believe that our study will lead to a better understanding of and insight into the molecular mechanisms underlying RGC death after axonal injury. PMID:24676137

  10. Polymorphism Identification and Improved Genome Annotation of Brassica rapa Through Deep RNA Sequencing

    PubMed Central

    Devisetty, Upendra Kumar; Covington, Michael F.; Tat, An V.; Lekkala, Saradadevi; Maloof, Julin N.

    2014-01-01

    The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in development of a marker system for this population and subsequent development of high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes—R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety)—using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes with an average frequency of one SNP in every 200 bases. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3537 novel transcripts, 23,754 gene models had structural modifications, and 3655 annotated proteins changed. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/. PMID:25122667

  11. Identification of Novel Transcribed Regions in Zebrafish (Danio rerio) Using RNA-Sequencing

    PubMed Central

    Wang, Jingwen; Vesterlund, Liselotte; Kere, Juha; Jiao, Hong

    2016-01-01

    Zebrafish (Danio rerio) has emerged as a model organism to investigate vertebrate development and human genetic diseases. However, the zebrafish genome annotation is still ongoing and incomplete, and there are still new gene transcripts to be found. With the introduction of massive parallel sequencing, whole transcriptome studies became possible. In the present study, we aimed to discover novel transcribed regions (NTRs) using developmental transcriptome data from RNA sequencing. In order to achieve this, we developed an in-house bioinformatics pipeline for NTR discovery. Using the pipeline, we detected 152 putative NTRs that at the time of discovery were not annotated in the Ensembl and NCBI gene databases. Four randomly selected NTRs were successfully validated using RT-PCR, and expression profiles of 10 randomly selected NTRs were evaluated using qRT-PCR. The identification of these 152 NTRs provides new information for zebrafish genome annotation as well as new candidates for studies of zebrafish gene function. PMID:27462902

  12. Differential Pathogenesis of Lung Adenocarcinoma Subtypes Involving Sequence Mutations, Copy Number, Chromosomal Instability, and Methylation

    PubMed Central

    Wilkerson, Matthew D.; Yin, Xiaoying; Walter, Vonn; Zhao, Ni; Cabanski, Christopher R.; Hayward, Michele C.; Miller, C. Ryan; Socinski, Mark A.; Parsons, Alden M.; Thorne, Leigh B.; Haithcock, Benjamin E.; Veeramachaneni, Nirmal K.; Funkhouser, William K.; Randell, Scott H.; Bernard, Philip S.; Perou, Charles M.; Hayes, D. Neil

    2012-01-01

    Background: Lung adenocarcinoma (LAD) has extreme genetic variation among patients, which is currently not well understood, limiting progress in therapy development and research. LAD intrinsic molecular subtypes are a validated stratification of naturally-occurring gene expression patterns and encompass different functional pathways and patient outcomes. Patients may have incurred different mutations and alterations that led to the different subtypes. We hypothesized that the LAD molecular subtypes co-occur with distinct mutations and alterations in patient tumors. Methodology/Principal Findings: The LAD molecular subtypes (Bronchioid, Magnoid, and Squamoid) were tested for association with gene mutations and DNA copy number alterations using statistical methods and published cohorts (n = 504). A novel validation (n = 116) cohort was assayed and interrogated to confirm subtype-alteration associations. Gene mutation rates (EGFR, KRAS, STK11, TP53), chromosomal instability, regional copy number, and genomewide DNA methylation were significantly different among tumors of the molecular subtypes. Secondary analyses compared subtypes by integrated alterations and patient outcomes. Tumors having integrated alterations in the same gene associated with the subtypes, e.g. mutation, deletion and underexpression of STK11 with Magnoid, and mutation, amplification, and overexpression of EGFR with Bronchioid. The subtypes also associated with tumors having concurrent mutant genes, such as KRAS-STK11 with Magnoid. Patient overall survival, cisplatin plus vinorelbine therapy response and predicted gefitinib sensitivity were significantly different among the subtypes. Conclusions/Significance: The lung adenocarcinoma intrinsic molecular subtypes co-occur with grossly distinct genomic alterations and with patient therapy response. These results advance the understanding of lung adenocarcinoma etiology and nominate patient subgroups for future evaluation of treatment response

  13. Characterization of microRNA transcriptome in lung cancer by next-generation deep sequencing

    PubMed Central

    Ma, Jie; Mannoor, Kaiissar; Gao, Lu; Tan, Afang; Guarnera, Maria A.; Zhan, Min; Shetty, Amol; Stass, Sanford A; Xing, Lingxiao; Jiang, Feng

    2014-01-01

    Non-small cell lung cancer (NSCLC) is the leading cause of cancer death. Systematically characterizing miRNAs in NSCLC will help develop biomarkers for its diagnosis and subclassification, and identify therapeutic targets for the treatment. We used next-generation deep sequencing to comprehensively characterize miRNA profiles in eight lung tumor tissues consisting of two major types of NSCLC, squamous cell carcinoma (SCC) and adenocarcinoma (AC). We used quantitative PCR (qPCR) to verify the findings in 40 pairs of stage I NSCLC tissues and the paired normal tissues, and 60 NSCLC tissues of different types and stages. We also investigated the function of identified miRNAs in lung tumorigenesis. Deep sequencing identified 896 known miRNAs and 14 novel miRNAs, of which 24 miRNAs displayed dysregulation with fold change ≥4.5 in either stage I ACs or SCCs or both relative to normal tissues. qPCR validation showed that 14 of 24 miRNAs exhibited consistent changes with deep sequencing data. Seven miRNAs displayed distinctive expressions between SCC and AC, from which a panel of four miRNAs (miRs-944, 205-3p, 135a-5p, and 577) was identified that could differentiate SCC from AC with 93.3% sensitivity and 86.7% specificity. Manipulation of miR-944 expression in NSCLC cells affected cell growth, proliferation, and invasion by targeting a tumor suppressor, SOCS4. Evaluating miR-944 in 52 formalin-fixed paraffin-embedded SCC tissues revealed that miR-944 expression was associated with lymph node metastasis. This study presents the earliest use of deep sequencing for profiling miRNAs in lung tumor specimens. The identified miRNA signatures may provide biomarkers for early detection, subclassification, and predicting metastasis, and potential therapeutic targets of NSCLC. PMID:24785186
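
    The sensitivity and specificity quoted for the four-miRNA panel are ratios derived from a confusion matrix. The sketch below shows that arithmetic only. The function name and the counts are hypothetical: the abstract does not report the underlying counts, and the values used here were chosen simply because they yield the same percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive-class samples (e.g. SCC) called correctly / missed.
    tn/fp: negative-class samples (e.g. AC) called correctly / miscalled.
    """
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives detected
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT taken from the study.
sens, spec = sensitivity_specificity(tp=14, fn=1, tn=13, fp=2)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```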

  14. MiRNA Expression Profile for the Human Gastric Antrum Region Using Ultra-Deep Sequencing

    PubMed Central

    Hamoy, Igor G.; Darnet, Sylvain; Burbano, Rommel; Khayat, André; Gonçalves, André Nicolau; Alencar, Dayse O.; Cruz, Aline; Magalhães, Leandro; Araújo Jr., Wilson; Silva, Artur; Santos, Sidney; Demachki, Samia; Assumpção, Paulo; Ribeiro-dos-Santos, Ândrea

    2014-01-01

    Background MicroRNAs are small non-coding nucleotide sequences that regulate gene expression. These structures are fundamental to several biological processes, including cell proliferation, development, differentiation and apoptosis. Identifying the expression profile of microRNAs in healthy human gastric antrum mucosa may help elucidate the miRNA regulatory mechanisms of the human stomach. Methodology/Principal Findings A small RNA library of stomach antrum tissue was sequenced using high-throughput SOLiD sequencing technology. The total read count for the gastric mucosa antrum region was greater than 618,000. After filtering and aligning with miRBase, 148 mature miRNAs were identified in the gastric antrum tissue, totaling 3,181 quality reads; 63.5% (2,021) of the reads were concentrated in the eight most highly expressed miRNAs (hsa-mir-145, hsa-mir-29a, hsa-mir-29c, hsa-mir-21, hsa-mir-451a, hsa-mir-192, hsa-mir-191 and hsa-mir-148a). RT-PCR validated the expression profiles of seven of these highly expressed miRNAs and confirmed the sequencing results obtained using the SOLiD platform. Conclusions/Significance In comparison with other tissues, the antrum’s expression profile was unique with respect to the most highly expressed miRNAs, suggesting that this expression profile is specific to stomach antrum tissue. The current study provides a starting point for a more comprehensive understanding of the role of miRNAs in the regulation of the molecular processes of the human stomach. PMID:24647245

  15. Technologically important extremophile 16S rRNA sequence Shannon entropy and fractal property comparison with long term dormant microbes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Gadura, N.; Dehipawala, S.; Cheung, E.; Tuffour, M.; Schneider, P.; Tremberger, G., Jr.; Lieberman, D.; Cheung, T.

    2011-10-01

    Technologically important extremophiles including oil-eating microbes, uranium and rocket fuel perchlorate reduction microbes, electron-producing microbes and electrode electron-feeding microbes were compared in terms of their 16S rRNA sequences, a standard targeted sequence in comparative phylogeny studies. Microbes that were reported to have survived a prolonged dormant duration were also studied. Examples included the recently discovered microbe that survives after 34,000 years in a salty environment while feeding off organic compounds from other trapped dead microbes. Shannon entropy of the 16S rRNA nucleotide composition and fractal dimension of the nucleotide sequence in terms of its atomic number fluctuation analyses suggest a selected range for these extremophiles as compared to other microbes, consistent with the experience of relatively mild evolutionary pressure. However, most of the microbes that have been reported to survive a prolonged dormant duration carry sequences with fractal dimension between 1.995 and 2.005 (N = 10 out of 13). Similar results are observed for halophiles, red-shifted chlorophyll and radiation-resistant microbes. The results suggest that prolonged dormant duration, analogous to a high-salt or high-radiation environment, would select high fractal 16S rRNA sequences. Path analysis in structural equation modeling supports a causal relation between entropy and fractal dimension for the studied 16S rRNA sequences (N = 7). Candidate choices for high fractal 16S rRNA microbes could offer protection for prolonged spaceflights. BioBrick gene network manipulation could include extremophile 16S rRNA sequences in synthetic biology and shed more light on exobiology and future colonization in shielded spaceflights. Whether the high fractal 16S rRNA sequences contain an asteroid-like extra-terrestrial source could be speculative but interesting.
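
    As a rough illustration of the first quantity named above, Shannon entropy of nucleotide composition can be computed from base frequencies as H = -Σ p_i log2 p_i. The sketch below is a generic implementation, not the authors' pipeline: the function name is an assumption, and the fractal-dimension analysis is not attempted because the abstract does not specify it.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (bits per symbol) of a sequence's base composition."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sequence carries no information
    # H = -sum(p * log2(p)) over the observed symbols
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy inputs for illustration only; not real 16S rRNA sequences.
print(shannon_entropy("ACGU"))      # four equally frequent bases
print(shannon_entropy("AAAAAAAA"))  # a single repeated base
```

    With four equally frequent bases the entropy reaches its maximum of 2 bits, so values for real sequences fall between 0 and 2.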

  16. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is the core for biological activities, and significant attention has been given to identification of various factors contributing to gene expression at genomic scale. However, so far this type of analysis has been focused either on relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal to investigate how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  17. Sequence comparisons in the aminoacyl-tRNA synthetases with emphasis on regions of likely homology with sequences in the Rossmann fold in the methionyl and tyrosyl enzymes.

    PubMed

    Walker, E J; Jeffrey, P D

    1988-02-01

    Amino acid sequences of aminoacyl-tRNA synthetases specific for 12 different amino acids have now been published. Differences in origin at the species and organelle level result in 20 distinct sequences being available for comparison. Some of these were compared in small groups as they were determined and, although some homologies were detected, it was generally concluded that there was surprisingly little sequence homology in this functionally related group of enzymes. We have made comparisons of all of the available sequences by using a combination of computer and manual alignment methods and knowledge of the sequences in the Rossmann fold region of methionyl-tRNA synthetase from E. coli and tyrosyl-tRNA synthetase from B. stearothermophilus, enzymes whose three-dimensional structures have been described. It emerges that all of the aminoacyl-tRNA synthetase sequences thus examined show considerable homology with each other over at least parts of this region, some over virtually all of it. We conclude that a great deal more similarity than had previously been suspected exists in these proteins. In particular, the alignments we have made strongly imply the existence of a mononucleotide binding site of the Rossmann fold configuration in all of the synthetases compared. PMID:3283733

  18. The complete sequence of the genomic RNA of an isolate of Lily virus X (genus Potexvirus).

    PubMed

    Chen, J; Shi, Y-H; Adams, M J; Chen, J-P

    2005-04-01

    The complete sequence of the genomic RNA of an isolate of Lily virus X (LVX) has been determined for the first time. The isolate from the Netherlands was 5823 nucleotides (nt) long excluding the 3'-poly(A) tail, making it the shortest reported potexvirus sequence. The 5'-non-coding region begins with GGAAAA like that of Scallion virus X (ScaVX) and some isolates of Cymbidium mosaic virus (CymMV), whereas those of other sequenced potexviruses probably all begin with GAAAA. The genome organisation was similar to that of other members of the genus except that a TGBp3-like region lacked a normal AUG start codon. A phylogenetic analysis based on the entire coding sequence showed that LVX was most closely related to Strawberry mild yellow edge virus and belonged in a subgroup of the genus that also contains CymMV, Narcissus mosaic virus, ScaVX, Pepino mosaic virus, Potato aucuba mosaic virus and White clover mosaic virus. PMID:15578239

  19. Characterization by Small RNA Sequencing of Taro Bacilliform CH Virus (TaBCHV), a Novel Badnavirus.

    PubMed

    Kazmi, Syeda Amber; Yang, Zuokun; Hong, Ni; Wang, Guoping; Wang, Yanfen

    2015-01-01

    RNA silencing is an antiviral immunity that regulates gene expression through the production of small RNAs (sRNAs). In this study, deep sequencing of small RNAs was used to identify viruses infecting two taro plants. BLAST searching identified five and nine contigs, assembled from small RNAs of samples T1 and T2, that matched the genome sequences of badnaviruses in the family Caulimoviridae. Complete genome sequences of two isolates of the badnavirus determined by sequence-specific amplification comprised 7,641 nucleotides and shared overall nucleotide similarities of 44.1%‒55.8% with other badnaviruses. Six open reading frames (ORFs) were identified on the plus strand and showed amino acid similarities ranging from 59.8% (ORF3) to 10.2% (ORF6) to the corresponding proteins encoded by other badnaviruses. Phylogenetic analysis also supports that the virus is a new member in the genus Badnavirus. The virus is tentatively named Taro bacilliform CH virus (TaBCHV), and it is the second badnavirus infecting taro plants, following Taro bacilliform virus (TaBV). In addition, analyses of viral derived small RNAs (vsRNAs) from TaBCHV showed that an almost equivalent number of vsRNAs were generated from both strands and the most abundant vsRNAs were 21 nt, with uracil bias at the 5' terminal. Furthermore, TaBCHV vsRNAs were asymmetrically distributed on its entire circular genome at both orientations with the hotspots mainly generated in the ORF5 region. PMID:26207896
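
    The vsRNA size distribution and 5'-terminal nucleotide bias described above amount to two simple tallies over the mapped reads. The following is a minimal sketch of that bookkeeping, assuming the reads are already available as plain strings. The function name and the example reads are hypothetical and do not come from the study.

```python
from collections import Counter


def profile_small_rnas(reads):
    """Tally read lengths and 5'-terminal nucleotides for small RNA reads."""
    length_counts = Counter(len(read) for read in reads)
    # Skip empty strings so indexing the first base cannot fail.
    five_prime_counts = Counter(read[0].upper() for read in reads if read)
    return length_counts, five_prime_counts


# Made-up reads for illustration only (two 21-nt, one 22-nt, one 24-nt).
reads = [
    "U" + "ACGU" * 5,
    "U" + "GCAU" * 5,
    "A" + "ACGU" * 5 + "G",
    "C" + "ACGU" * 5 + "GCA",
]
lengths, first_bases = profile_small_rnas(reads)
print(lengths.most_common(1))      # most abundant read length
print(first_bases.most_common(1))  # most frequent 5' nucleotide
```

    Sequencing output usually reports T rather than U, so real reads may need converting before a uracil bias is tallied this way.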

  20. Characterization by Small RNA Sequencing of Taro Bacilliform CH Virus (TaBCHV), a Novel Badnavirus

    PubMed Central

    Kazmi, Syeda Amber; Yang, Zuokun; Hong, Ni; Wang, Guoping; Wang, Yanfen

    2015-01-01

    RNA silencing is an antiviral immunity that regulates gene expression through the production of small RNAs (sRNAs). In this study, deep sequencing of small RNAs was used to identify viruses infecting two taro plants. BLAST searching identified five and nine contigs, assembled from small RNAs of samples T1 and T2, that matched the genome sequences of badnaviruses in the family Caulimoviridae. Complete genome sequences of two isolates of the badnavirus determined by sequence-specific amplification comprised 7,641 nucleotides and shared overall nucleotide similarities of 44.1%‒55.8% with other badnaviruses. Six open reading frames (ORFs) were identified on the plus strand and showed amino acid similarities ranging from 59.8% (ORF3) to 10.2% (ORF6) to the corresponding proteins encoded by other badnaviruses. Phylogenetic analysis also supports that the virus is a new member in the genus Badnavirus. The virus is tentatively named Taro bacilliform CH virus (TaBCHV), and it is the second badnavirus infecting taro plants, following Taro bacilliform virus (TaBV). In addition, analyses of viral derived small RNAs (vsRNAs) from TaBCHV showed that an almost equivalent number of vsRNAs were generated from both strands and the most abundant vsRNAs were 21 nt, with uracil bias at the 5' terminal. Furthermore, TaBCHV vsRNAs were asymmetrically distributed on its entire circular genome at both orientations with the hotspots mainly generated in the ORF5 region. PMID:26207896

  1. Homology of the 3' terminal sequences of the 18S rRNA of Bombyx mori and the 16S rRNA of Escherichia coli.

    PubMed Central

    Samols, D R; Hagenbuchle, O; Gage, L P

    1979-01-01

    The terminal 220 base pairs (bp) of the gene for 18S rRNA and 18 bp of the adjoining spacer rDNA of the silkworm Bombyx mori have been sequenced. Comparison with the sequence of the 16S rRNA gene of Escherichia coli has shown that a region including 45 bp of the B. mori sequence at the 3' end is remarkably homologous with the 3' terminal E. coli sequence. Other homologies occur in the terminal regions of the 18S and 16S rRNAs, including a perfectly conserved stretch of 13 bp within a longer homology located 150--200 bp from the 3' termini. These homologies are the most extensive so far reported between prokaryotic and eukaryotic genomic DNA. PMID:390496

  2. RNA Sequencing Analysis of the Gametophyte Transcriptome from the Liverwort, Marchantia polymorpha

    PubMed Central

    Sharma, Niharika; Jung, Chol-Hee; Bhalla, Prem L.; Singh, Mohan B.

    2014-01-01

    The liverwort Marchantia polymorpha is a member of the most basal lineage of land plants (embryophytes) and likely retains many ancestral morphological, physiological and molecular characteristics. Despite its phylogenetic importance and the availability of previous EST studies, M. polymorpha’s lack of economic importance limits accessible genomic resources for this species. We employed Illumina RNA-Seq technology to sequence the gametophyte transcriptome of M. polymorpha. cDNA libraries from 6 different male and female developmental tissues were sequenced to delineate a global view of the M. polymorpha transcriptome. Approximately 80 million short reads were obtained and assembled into a non-redundant set of 46,533 transcripts (≥200 bp) from 46,070 loci. The average length and the N50 length of the transcripts were 757 bp and 471 bp, respectively. Sequence comparison of assembled transcripts with non-redundant proteins from embryophytes resulted in the annotation of 43% of the transcripts. The transcripts were also compared with M. polymorpha expressed sequence tags (ESTs), and approximately 69.5% of the transcripts appeared to be novel. Twenty-one percent of the transcripts were assigned GO terms to improve annotation. In addition, 6,112 simple sequence repeats (SSRs) were identified as potential molecular markers, which may be useful in studies of genetic diversity. A comparative genomics approach revealed that a substantial proportion of the genes (35.5%) expressed in M. polymorpha were conserved across phylogenetically related species, such as Selaginella and Physcomitrella, and identified 580 genes that are potentially unique to liverworts. Our study presents an extensive amount of novel sequence information for M. polymorpha. This information will serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the isolation and characterization of functional genes that are involved in
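
    The N50 length reported for the assembly is a standard summary statistic: the length L such that transcripts of length L or longer together contain at least half of the total assembled bases. The sketch below uses that common definition. The function name and the toy lengths are assumptions, and the assembler's own implementation may handle edge cases differently.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from longest to shortest until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths for illustration only; not data from the study.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```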

  3. Analysis of the mouse gut microbiome using full-length 16S rRNA amplicon sequencing.

    PubMed

    Shin, Jongoh; Lee, Sooin; Go, Min-Jeong; Lee, Sang Yup; Kim, Sun Chang; Lee, Chul-Ho; Cho, Byung-Kwan

    2016-01-01

    Demands for faster and more accurate methods to analyze microbial communities from natural and clinical samples have been increasing in the medical and healthcare industry. Recent advances in next-generation sequencing technologies have facilitated the elucidation of the microbial community composition with higher accuracy and greater throughput than was previously achievable; however, the short sequencing reads often limit the microbial composition analysis at the species level due to the high similarity of 16S rRNA amplicon sequences. To overcome this limitation, we used the nanopore sequencing platform to sequence full-length 16S rRNA amplicon libraries prepared from the mouse gut microbiota. A comparison of the nanopore and short-read sequencing data showed that there were no significant differences in major taxonomic units (89%) except one phylotype and three taxonomic units. Moreover, both sequencing data were highly similar at all taxonomic resolutions except the species level. At the species level, nanopore sequencing allowed identification of more species than short-read sequencing, facilitating the accurate classification of the bacterial community composition. Therefore, this method of full-length 16S rRNA amplicon sequencing will be useful for rapid, accurate and efficient detection of microbial diversity in various biological and clinical samples. PMID:27411898

  4. Analysis of the mouse gut microbiome using full-length 16S rRNA amplicon sequencing

    PubMed Central

    Shin, Jongoh; Lee, Sooin; Go, Min-Jeong; Lee, Sang Yup; Kim, Sun Chang; Lee, Chul-Ho; Cho, Byung-Kwan

    2016-01-01

    Demands for faster and more accurate methods to analyze microbial communities from natural and clinical samples have been increasing in the medical and healthcare industry. Recent advances in next-generation sequencing technologies have facilitated the elucidation of the microbial community composition with higher accuracy and greater throughput than was previously achievable; however, the short sequencing reads often limit the microbial composition analysis at the species level due to the high similarity of 16S rRNA amplicon sequences. To overcome this limitation, we used the nanopore sequencing platform to sequence full-length 16S rRNA amplicon libraries prepared from the mouse gut microbiota. A comparison of the nanopore and short-read sequencing data showed that there were no significant differences in major taxonomic units (89%) except one phylotype and three taxonomic units. Moreover, both sequencing data were highly similar at all taxonomic resolutions except the species level. At the species level, nanopore sequencing allowed identification of more species than short-read sequencing, facilitating the accurate classification of the bacterial community composition. Therefore, this method of full-length 16S rRNA amplicon sequencing will be useful for rapid, accurate and efficient detection of microbial diversity in various biological and clinical samples. PMID:27411898

  5. Specific binding of a HeLa cell nuclear protein to RNA sequences in the human immunodeficiency virus transactivating region.

    PubMed Central

    Gaynor, R; Soultanakis, E; Kuwabara, M; Garcia, J; Sigman, D S

    1989-01-01

    The transactivator protein, tat, encoded by the human immunodeficiency virus is a key regulator of viral transcription. Activation by the tat protein requires sequences downstream of the transcription initiation site called the transactivating region (TAR). RNA derived from the TAR is capable of forming a stable stem-loop structure and the maintenance of both the stem structure and the loop sequences located between +19 and +44 is required for complete in vivo activation by tat. Gel retardation assays with RNA from both wild-type and mutant TAR constructs generated in vitro with SP6 polymerase indicated specific binding of HeLa nuclear proteins to the TAR. To characterize this RNA-protein interaction, a method of chemical "imprinting" has been developed using photoactivated uranyl acetate as the nucleolytic agent. This reagent nicks RNA under physiological conditions at all four nucleotides in a reaction that is independent of sequence and secondary structure. Specific interaction of cellular proteins with TAR RNA could be detected by enhanced cleavages or imprints surrounding the loop region. Mutations that either disrupted stem base-pairing or extensively changed the primary sequence resulted in alterations in the cleavage pattern of the TAR RNA. Structural features of the TAR RNA stem-loop essential for tat activation are also required for specific binding of the HeLa cell nuclear protein. PMID:2544877

  6. Phylogenetic analysis of the Listeria monocytogenes based on sequencing of 16S rRNA and hlyA genes.

    PubMed

    Soni, Dharmendra Kumar; Dubey, Suresh Kumar

    2014-12-01

    Listeria monocytogenes was discriminated from other Listeria species. The 16S rRNA and hlyA genes were PCR amplified with sets of oligonucleotide primers that flank 1,500 and 456 bp fragments, respectively. Based on the differences in the 16S rRNA and hlyA genes, a total of 80 isolates from different environmental, food and clinical samples were confirmed to be L. monocytogenes. The 16S rRNA sequence similarity suggested that the isolates were similar to those previously reported from different habitats by others. The phylogenetic interrelationships of the genus Listeria were investigated by sequencing of the 16S rRNA and hlyA genes. The 16S rRNA sequence indicated that the genus Listeria comprises the following closely related but distinct lines of descent: one is the L. monocytogenes species group (including L. innocua, L. ivanovii, L. seeligeri and L. welshimeri) and the other comprises the species L. grayi, L. rocourtiae and L. fleischmannii. The phylogenetic tree based on the hlyA gene sequence clearly differentiates between L. monocytogenes, L. ivanovii and L. seeligeri. In the present study, we identified 80 isolates of L. monocytogenes originating from different clinical, food and environmental samples based on 16S rRNA and hlyA gene sequence similarity. PMID:25205124

  7. Optimization of high-throughput sequencing kinetics for determining enzymatic rate constants of thousands of RNA substrates.

    PubMed

    Niland, Courtney N; Jankowsky, Eckhard; Harris, Michael E

    2016-10-01

    Quantification of the specificity of RNA binding proteins and RNA processing enzymes is essential to understanding their fundamental roles in biological processes. High-throughput sequencing kinetics (HTS-Kin) uses high-throughput sequencing and internal competition kinetics to simultaneously monitor the processing rate constants of thousands of substrates by RNA processing enzymes. This technique has provided unprecedented insight into the substrate specificity of the tRNA processing endonuclease ribonuclease P. Here, we investigated the accuracy and robustness of measurements associated with each step of the HTS-Kin procedure. We examine the effect of substrate concentration on the observed rate constant, determine the optimal kinetic parameters, and provide guidelines for reducing error in amplification of the substrate population. Importantly, we found that high-throughput sequencing and experimental reproducibility contribute to error, and these are the main sources of imprecision in the quantified results when otherwise optimized guidelines are followed. PMID:27296633

  8. Requirement of the 5'-end genomic sequence as an upstream cis-acting element for coronavirus subgenomic mRNA transcription.

    PubMed Central

    Liao, C L; Lai, M M

    1994-01-01

    We have developed a defective interfering (DI) RNA containing a chloramphenicol acetyltransferase reporter gene, placed behind an intergenic sequence, for studying subgenomic mRNA transcription of mouse hepatitis virus (MHV), a prototype coronavirus. Using this system, we have identified the sequence requirement for MHV subgenomic mRNA transcription. We show that this sequence requirement differs from that for RNA replication. In addition to the previously identified requirement for an intergenic (promoter) sequence, additional sequences from the 5' end of genomic RNA are required for subgenomic mRNA transcription. These upstream sequences include the leader RNA and a spacer sequence between the leader and intergenic sequence, which is derived from the 5' untranslated region and part of gene 1. The spacer sequence requirement is specific, since only the sequence derived from the 5' end of RNA genome, but not from other MHV genomic regions or heterologous sequences, could initiate subgenomic transcription from the intergenic sequence. These results strongly suggest that the wild-type viral subgenomic mRNAs (mRNA2 to mRNA7) and probably their counterpart subgenomic negative-sense RNAs cannot be utilized for mRNA amplification. Furthermore, we have demonstrated that a partial leader sequence present at the 5' end of genome, which lacks the leader-mRNA fusion sequence, could still support subgenomic mRNA transcription. In this case, the leader sequences of the subgenomic transcripts were derived exclusively from the wild-type helper virus, indicating that the MHV leader RNA initiates in trans subgenomic mRNA transcription. Thus, the leader sequence can enhance subgenomic transcription even when it cannot serve as a primer for mRNA synthesis. These results taken together suggest that the 5'-end leader sequence of MHV not only provides a trans-acting primer for mRNA initiation but also serves as a cis-acting element required for the transcription of subgenomic mRNAs. 

  9. Reverse Engineering of Vaccine Antigens Using High Throughput Sequencing-enhanced mRNA Display

    PubMed Central

    Guo, Nini; Duan, Hongying; Kachko, Alla; Krause, Benjamin W.; Major, Marian E.; Krause, Philip R.

    2015-01-01

    Vaccine reverse engineering is emerging as an important approach to vaccine antigen identification, recently focusing mainly on structural characterization of interactions between neutralizing monoclonal antibodies (mAbs) and antigens. Using mAbs that bind unknown antigen structures, we sought to probe the intrinsic features of antibody antigen-binding sites with a high complexity peptide library, aiming to identify conformationally optimized mimotope antigens that capture mAb-specific epitopes. Using a high throughput sequencing-enhanced messenger ribonucleic acid (mRNA) display approach, we identified high affinity binding peptides for a hepatitis C virus neutralizing mAb. Immunization with the selected peptides induced neutralizing activity similar to that of the original mAb. Antibodies elicited by the most commonly selected peptides were predominantly against specific epitopes. Thus, using mRNA display to interrogate mAbs permits high resolution identification of functional peptide antigens that direct targeted immune responses, supporting its use in vaccine reverse engineering for pathogens against which potent neutralizing mAbs are available. Research in Context We used a large number of randomly produced small proteins (“peptides”) to identify peptides containing specific protein sequences that bind efficiently to an antibody that can prevent hepatitis C virus infection in cell culture. After the identified peptides were injected into mice, the mice produced their own antibodies with characteristics similar to the original antibody. This approach can provide previously unavailable information about antibody binding and could also be useful in developing new vaccines. PMID:26425692

  10. Uncovering microRNA-mediated response to SO2 stress in Arabidopsis thaliana by deep sequencing.

    PubMed

    Li, Lihong; Xue, Meizhao; Yi, Huilan

    2016-10-01

    Sulfur dioxide (SO2) is a major air pollutant and has significant impacts on plants. MicroRNAs (miRNAs) are a class of gene expression regulators that play important roles in response to environmental stresses. In this study, deep sequencing was used for genome-wide identification of miRNAs and their expression profiles in response to SO2 stress in Arabidopsis thaliana shoots. A total of 27 conserved miRNAs and 5 novel miRNAs were found to be differentially expressed under SO2 stress. qRT-PCR analysis showed mostly negative correlation between miRNA accumulation and target gene mRNA abundance, suggesting regulatory roles of these miRNAs during SO2 exposure. The target genes of SO2-responsive miRNAs encode transcription factors and proteins that regulate auxin signaling and stress response, and the miRNA-mediated suppression of these genes could improve plant resistance to SO2 stress. Promoter sequence analysis of genes encoding SO2-responsive miRNAs showed that stress-responsive and phytohormone-related cis-regulatory elements occurred frequently, providing additional evidence of the involvement of miRNAs in adaptation to SO2 stress. This study represents a comprehensive expression profiling of SO2-responsive miRNAs in Arabidopsis and broadens our perspective on the ubiquitous regulatory roles of miRNAs under stress conditions. PMID:27232729

  11. Digital inventory of Arabidopsis transcripts revealed by 61 RNA sequencing samples.

    PubMed

    Sun, Xiaoyong; Yang, Qiuying; Deng, Zhiping; Ye, Xinfu

    2014-10-01

    Alternative splicing is an essential biological process to generate proteome diversity and phenotypic complexity. Recent improvements in RNA sequencing accuracy and computational algorithms have provided unprecedented opportunities to examine the expression levels of Arabidopsis (Arabidopsis thaliana) transcripts. In this article, we analyzed 61 RNA sequencing samples from 10 totally independent studies of Arabidopsis and calculated the transcript expression levels in different tissues, treatments, developmental stages, and varieties. These data provide a comprehensive profile of Arabidopsis transcripts with single-base resolution. We quantified the expression levels of 40,745 transcripts annotated in The Arabidopsis Information Resource 10, comprising 73% common transcripts, 15% rare transcripts, and 12% nondetectable transcripts. In addition, we investigated diverse common transcripts in detail, including ubiquitous transcripts, dominant/subordinate transcripts, and switch transcripts, in terms of their expression and transcript ratio. Interestingly, alternative splicing was the highly enriched function for the genes related to dominant/subordinate transcripts and switch transcripts. In addition, motif analysis revealed that TC motifs were enriched in dominant transcripts but not in subordinate transcripts. These motifs were found to have a strong relationship with transcription factor activity. Our results shed light on the complexity of alternative splicing and the diversity of the contributing factors. PMID:25118256

  12. RNA Sequencing of Mouse Sinoatrial Node Reveals an Upstream Regulatory Role for Islet-1 in Cardiac Pacemaker Cells

    PubMed Central

    Vedantham, Vasanth; Galang, Giselle; Evangelista, Melissa; Deo, Rahul C.; Srivastava, Deepak

    2015-01-01

    Rationale Treatment of sinus node disease with regenerative or cell-based therapies will require a detailed understanding of gene regulatory networks in cardiac pacemaker cells (PCs). Objective To characterize the transcriptome of PCs using RNA sequencing, and to identify transcriptional networks responsible for PC gene expression. Methods and Results We used laser capture micro-dissection (LCM) on a sinus node reporter mouse line to isolate RNA from PCs for RNA sequencing (RNA-Seq). Differential expression and network analysis identified novel SAN-enriched genes, and predicted that the transcription factor Islet-1 (Isl1) is active in developing pacemaker cells. RNA-Seq on SAN tissue lacking Isl1 established that Isl1 is an important transcriptional regulator within the developing SAN. Conclusions (1) The PC transcriptome diverges sharply from other cardiomyocytes; (2) Isl1 is a positive transcriptional regulator of the PC gene expression program. PMID:25623957

  13. 1,N6-etheno deoxy and ribo adenosine and 3,N4-etheno deoxy and ribo cytidine phosphoramidites. Strongly fluorescent structures for selective introduction in defined sequence DNA and RNA molecules.

    PubMed Central

    Srivastava, S C; Raza, S K; Misra, R

    1994-01-01

    Synthesis of 1,N6-etheno-2'-deoxyadenosine, 3,N4-etheno-2'-deoxycytidine, and further chemistry on both deoxy and ribo series etheno nucleosides produces the corresponding phosphoramidites. These novel phosphoramidites are introduced selectively, quantitatively, and at specific positions at single or multiple sites into DNA or RNA sequences. The purification and chemistry involved in the synthesis of these products have been optimized to achieve purity in excess of 99%. The resulting phosphoramidites were tested for their ability to couple and produce poly deoxy and ribonucleotides by solid phase chemistry. The coupling efficiency achieved was greater than 99% per step. Due to the instability of these etheno compounds in acidic and basic media, various criteria to obtain pure oligomers have been established. The selective introduction of these fluorescent nucleosides into defined sequence DNA and RNA molecules will greatly facilitate structure-function studies of various RNAs, protein-RNA structures, and DNA-RNA based diagnostics applications. The characteristic and high fluorescent intensity (detection below 1 x 10(-9) M for adenosine sites and below 1 x 10(-7) M for cytidine sites) is particularly suited for biochemical and biological research and product development applications. The usefulness of these etheno-containing modified sequences as sequencing and amplification primers is demonstrated by their full participation in polymerase chain reaction experiments. PMID:7513082
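
    To put the reported per-step coupling efficiency in context, the sketch below estimates the fraction of full-length oligomer after solid-phase synthesis. It is an illustration only and not part of the cited work: it assumes every coupling succeeds independently with the same efficiency and that an n-mer requires n - 1 couplings because the first nucleoside is pre-attached to the support. The function name is hypothetical.

```python
def full_length_yield(coupling_efficiency: float, oligo_length: int) -> float:
    """Estimate the fraction of chains that reach full length.

    Assumption (not from the abstract): each coupling step succeeds
    independently with the same efficiency, and an n-mer needs n - 1
    couplings because the first nucleoside is already on the support.
    """
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    if oligo_length < 1:
        raise ValueError("oligo_length must be at least 1")
    return coupling_efficiency ** (oligo_length - 1)


if __name__ == "__main__":
    # Compare two per-step efficiencies for a few oligomer lengths.
    for length in (20, 50, 100):
        for eff in (0.98, 0.99):
            print(f"{length}-mer at {eff:.0%} per step: "
                  f"{full_length_yield(eff, length):.1%} full length")
```

    Under these assumptions, small differences in per-step efficiency compound quickly with oligomer length, which is why a figure above 99% per step matters for longer sequences.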

  14. RNA sequencing read depth requirement for optimal transcriptome coverage in Hevea brasiliensis

    PubMed Central

    2014-01-01

    Background: One of the concerns of assembling de novo transcriptomes is determining the amount of read sequences required to ensure a comprehensive coverage of genes expressed in a particular sample. In this report, we describe the use of Illumina paired-end RNA-Seq (PE RNA-Seq) reads from Hevea brasiliensis (rubber tree) bark to devise a transcript mapping approach for the estimation of the read amount needed for deep transcriptome coverage. Findings: We optimized the assembly of a Hevea bark transcriptome based on 16 Gb Illumina PE RNA-Seq reads using the Oases assembler across a range of k-mer sizes. We then assessed assembly quality based on transcript N50 length and transcript mapping statistics in relation to (a) known Hevea cDNAs with complete open reading frames, (b) a set of core eukaryotic genes and (c) Hevea genome scaffolds. This was followed by a systematic transcript mapping process where sub-assemblies from a series of incremental amounts of bark transcripts were aligned to transcripts from the entire bark transcriptome assembly. The exercise served to relate read amounts to the level of transcript mapping, the latter being an indicator of the coverage of gene transcripts expressed in the sample. As read amounts or data size increased toward 16 Gb, the number of transcripts mapped to the entire bark assembly approached saturation. A colour matrix was subsequently generated to illustrate sequencing depth requirement in relation to the degree of coverage of total sample transcripts. Conclusions: We devised a procedure, the “transcript mapping saturation test”, to estimate the amount of RNA-Seq reads needed for deep coverage of transcriptomes. For Hevea de novo assembly, we propose generating between 5–8 Gb reads, whereby around 90% transcript coverage could be achieved with optimized k-mers and transcript N50 length. The principle behind this methodology may also be applied to other non-model plants, or with reads from other second generation
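
    The transcript N50 length mentioned above as an assembly quality measure can be computed as in the following minimal sketch. It uses the common definition of N50 (the length L such that transcripts of length L or longer contain at least half of all assembled bases); the abstract does not state which tool or exact convention the authors used, and the function name and example lengths are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of transcript (or contig) lengths.

    Common definition, assumed here: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Multiply instead of dividing to avoid floating-point rounding.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Made-up transcript lengths in bases, for illustration only.
    example_lengths = [300, 450, 800, 1200, 2500, 4000]
    print("N50 =", n50(example_lengths))
```

    Comparing N50 across assemblies built with different k-mer sizes is one simple way to rank them, although N50 alone does not measure how completely the expressed genes are covered, which is why the abstract pairs it with transcript mapping statistics.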

  15. Single-cell RNA sequencing: revealing human pre-implantation development, pluripotency and germline development.

    PubMed

    Petropoulos, S; Panula, S P; Schell, J P; Lanner, F

    2016-09-01

    Early human development is a dynamic, heterogeneous, complex and multidimensional process. During the first week, the single-cell zygote undergoes eight to nine rounds of cell division generating the multicellular blastocyst, which consists of hundreds of cells forming spatially organized embryonic and extra-embryonic tissues. At the level of transcription, degradation of maternal RNA commences at around the two-cell stage, coinciding with embryonic genome activation. Although numerous efforts have recently focused on delineating this process in humans, many questions still remain as thorough investigation has been limited by ethical issues, scarce availability of human embryos and the presence of minute amounts of DNA and RNA. In vitro cultures of embryonic stem cells provide some insight into early human development, but such studies have been confounded by analysis